oracle / python-oracledb

Python driver for Oracle Database conforming to the Python DB API 2.0 specification. This is the renamed, new major release of cx_Oracle
https://oracle.github.io/python-oracledb
Other
307 stars 59 forks source link

Closing connection causes AttributeError: 'NoneType' object has no attribute 'send' #292

Open intgr opened 4 months ago

intgr commented 4 months ago

I recently upgraded to Django 5.0, which also means switching from cx_Oracle to oracledb.

Sometimes when Django attempts to close the database connection, it could fail with the following error. Stack trace below. I've seen this pop up in two different contexts so far. This example is when running a custom Django manage.py command.

AttributeError: 'NoneType' object has no attribute 'send'


  1. What versions are you using?
platform.platform: Linux-4.18.0-477.27.1.el8_8.x86_64-x86_64-with-glibc2.36
sys.maxsize > 2**32: True
platform.python_version: 3.11.7
oracledb.__version__: 2.0.1
Django: 5.0.1
  1. Is it an error or a hang or a crash?

Crash

  1. What error(s) or behavior you are seeing?
AttributeError: 'NoneType' object has no attribute 'send'
  File "<string>", line 1, in <module>
  File "manage.py", line 23, in main
    execute_from_command_line(sys.argv)
  File "django/core/management/__init__.py", line 442, in execute_from_command_line
    utility.execute()
  File "django/core/management/__init__.py", line 436, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "django/core/management/base.py", line 425, in run_from_argv
    connections.close_all()
  File "django/utils/connection.py", line 85, in close_all
    conn.close()
  File "django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "django/db/backends/base/base.py", line 351, in close
    self._close()
  File "django/db/backends/base/base.py", line 309, in _close
    return self.connection.close()
  File "oracledb/connection.py", line 621, in close
    self._impl.close()
  File "src/oracledb/impl/thin/connection.pyx", line 427, in oracledb.thin_impl.ThinConnImpl.close
  File "src/oracledb/impl/thin/protocol.pyx", line 123, in oracledb.thin_impl.Protocol._close
  File "src/oracledb/impl/thin/protocol.pyx", line 158, in oracledb.thin_impl.Protocol._close
  File "src/oracledb/impl/thin/protocol.pyx", line 386, in oracledb.thin_impl.Protocol._process_message
  File "src/oracledb/impl/thin/protocol.pyx", line 363, in oracledb.thin_impl.Protocol._process_message
  File "src/oracledb/impl/thin/messages.pyx", line 312, in oracledb.thin_impl.Message.send
  File "src/oracledb/impl/thin/packet.pyx", line 654, in oracledb.thin_impl.WriteBuffer.end_request
  File "src/oracledb/impl/thin/packet.pyx", line 627, in oracledb.thin_impl.WriteBuffer._send_packet
  File "src/oracledb/impl/thin/transport.pyx", line 327, in oracledb.thin_impl.Transport.write_packet
  1. Does your application call init_oracle_client()?

I am using thin mode.

  1. Include a runnable Python script that shows the problem.

I haven't been able to reproduce this in isolation. If the above stack trace isn't enough to troubleshoot the issue, please let me know and I can try if I can figure it out.

anthony-tuininga commented 4 months ago

Are you able to reproduce it consistently in your environment? It is possible that multiple concurrent attempts to close the connection are being made -- but my attempts to reproduce have been unsuccessful and a close look of the code suggests that is not the source of the issue either. The call stack shows that it is failing during the actual sending of the LogOff message. If you are able to reproduce consistently can make the following changes and build from source?

index b9c7e52..14660cb 100644
--- a/src/oracledb/impl/thin/protocol.pyx
+++ b/src/oracledb/impl/thin/protocol.pyx
@@ -126,6 +126,10 @@ cdef class Protocol(BaseProtocol):
             if self._read_buf._transport is None:
                 self._transport = None

+            if self._transport is not None \
+                    and self._transport._transport is None:
+                self._force_close()
+
             # if the session was marked as needing to be closed, force it
             # closed immediately (unless it was already closed)
             if self._read_buf._session_needs_to_be_closed \
intgr commented 4 months ago

Now that I have more log data to work with, I have a better idea what's going on. Unfortunately it only happens in one production environment and I would avoid experimenting there unless absolutely necessary.

It is possible that multiple concurrent attempts to close the connection are being made

Django database connections are per-thread, and I don't believe my application is sharing connections between threads.

But I noticed that this error only occurs in Celery processes. We're using Celery's default worker pool, which launches new worker processes with Unix fork(). Which means after the time of fork, two processes share the same connection file descriptor. That's a recipe for surprises for sure.

Celery has some logic to try to solve it -- it attempts to directly close DB file descriptors in the child process, to prevent multiple processes from trying to send on the same socket:

However, since oracledb Connection does not implement fileno() method, this part is skipped, and it calls the underlying Connection.close() method next.


Not exactly sure what the right solution here would be even. Maybe oracledb can implement fileno() so users have a chance of working around this issue. Or oracledb could somehow detect forking when process PID changes

Two more related stack traces, from worker side (child processes of the fork) ``` Exception ignored in: Traceback (most recent call last): File "site-packages/oracledb/connection.py", line 541, in __del__ self._impl.close(in_del=True) File "src/oracledb/impl/thin/connection.pyx", line 427, in oracledb.thin_impl.ThinConnImpl.close File "src/oracledb/impl/thin/protocol.pyx", line 123, in oracledb.thin_impl.Protocol._close File "src/oracledb/impl/thin/protocol.pyx", line 158, in oracledb.thin_impl.Protocol._close File "src/oracledb/impl/thin/protocol.pyx", line 386, in oracledb.thin_impl.Protocol._process_message File "src/oracledb/impl/thin/protocol.pyx", line 363, in oracledb.thin_impl.Protocol._process_message File "src/oracledb/impl/thin/messages.pyx", line 312, in oracledb.thin_impl.Message.send File "src/oracledb/impl/thin/packet.pyx", line 654, in oracledb.thin_impl.WriteBuffer.end_request File "src/oracledb/impl/thin/packet.pyx", line 627, in oracledb.thin_impl.WriteBuffer._send_packet File "src/oracledb/impl/thin/transport.pyx", line 327, in oracledb.thin_impl.Transport.write_packet ``` ``` ERROR Signal handler > raised: AttributeError("'NoneType' object has no attribute 'send'") Traceback (most recent call last): File "site-packages/celery/utils/dispatch/signal.py", line 276, in send response = receiver(signal=self, sender=sender, **named) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "site-packages/celery/fixups/django.py", line 167, in on_worker_process_init self._close_database(force=True) File "site-packages/celery/fixups/django.py", line 199, in _close_database conn.close() File "site-packages/django/utils/asyncio.py", line 26, in inner return func(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "site-packages/django/db/backends/base/base.py", line 351, in close self._close() File "site-packages/django/db/backends/base/base.py", line 309, in _close return self.connection.close() ^^^^^^^^^^^^^^^^^^^^^^^ File "site-packages/oracledb/connection.py", line 621, in close self._impl.close() File "src/oracledb/impl/thin/connection.pyx", line 427, in oracledb.thin_impl.ThinConnImpl.close File "src/oracledb/impl/thin/protocol.pyx", line 123, in oracledb.thin_impl.Protocol._close File "src/oracledb/impl/thin/protocol.pyx", line 158, in oracledb.thin_impl.Protocol._close File "src/oracledb/impl/thin/protocol.pyx", line 386, in oracledb.thin_impl.Protocol._process_message File "src/oracledb/impl/thin/protocol.pyx", line 363, in oracledb.thin_impl.Protocol._process_message File "src/oracledb/impl/thin/messages.pyx", line 312, in oracledb.thin_impl.Message.send File "src/oracledb/impl/thin/packet.pyx", line 654, in oracledb.thin_impl.WriteBuffer.end_request File "src/oracledb/impl/thin/packet.pyx", line 627, in oracledb.thin_impl.WriteBuffer._send_packet File "src/oracledb/impl/thin/transport.pyx", line 327, in oracledb.thin_impl.Transport.write_packet AttributeError: 'NoneType' object has no attribute 'send' ```
anthony-tuininga commented 4 months ago

Ah, in that case can you put together a test case using Celery that demonstrates the problem?

It seems to me, though, that it would be better for Django to close the connections before the process is forked. That would avoid these sorts of issues. I know that there is no support (in Oracle) for attempting to do what you are doing here. That said, if you can find a test case to replicate I can see what can be done to avoid the situation if possible.

cjbj commented 2 months ago

@intgr is there any update on this?

vigneshwarrvenkat commented 1 month ago

@intgr Any update on this issue.. I'm facing the same issue. Is there any workaround?

anthony-tuininga commented 1 month ago

@vigneshwarrvenkat, are you able to put together a test case that demonstrates the issue? Are you using the latest version of python-oracleb (2.2)?