Closed jonadaly closed 5 years ago
I'm not sure - I don't understand how django/wsgi/neo4j is working at that level unfortunately!
@robertlagrant Is that as a standalone or through Django? Can you talk a little bit more about the problem?
@robertlagrant Thank you very much for this note, can I please ask you to create a new "issue" for that if any additional work is required? Once you have done this, please erase this post to try and keep things organised.
As mentioned previously, 3.3.0 is not only about the generalisation fix.
@aanastasiou sure. I'm not sure about extra work, but just mentioning that anyone using neomodel against a clustered neo4j might currently be having a very bad time, as they potentially can't use 3.2.9 or 3.3.0 :-)
I'm happy to have a look at the neomodel code, if that'd help speed things up. Could you expand on the reasons you mentioned above for why the model registry and connection info should be thread-local?
@robertlagrant Then it sounds that a new issue should be created. Can you please do this and remove the previous comment to keep this thread focused on 378? (EDIT: It would be interesting to see a description of the "cluster problem" because I have used neomodel
in a clustered environment without problems but that was around the time neo4j 3.4.7 or 3.4.8 was around and I was only doing parallel reads. So, as you can see, the devil is in the details which I would be very interested in finding out more about.)
In brief, the situation (about issue 378) is described further below but also, if you scroll back in this discussion you will find additional details about specific issues that I might omit in the following general description:
All the calls from anywhere within an application that uses neomodel
are routed via one Database
object that is created during module instantiation.
Part of the responsibilities of this object is transaction handling.
Database
is "tied" to the process ID.Neomodel 3.3.0 introduced state to the Database
object via a node-class registry that is used to resolve a neo4j Node object to a local datamodel object.
Within the context of a single-process (and potentially even multi-threaded) application, state is preserved and everything works as expected.
But, with WSGI firing a new process to serve requests, the neomodel.Database
object that is instantiated during application creation is not the same object that is attempted to be accessed during the serving of a request.
Database
attribute that is responsible for the object resolution
neomodel
. For example, in the case of Flask, it is possible to provide application wide access to a specific object via an extension.neomodel
.This is where we are at the moment. If you would like to help, I would appreciate it if you provided web-framework specific views on how to deal with preserving state, especially for Django. At the moment I am trying to see how to do this with Flask
and @mjmare has looked into Pyramid (but I expect that it has a similar mechanism).
There is a bit more work that has taken place in the background, you can see gists and repos further up but overall this is where we are at the moment.
All the best
@jonadaly @manoadamro @theterg @mzgajner @mostafa @mjmare @MardanovTimur @phoebebright @robertlagrant @robinedwards
Here is a minimal Gist to show:
The way neomodel.db
state must be preserved between application creation and serving of the request
The basic skeleton of what will later become something like a flask-neomodel-manager
extension for flask. I expect that something similar would have to be created separately for Django, Pyramid and other frameworks if in the meantime we have not come up with a better alternative.
All the best
@aanastasiou AFAIK in Django you have a DatabaseWrapper
object that is thread specific that manages the connection to the database. Each new thread (in development each request is handled in a new thread) or process spawned by some threading/process manager will recreate that DatabaseWrapper
with a ConnectionManager
inside.
The schema definition is the Django Model definition itself (as created by the user and ensured consistency with the schema in the database by schema migrations once during server startup) and a ModelManager
will use the DatabaseWrapper
to (eventually lazy just-in-time) create a (new) connection to the database and fire queries using a Cursor
. I.o.w. Schema and Connection are disconnected from each other and not dependent. Once the main process is instantiated is assumes a consistent database against the modelschema.
One may choose to re-use the connection, but that is most of the time not the case (never in development, optional in production) to reduce the number of simultaneous connections 'open' to the database. Better to open, do transaction, close. Is neomodel assuming a single open connection?
I am curious to see how neomodel
manages this as we experience the same problems with 3.3.0 as anyone else. Will make some effort to do testing.
@jberends
AFAIK in Django you have a DatabaseWrapper object that is thread specific that manages the connection to the database....
My question was specifically about how Django's middleware uses neomodel
. However, from what I see here, this is not very far off from the way neomodel is set up as well. The only problem is maintaining the state of the neomodel.db
object (more about this further below).
Is neomodel assuming a single open connection?
I am assuming you mean one connection that needs to remain connected to the server. If that is the case, then no.
I am curious to see how
neomodel
manages this as we experience the same problems with 3.3.0 as anyone else. Will make some effort to do testing.
Please see this gist which is also linked in my last post. This is a very crude outline of how the problem is dealt with in Flask and it shows how to "save" the state between calls.
I have not looked any further details into this but I am fairly confident that applications can be isolated from each other so that it is possible to run different applications on the same server, all using neomodel but over different connections and schemata.
As mentioned above, I think that this is going to have to be solved on a "per-framework" basis. Django has its own middleware infrastructure so does Pyramid and other frameworks.
I think that it would help if there was a repository that demonstrated how Django models integrate with neomodel
(which provides its own models. Do you inherit from both? Do you build Django wrappers around neomodel
classes to "force" it to talk to the database via neomodel
?). We could then use that one to discuss ways forward. If you have experience with Django middleware and would like to create one for neomodel, even better.
All the best
@aanastasiou two stupid questions: 1) we use waitress, which doesn't make new processes, so why would we be affected? 2) with WSGI servers that do create a new process, don't they either a) have to reinitialize everything from scratch, including the imports, or b) already have the memory initialised? (or c) you don't understand, Rob, and here's what you're missing)
@robertlagrant Thanks for your comment. There are no stupid questions. I do not have a problem with accepting any misunderstanding on my part and I would be glad to get any deeper insight by anyone who would like to offer it. (Rob?)
I have not used waitress
. Do you think you can provide a waitress
based counterexample of this situation where the neomodel.db
object's state is preserved but this still leads to problems?
a) Yes they do, you might want to see this and this on how this fails. I appreciate that this thread is getting too long by now but it does contain quite a few details on the problem.
@aanastasiou I have created a scaffold project for django and neomodel which is able to re-create the conditions between 3.2.9 and 3.3.0.
The repo can be found here: https://github.com/jberends/django_neo_3.3.0 Guide to install can be found here: https://github.com/jberends/django_neo_3.3.0/blob/master/README.md
Indeed it seems that the _NODE_CLASS_REGISTRY is not populated inside the request stream. I will add some notes in the readme to indicate where the neomodel specific configuration is located.
I use Pycharm and I included a 'run' command that may be used to run/debug the server once loaded.
What I found out is:
neomodel/core.py
the db = Database()
object does not really seem to be a singleton as indicated in the Database.__init__()
. Each thread does import neomodel and hence re-create (and re-__init__
) a Database object. While the NodeMeta
is only called once in Django during initialisation of the classes. That db
object is properly populated with a _NODE_CLASS_REGISTRY
, but thread-local Database objects not (re-set to {}
in the __init__
of the Database
object in each thread). Do you concur with this analysis? Isn't then a possible solution a more robust way of creating a singleton of the Database()
object?I am sure it is the design of the 'singleton' of the Database object. The object definition of the Database
class is subclassed from threading.local
(py2, py3). Which states:
class threading.local¶
A class that represents thread-local data. Thread-local data are data whose values are thread specific. To manage thread-local data, just create an instance of local (or a subclass) and store attributes on it: [...] The instance’s values will be different for separate threads.
Once I subclass from object
, it works. And while the Database
object only contains 'glue' information and does not hold data, it could be a solution in a multi threaded environment (also multi process as each process itself re-inits the django world on its own and have their own process-local Database
object with correct parameters.)
A patch would look like:
--- neomodel/util.py
+++ neomodel/util.py
@@ -42,7 +42,7 @@
db.cypher_query("MATCH (a) DETACH DELETE a")
-class Database(local):
+class Database(object):
def __init__(self):
self._active_transaction = None
self.url = None
@jberends Thank you very much for this, the respository is very useful anyway. Regarding the rest of it, it is as you describe it. The Database
derives from local
and the problem emerges once there is instantiation from different processes.
The bit that is going to be complicated then is this one.
In other words, multi-threading application transactions are pushed under the same "unit of work" but across processes a new connection is created.
This change would mean that even multi-thread calls would create a new connection. Does anyone see any problems with that (?)
Edit: @jberends Just to be clear about something: When people have been talking about integration with Django so far, I assumed that it was about getting Django models to work together with neomodel
. That is, having a Django User
stored via neomodel
, The repository you created seems to be using neomodel
separately, to simply access a Graph database.
@jberends Thank you very much for this, the respository is very useful anyway. Regarding the rest of it, it is as you describe it. The
Database
derives fromlocal
and the problem emerges once there is instantiation from different processes.
Even 'threads' re-instantiate the Database object, not only processes.
The bit that is going to be complicated then is this one. In other words, multi-threading application transactions are pushed under the same "unit of work" but across processes a new connection is created. This change would mean that even multi-thread calls would create a new connection. Does anyone see any problems with that (?)
This might be tricky indeed. If I understand correctly:
Database
instance that is shared across threads;Database.begin()/end()/rollback()
methods.If this is the case, one would think to separate the concept of storing the 'Model-metadata
from the database' and a Connection
to the database (which normally can even open and is closed each transaction (eg for postgres) to limit amount of connections to the database). If I see correctly the driver.session
is the connection object that is used to execute cypher.
So maybe the Database
object can be subclassed from threading.local
hower the _NODE_CLASS_REGISTRY
should live in a proper-proper singleton.
Conclusion: Factor out the NODE_CLASS_REGISTRY
(in short: NCR
) into its own separate singleton object, and do not change the rest of the Database
object. The Database
object can then access the NCR
itself in 'read' mode. Other functions that populate the NCR
should then populate that singleton, through its interface.
Edit: @jberends Just to be clear about something: When people have been talking about integration with Django so far, I assumed that it was about getting Django models to work together with
neomodel
. That is, having a DjangoUser
stored vianeomodel
, The repository you created seems to be usingneomodel
separately, to simply access a Graph database.
Sure, I get that. However neomodel does already provide a nice way to define the model and deal with NodeSet results. A more integrated solution with more 'django-feel' would help traction of adoption of neo4j in django-land IMHO, but that is more for django-neomodel
to pick up on.
I made a small POC implementation on https://github.com/KE-works/neomodel/tree/feature_class_registry_singleton and that works, except 2 tests that alter the registry directly. I did not have time to fix those. This is an ugly POC using globals to provide a dict. A class implementation may be wiser.
Also credits to my collegue @oikonomou with which I had an extensive discussion on the topic. :handshake:
@jberends
If this is the case, one would think to separate the concept of storing the
Model-metadata
from the database' and aConnection
to the database (which normally can even open and is closed each transaction (eg for postgres) to limit amount of connections to the database). If I see correctly the driver.session is the connection object that is used to execute cypher.
Well, a bit more work has taken place around this and a "batched" operations feature (but if possible via multiple units of work and batches of units of work). I have left it as a long term TODO because I was trying to solve it via Database()
. The current form is convenient because all Node/Rel calls have to go through the Database
independently of where they "live". When you create a StructuredNode
, you don't have to specify that its calls should go via a specific Database
. If you start creating multiple Database
objects, you are going to have to have a way of specifying that you want a particular StructuredNode
to route its calls through a particular Database
object. This is one thing. The other thing is exposing information of one application from another if there is one massive registry devoted to all instances. It needs a little bit more thinking and this is why, personally, I have left it as a longer term TODO. So, thank you for this note, as my thinking was not going through meta-classes at all.
By the way, if in the meantime someone comes up with a meta-class based solution faster, please submit a pull request.
Factor out the
NODE_CLASS_REGISTRY
...
Alright, I will see what I can do, but at the moment, I cannot say that I will have this ready by XYZ deadline. If anyone else would like to go towards that direction, please go ahead.
Sure, I get that. However neomodel...
Thanks, I appreciate getting into the trouble of creating the repository and because of this I thought I'd clarify that the way that case fails is clear (to me) and it would not need further demonstration.
I've tested the @oikonomou changes, but to no avail! The issue persists!
@mostafa Thank you for testing this out. Less exclamation marks please. We are all pitching in to this as much as we can.
Can you please confirm that your test-case is not any different than the one described here?
@robertlagrant A possible way forward if waitress
simply does multiple threads (?). Please see the pull request (work not completed as far as I can tell) and discussion there.
I've tested the @oikonomou changes, but to no avail! The issue persists!
What is your setup? Do you use django? Or something else? Which python version are you using?
We've been testing it in Python 3.6 with latest django 2.1 and it works, also on more evolved cases. Furthermore other testcases (in py27) passes.
@jonadaly @manoadamro @theterg @mzgajner @mostafa @mjmare @MardanovTimur @phoebebright @robertlagrant @robinedwards @jberends @oikonomou
Can I please ask you guys to give it a go and test this branch?
It incorporates a number of pull requests the most important of which for this issue is the one by @oikonomou that so far seems to have solved this issue:
main_naive.py
);@mjmare, @robertlagrant talked about Pylons / Pyramid before so it would be nice to hear from you if it works there too, although, I do not see why it should not.
Looking forward to hearing from you
@aanastasiou I can confirm that the branch you provided works without flaws.
@jberends Thanks for taking the time to perform any additional tests, just bear with me as I am going through a number of other contributions at the moment prior to making the next release please.
@aanastasiou The changes appears to work:
Subsequent switching between the versions has the behaviour you would expect (3.2.9 and the above branch work; 3.3.0 results in the node registry problems).
@jonadaly Thank you for taking the time to do this.
We've had it running in dev for a couple of weeks and it seems stable to us.
@aanastasiou Given that this seems to be working, when can we expect to see a release with this fix included?
(Appreciate the hard work from all parties investigating and fixing - it's just this has been an issue for a while now so would be good to get it resolved ASAP!)
Hello everyone, the release was made last Friday. We are now on 3.3.1. I am sorry I did not make a special announcement.
I would also like to thank everyone who took the time to look into any aspect of this issue and give us feedback or suggestions.
All the best
@aanastasiou
Did you write any changelog or other releasenotes? Next time I can help with that if you like.
@jberends Yes, I did:
Version 3.3.1 2019-02-08
* Fixed a number of warnings due to deprecations both within neomodel and pytest and overall improvements in
code style (#381) - Abhishek Modi
* Added support for spatial data through neomodel.contrib.spatial_datatypes (#384) - Athanasios Anastasiou
* Added the ability to filter "left-hand statements" (in NodeSet expressions) too (#395) - Grzegorz Grzywacz
* Refactor the Node Class Registry to make util.Database a true Singleton (#403) - Giorgos Oikonomou
* Many thanks to Giorgos Oikonomou, Jon Daly, Adam Romano, Andrew Tergis, Mato Žgajner, Mostafa Moradian, mjmare,
Phoebe Bright, Robert Grant, jberends and anyone else who helped flag, track and correct this bug. For more
information, please see: https://github.com/neo4j-contrib/neomodel/issues/378
* Added the ClassAlreadyDefined exception to prevent against accidental redefinitions of data model classes (#409)
- Athanasios Anastasiou
* Added the ability to lazily `.nodes.get()` and `.nodes.all()` (#406) - Matan Hart
* Fixed a bug in the assumed direction of relationships in _build_merge_query (#408) - MrAnde7son
Will make sure that these changes are reflected in the repository too.
We have a neo4j instance, populated with the help of neomodel v3.2.9. The v3.3.0 of neomodel breaks when attempting to initialise the connection to the database:
This appears to be as a result of the self._NODE_CLASS_REGISTRY changes included in v3.3.0, which appear to be breaking. Is this a known issue? Given that this is a minor version bump, we would not expect breaking changes to be included.