navis-org / pymaid

Python library to interface with CATMAID servers. Fully interoperable with navis.
https://pymaid.readthedocs.io/en/latest/
GNU General Public License v3.0
23 stars 11 forks source link

Improvement to upload_neuron's handling of connectors and connector tags #199

Closed jasper-tms closed 3 years ago

jasper-tms commented 3 years ago

Closes #196

Hey Philipp, finally getting around to trying to push you these changes I wrote a few months ago. The the motivation and logic for the new reuse_existing_connectors/check_existing parameters in upload_neuron/add_connector respectively are mostly described in the issue I opened above. Currently upload_neuron's parameter named reuse_existing_connectors is just passed to add_connector as the check_existing argument (see here.) I thought it made sense to give these parameters different names to clarify their purposes a bit for the two different functions, but they're functionally the same so could change them to be the same variable name if you think that'd be clearer.

With the new check_existing logic, add_connector unfortunately makes one catmaid API call for each connector to be uploaded. This is slow. It's much faster to have one API call for all the connectors that need to be uploaded. It's possible to code up logic that brings this function back to having one API call while still having check_existing happen, and I'm meaning to make a commit that accomplishes this in next few days, but wanted to open this pull request now to give you a chance to look at this and provide any feedback. I'll post here once I've gotten this last commit done, but don't accept the pull request before that happens please!

I additionally realized that connector tags were not being transferred, and I wanted this to happen too. To accomplish that, I had to add a new attribute connector_tags to CatmaidNeuron objects, since previously connector tags were not stored as part of CatmaidNeurons and so were not easily able to be uploaded. For the most part I took the tags attribute code and modified it to build the connector_tags attribute. This new attribute behaves as I would expect it to, as far as I've been able to see; myCatmaidNeuron.connector_tags and myCatmaidNeuronList.connector_tags both return dicts that are equivalent in structure to what you get when you ask for .tags, but of course for connectors instead. Having implemented this, it was straightforward to get upload_neuron to upload those connector tags as well.

If it turns out to be a pain to have two features implemented in one pull request like I'm doing here, I can certainly split it up, but my hunch is that combining a few small features into one PR will work out just fine.

jasper-tms commented 3 years ago

It appears that GitHub has this feature "draft pull request" that I've used here, where we're able to discuss the changes but you're not able to accept the merge until I change it from a "draft" to a full blown PR. Great. I'll be sure to change it to a full PR once I've made the last commits that restore faster execution to add_connector.

schlegelp commented 3 years ago

Hi Jasper. Thanks for these contributions! I'll try having a proper look tomorrow.

schlegelp commented 3 years ago

Looks very good overall! See my minor comments in the code.

Not super pressing from my point of view but one thing to consider: the .tags attribute is updated, for example, when we subset a neuron - i.e. we purge nodes from the dictionary that don't anymore exist in the subset neuron (see here).

We might want to do the same with the new .connector_tags?

schlegelp commented 3 years ago

Sorry for being slow to respond at the moment. In general, please feel free to ping me if I don't get back to you within a few days!

jasper-tms commented 3 years ago

Hah, you think you were slow with a few days' delay? I raise you a month delay. (My apologies.)

I think this PR could now be merged as it is – the last two outstanding points seem not 100% necessary to get to now:

  1. Perform many fetch-connectors-in-bbox commands in parallel. (This isn't so terribly slow when done serially.)
  2. Purge entries from connector_tags when subsetting a neuron. (Rarely relevant.)

One last thing to be aware of that I'd like your input on: when uploading a neuron with a mix of already-existing and just-uploaded connectors, the connector_response part of the server response some entries that look like {'connector_id': 16350520, 'connector_edition_time': '2020-11-10T16:28:29.164Z', 'created_links': []}, for the just-uploaded ones and some entries that look like {'connector_id': 16350495}, for the already-existing ones. Does it seem ok to leave it this way, with the two response types having a different set of keys? Do you think it'd be useful to change the responses in any other way, perhaps to explicitly indicate which connectors already existed versus which were just uploaded, or is this good enough?

schlegelp commented 3 years ago

Just merged the pull request.

Re your question: Not critical but I would minimally add a flag to the response like {'connector_id': 16350495, 'is_new': False}. Ideally each response would contain the same data no matter if the connector is new or old but I appreciate that populating that is an extra step that will cost time. I think as long as the docstring is explicit about the potential differences it should be fine for now - probably really only relevant for a very small group of people anyway.

jasper-tms commented 3 years ago

I suggest changing this to

[[coord[0]-1, coord[0]+1],
 [coord[1]-1, coord[1]+1],
 [coord[2]-1, coord[2]+1]]

I don't exactly know how it happened but I was uploading some neurons and trying to get their synapses to link to connectors that already existed in the target catmaid project, but due to something (differences in decimal precision between some old code and new code?? not positive), the uploaded connectors had coordinates very slightly different from the existing connectors in catmaid, and so they didn't connect. This issue would have been avoided if the above get_connectors_in_bbox command looked 1nm above and below the upload coordinates for already-existing connectors. I think it makes sense to make ±1nm the default search range to allow differences in decimal precision/rounding to be tolerated. (I think since this pull request is closed, I can't make this commit myself, otherwise I would.)

schlegelp commented 3 years ago

Hi Jasper. I indeed missed your last message here. Thanks for the catch! I have made that change with cb64f80a242cc4f556e8b966fb4149981946680f and release a new version to PyPI soon-in.