cajal / pipeline

Schemata and libraries for neuroscientific data
GNU Lesser General Public License v3.0

Add proximity cell matching logic in meso #417

Closed eywalker closed 3 years ago

eywalker commented 3 years ago

The ProximityCellMatch and BestProximityCellMatch tables, previously defined and used in cajal/static-networks (https://github.com/cajal/static-networks/blob/d16094731eadbf44f45a7df648a0c264282927fc/staticnet_analyses/closed_loop.py#L931-L1029), have been migrated into meso in pipeline, right below the definition of the StackCoordinates table.

As part of the migration, ProximityCellMatch has been modified to fill one source unit at a time. There is no longer any need to specify up front all the neurons one wants to fill, and limiting the computation to the desired source units is done simply by passing a restriction to populate. This also allows matches for different units to be found in parallel, although the computational benefit of that is likely not large.
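To illustrate the per-unit filling, here is a minimal, database-free sketch. It is a toy model only: the `populate` and `make` functions and the `scan_idx` / `unit_id` keys are hypothetical stand-ins chosen for illustration, not the actual code or schema in meso.

```python
# Toy, database-free model of "one source unit per make() call".
# All names here are hypothetical stand-ins, not the pipeline's real schema.

def populate(key_source, make, done, restriction=None):
    """Call make(key) for every pending key that satisfies the restriction."""
    filled = []
    for key in key_source:
        if key["unit_id"] in done:
            continue  # already computed, skip it
        if restriction is not None and not all(
            key.get(name) == value for name, value in restriction.items()
        ):
            continue  # outside the requested subset
        make(key)
        done.add(key["unit_id"])
        filled.append(key["unit_id"])
    return filled


source_units = [{"scan_idx": 1, "unit_id": u} for u in (10, 11, 12)]
matches = {}


def make(key):
    # Placeholder for the real proximity computation for a single source unit.
    matches[key["unit_id"]] = f"matches for unit {key['unit_id']}"


done = set()
first = populate(source_units, make, done, restriction={"unit_id": 11})
rest = populate(source_units, make, done)  # fills whatever is still pending
```

Because each call to `make` handles exactly one source unit, a restricted call fills only the requested unit, and a later unrestricted call picks up the remaining ones without recomputing anything.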

As before, BestProximityCellMatch is not state independent: its result can vary depending on the content of ProximityCellMatch, and this dependency is not traceable. There are a number of ways to improve this, but the goal here was to migrate these tables out of static-networks so that they can be used more broadly.
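The state dependence can be seen in a small toy example. The selection rule below (pick the candidate with the smallest distance) is an assumption made purely for illustration and is not necessarily what BestProximityCellMatch does; the point is only that the same computation gives different answers depending on which rows exist when it runs.

```python
# Toy illustration of state dependence. The "smallest distance wins" rule and
# the (source_unit, target_unit, distance) row layout are assumptions.

def best_matches(proximity_rows):
    """Return {source_unit: target_unit} using the closest candidate per source."""
    best = {}
    for source, target, distance in proximity_rows:
        if source not in best or distance < best[source][1]:
            best[source] = (target, distance)
    return {source: target for source, (target, _) in best.items()}


rows = [(1, 101, 5.0), (1, 102, 7.5)]
before = best_matches(rows)   # computed from the rows present so far

rows.append((1, 103, 2.0))    # more candidates are filled in later
after = best_matches(rows)    # same source unit, different answer
```

Nothing stored in the result records which candidate rows were present when it was computed, which is why the dependency cannot be traced afterwards.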

To facilitate the filling of the ProximityCellMatch and BestProximityCellMatch tables, two manual tables are added: SourceUnitsToMatch and ScansToMatch. One should fill SourceUnitsToMatch to designate all source units for which the proximity cell match in target scans should be computed. Similarly, one should fill ScansToMatch with the pairs of source and target scans to be matched. The idea is to have these tables filled via a web GUI (i.e. shikigami), and the pipeline minion should always just run populate on ProximityCellMatch followed by BestProximityCellMatch. To ensure predictable behavior, the population of BestProximityCellMatch should never be performed by more than a single process at any one time.
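The intended minion behavior could be sketched roughly as follows. `FakeTable` and `minion_cycle` are hypothetical stand-ins so the example runs without a database; only the ordering of the two populate calls comes from the description above. The `reserve_jobs` keyword mirrors the DataJoint `populate` option of the same name, and using it for ProximityCellMatch is an assumption, not something this PR prescribes.

```python
# Sketch of one minion cycle. FakeTable stands in for a DataJoint table so the
# example runs without a database; it only records how populate was called.

class FakeTable:
    def __init__(self, name, log):
        self.name = name
        self.log = log

    def populate(self, *restrictions, reserve_jobs=False):
        self.log.append((self.name, reserve_jobs))


def minion_cycle(proximity_table, best_table):
    # Per-unit matching may be shared between several workers (assumption).
    proximity_table.populate(reserve_jobs=True)
    # The best-match step must be run by exactly one process at a time.
    best_table.populate()


call_log = []
minion_cycle(
    FakeTable("ProximityCellMatch", call_log),
    FakeTable("BestProximityCellMatch", call_log),
)
```

How the single-process guarantee for BestProximityCellMatch is enforced (for example, by dedicating one minion to it) is left open here.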