navis-org / navis

Python library for analysis of neuroanatomical data.
https://navis.readthedocs.io
GNU General Public License v3.0

Traversal model creates new nodes that are not in edges #121

Closed LeeJinmook closed 10 months ago

LeeJinmook commented 10 months ago

I used navis.models to run a traversal model. I have also written my own traversal model and wanted to compare the two. However, when I used the navis library, new nodes were created.

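I used the following code:

import networkx as nx
from navis.models import TraversalModel

edges = nx.to_pandas_edgelist(G)

# Normalize each edge weight by the weighted in-degree of its target
for i in edges.index:
    edges.loc[i, 'weight'] = edges.loc[i, 'weight'] / G.in_degree(edges.loc[i, 'target'], 'weight')

model = TraversalModel(edges, seeds=Group, weights='weight')
model.run_parallel(n_cores=6, iterations=30)
model.summary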

In edges, all sources and targets were known nodes, but after I run the model new models were made. Could you help me?
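
The row-by-row weight normalization in this report can probably also be written as one vectorized step. This is a minimal sketch, not the reporter's code. It assumes the edge table has `source`, `target` and `weight` columns and that each weight is divided by the total weight arriving at its target, which is what `G.in_degree(node, 'weight')` returns. The toy table is made up for illustration.

```python
import pandas as pd

# Hypothetical toy edge list (not the data from this issue)
edges = pd.DataFrame({
    "source": [1, 2, 1, 3],
    "target": [3, 3, 4, 4],
    "weight": [2.0, 6.0, 5.0, 5.0],
})

# Total incoming weight per target, broadcast back onto each row
total_in = edges.groupby("target")["weight"].transform("sum")

# Each edge now carries its fraction of the target's total input
edges["weight"] = edges["weight"] / total_in
print(edges)
```

Besides being faster on large tables, this avoids calling back into the graph for every row.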

schlegelp commented 10 months ago

but after I run the model new models were made.

I suppose you mean the model.summary ends up containing nodes that you don't have in your original edges?

Could you perhaps share the edges table and your seeds in Group - or alternatively copy-paste the output of the following:

>>> print(edges.head())  # print the first few rows of edges
>>> print(model.summary.head())  # print the first few rows of model summary
>>> all_nodes = edges[['source', 'target']].values.flatten()  
>>> print(model.summary[~model.summary.index.isin(all_nodes)])  # which nodes are in the summary but not in edges

PS: it would be helpful if you could format your code snippets.

LeeJinmook commented 10 months ago

I'm sorry, I am not used to GitHub formatting. My data is here. The seeds are [720575940623804106,720575940608900050,720575940621814525,720575940657585025] and the file on Google Drive is here

LeeJinmook commented 10 months ago

I tried to check whether there are any targets that are not in the node list:

edges[~edges.target.isin(all_nodes)]

and this code print nothing

schlegelp commented 10 months ago

and this code print nothing

You mean it returns an empty data frame? That's to be expected since all_nodes is the combination of all sources and targets in edges, and you are asking "give me all rows where the target is not in all_nodes".
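
To make the point concrete, here is a small sketch with a made-up edge table. Filtering the targets against `all_nodes` is empty by construction, whereas comparing the model output against `all_nodes` can reveal unexpected IDs. `summary_index` is a hypothetical stand-in for `model.summary.index`.

```python
import pandas as pd

# Made-up edge table for illustration
edges = pd.DataFrame({"source": [10, 11], "target": [11, 12]})
all_nodes = edges[["source", "target"]].values.flatten()

# Always empty: every target is in all_nodes by construction
always_empty = edges[~edges.target.isin(all_nodes)]
print(always_empty)

# Hypothetical stand-in for model.summary.index; 99 does not occur in edges
summary_index = pd.Index([10, 11, 12, 99])
unexpected = summary_index[~summary_index.isin(all_nodes)]
print(unexpected.tolist())
```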

The good news: I can reproduce your issue and have already found and fixed the cause. It turns out that under certain circumstances IDs were converted to float and later converted back to int, which causes 64-bit integers such as FlyWire root IDs to get mangled due to floating point precision:

>>> int(float(720575940607438805))
720575940607438848
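
The underlying reason is that a 64-bit float has a 53-bit significand, so not every integer above 2**53 can be represented exactly. Below is a short sketch for checking whether an ID survives the round trip. `survives_float_roundtrip` is a helper name made up for this illustration and is not part of navis.

```python
def survives_float_roundtrip(node_id: int) -> bool:
    """Return True if int -> float -> int leaves the ID unchanged."""
    return int(float(node_id)) == node_id


# Every integer up to 2**53 is exactly representable as a 64-bit float
print(survives_float_roundtrip(2**53))      # True
print(survives_float_roundtrip(2**53 + 1))  # False

# The FlyWire root ID from the example above is far beyond 2**53
root_id = 720575940607438805
print(root_id > 2**53)                      # True
print(survives_float_roundtrip(root_id))    # False
```

A check like this can be run over all IDs before and after a processing step to spot silent float conversions.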

I changed the implementation of TraversalModel to avoid that issue and this fix will be included in the next release of navis. In the meantime, you will have to re-install navis directly from its GitHub repository to get the fixed version:

pip3 uninstall navis -y 
pip3 install git+https://github.com/navis-org/navis@master

Thanks for catching & reporting this! Please let me know if the fix works for you.

Finally, one last bit of advice: .pkl is not a good format for sharing data because (a) it only works if we are running compatible versions of pandas and (b) it's unsafe as it could in theory contain malicious code and I have no way of checking beforehand. If I didn't know you from the FlyWire Slack I would not have opened that file on my machine. Best to share .feather or .csv files for DataFrames and .npy for plain arrays.
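
As a rough sketch of that advice, the snippet below writes a made-up edge table to .csv and an ID array to .npy, then reads both back. The file names are placeholders and the table only reuses two of the seed IDs from this thread for illustration. The 64-bit IDs survive as long as the columns stay integer-typed.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Made-up edge table using FlyWire-sized IDs
edges = pd.DataFrame({
    "source": [720575940623804106],
    "target": [720575940608900050],
    "weight": [0.5],
})
seeds = np.array([720575940623804106, 720575940608900050], dtype=np.int64)

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "edges.csv")  # placeholder file name
    npy_path = os.path.join(tmp, "seeds.npy")  # placeholder file name

    edges.to_csv(csv_path, index=False)  # plain text, nothing is executed on load
    np.save(npy_path, seeds)

    edges_back = pd.read_csv(csv_path)
    seeds_back = np.load(npy_path)       # allow_pickle defaults to False

print(edges_back.dtypes)
print(seeds_back)
```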

Best, Philipp

Also: CC @sdorkenw

LeeJinmook commented 10 months ago

Yes, you were right on all points. An empty data frame was returned, which simply means nothing matched the filter. That result was to be expected; sorry about that. Your fix also worked: I re-installed navis and it now works very well. The simulation runs as I wanted.

Finally, thanks for your advice about common data formats. As my first question showed, I'm not good at sharing data or asking questions. Every bit of advice helps me do better next time. Thank you!

Jinmook Lee