BlueBrain / MorphIO

A python and C++ library for reading and writing neuronal morphologies
https://morphio.readthedocs.io
Apache License 2.0

Cannot read this morphology #456

Closed marwan-abdellah closed 1 year ago

marwan-abdellah commented 1 year ago

While trying to load an SWC morphology using MorphIO, I get this error:

  File "/home/abdellah/blender/bluebrain-blender-3.1/blender-neuromorphovis/3.1/scripts/addons/neuromorphovis/nmv/file/readers/morphology/morphio_reader.py", line 184, in read_data_from_file
    morphio_morphology = Morphology(self.morphology_file)
morphio._morphio.SomaError: 
/abdellah1/diadem-challenge-reconstructions/104198-F-000000_seg001_linesetTransformRelease.swc:2:error
Found a soma point with a neurite as parent

The morphology is attached.

104198-F-000000_seg001_linesetTransformRelease.zip

eleftherioszisis commented 1 year ago

The first point is not of soma type (1) but rather a custom neurite type (6). Is that intended?

1 6 -84.61 -137.5 20.87 1.000000 -1

marwan-abdellah commented 1 year ago

@eleftherioszisis I just found it here http://www.flycircuit.tw/modules.php?name=browsing&parent=browsing&op=list_gene and was trying to build it. So I don't really know.

eleftherioszisis commented 1 year ago

Does it work if you replace 6 -> 1 in the first line? File formats are sometimes not standardized, and for SWC the first rows should represent the soma.
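A minimal sketch of the edit suggested above, in plain Python with no MorphIO dependency. It assumes whitespace-separated columns in the order `id type x y z r pid` and `#` comment lines; the function name `set_first_sample_type` is hypothetical and not part of any library.

```python
def set_first_sample_type(swc_text: str, new_type: int = 1) -> str:
    """Return swc_text with the type column of the first data row replaced.

    Comment lines (starting with '#') and blank lines are left untouched.
    Assumes columns are: id type x y z r pid.
    """
    out = []
    patched = False
    for line in swc_text.splitlines():
        stripped = line.strip()
        if not patched and stripped and not stripped.startswith("#"):
            fields = stripped.split()
            fields[1] = str(new_type)  # second column holds the type
            line = " ".join(fields)
            patched = True
        out.append(line)
    return "\n".join(out) + "\n"
```

The patched text can be written to a new file and passed to `Morphology(...)` as in the traceback above. Whether the file then loads was not confirmed in this thread, and relabeling a point as soma changes what the file claims about the cell, so the change should be a deliberate choice.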

mgeplf commented 1 year ago

first rows should represent the soma.

I can't remember if we're picky about the soma being the first rows (we are internally; they should be standardized). However, what we don't allow is a soma point with a neurite as parent, because that breaks all the iteration invariants: the soma is considered the root node.
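To find which rows trigger this kind of error before handing a file to MorphIO, the rule can be approximated in a few lines of Python. This is only an illustration of the rule as stated in the error message, not MorphIO's actual check; the function name is hypothetical and the column order `id type x y z r pid` is assumed.

```python
SOMA_TYPE = 1  # type id that labels soma points in SWC


def soma_points_with_neurite_parent(swc_text):
    """Return ids of soma samples whose parent sample is not a soma sample."""
    types = {}
    parents = {}
    for line in swc_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        fields = stripped.split()
        sample_id = int(fields[0])
        types[sample_id] = int(fields[1])
        parents[sample_id] = int(fields[6])

    bad = []
    for sample_id, type_id in types.items():
        parent_id = parents[sample_id]
        # A root sample (parent -1) has no parent to check.
        # A parent id that is missing from the file is also reported.
        if (type_id == SOMA_TYPE and parent_id != -1
                and types.get(parent_id) != SOMA_TYPE):
            bad.append(sample_id)
    return bad
```

For a file whose root is type 6 and whose second sample is a soma point attached to it, the function reports sample 2.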

marwan-abdellah commented 1 year ago

Hey @mgeplf I have just noticed over the last few days that there are several morphologies that identify the soma with an id that is not 1 in the SWC file. This makes me wonder whether we have to make MorphIO aware of such morphologies and load them anyway! These morphologies, while readable by other SWC neuron readers, can be processed by neither Brayns nor NMV because we use MorphIO to load them.
I am attaching one of these morphologies, downloaded from the BigNeuron project.

1_1_Live_2-2-2010_9-52-24_AM_med_Red.tif_uint8.v3dpbd.zip

CC: @jplanasc

marwan-abdellah commented 1 year ago

One more morphology with a branch ID of 18 !!!

# generated by Vaa3D Plugin sort_neuron_swc
# source file(s): /local1/home/Hanbo/Hanchuan_curated/checked3_fruitfly_taiwan_flycircuit/uint8_ChaMARCM-F000106_seg001.lsm_c_3.tif/uint8_ChaMARCM-F000106_seg001.lsm_c_3.tif.v3dpbd.swc
# id,type,x,y,z,r,pid
1 18 230 609 19 8 -1
2 18 230 608 19 8 1
3 18 230 607 19 7 2
4 18 230 606 19 6 3
5 18 230 605 20 5 4
6 18 230 604 20 5 5
7 18 230 603 20 4 6
8 18 230 602 20 4 7
9 18 230 601 21 3 8
10 18 230 600 21 3 9
11 18 230 599 21 3 10
12 18 230 598 21 3 11
13 18 230 597 21 3 12
14 18 230 596 21 3 13
15 18 230 595 21 3 14
16 18 230 594 22 3 15
17 18 230 593 22 3 16
18 18 230 592 22 3 17
19 18 230 591 22 3 18
20 18 230 590 22 3 19
....

Can't we just handle any branch with any id and consider it a basal dendrite just for the sake of loading it?

uint8_ChaMARCM-F000106_seg001.lsm_c_3.tif.v3dpbd.zip

mgeplf commented 1 year ago

Hey @mgeplf I have just noticed over the last few days that there are several morphologies that identify the soma with an id that is not 1 in the SWC file.

According to the "spec", 1 is the soma. I don't think it's wise to start trying to detect what is a soma, vs what is not a soma for files that are non-compliant.

Can't we just handle any branch with any id and consider it a basal dendrite just for the sake of loading it?

Arbitrarily assuming something is a basal dendrite also seems dangerous. We can add more "custom" types here: https://github.com/BlueBrain/MorphIO/blob/master/include/morphio/enums.h#L81 but we're up to 10 already, and there is some amount of consensus that above 10 should be invalid.
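To see which type ids a file actually uses before deciding what to do with it, a small tally can help. The sketch below is plain Python and assumes the column order `id type x y z r pid`. The default limit of 10 comes from the comment above, not from reading the enum, and both function names are hypothetical.

```python
from collections import Counter


def type_histogram(swc_text):
    """Count how many samples carry each type id."""
    counts = Counter()
    for line in swc_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        counts[int(stripped.split()[1])] += 1
    return counts


def out_of_range_types(swc_text, max_type=10):
    """Return the sorted type ids that are negative or above max_type."""
    return sorted(t for t in type_histogram(swc_text)
                  if t < 0 or t > max_type)
```

On the first rows of the file quoted earlier, where every sample has type 18, `out_of_range_types` returns `[18]`. The tally only reports what is in the file; it does not guess which samples were meant to be soma or dendrite.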

marwan-abdellah commented 1 year ago

@mgeplf I agree with you. Nonetheless, it is really interesting that several morphologies that are even used in certain cases for validation are really not following the standard format! This is really really bad. I would therefore close the ticket.

mgeplf commented 1 year ago

Nonetheless, it is really interesting that several morphologies [...] are really not following the standard format!

Yeah, I find it strange, considering the "specification" is so simple, and quite relaxed about things (too much so, IMO)

marwan-abdellah commented 1 year ago

@mgeplf This paper concludes that 166 datasets were given as gold standard for validation. What I notice is that Vaa3D is used to create the reconstructions and then export the segmented skeletons to SWC files.

mgeplf commented 1 year ago

@mgeplf This paper concludes that 166 datasets were given as gold standard for validation.

Interesting. I see that Ascoli is an author, and there are many Allen people. There appear to be efforts to make the SWC specification clearer, and Ascoli is part of that, as are Allen people: https://swc-specification.readthedocs.io/en/latest/governance.html

This includes the specification: https://swc-specification.readthedocs.io/en/latest/swc.html which says:

The first point in the file must have a ParentID equal to -1, which represents the root point.

and that the soma is labeled as type 1 (the table doesn't display well, see this PR for the fix: https://github.com/INCF/swc-specification/pull/7)
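The two points quoted from the specification can be checked with a short, hedged Python sketch. It covers only those two points, not the full specification, and assumes the column order `id type x y z r pid`; the function name is hypothetical. A file without any type 1 point is reported as an observation only, since the quoted text does not say that a soma must be present.

```python
def check_quoted_rules(swc_text):
    """Report findings for the two rules quoted from the SWC specification."""
    rows = [line.split() for line in swc_text.splitlines()
            if line.strip() and not line.lstrip().startswith("#")]
    if not rows:
        return ["no sample rows found"]

    findings = []
    # Rule 1: the first point must have ParentID equal to -1.
    if int(rows[0][6]) != -1:
        findings.append("first point does not have ParentID -1")
    # Rule 2: the soma is labeled as type 1 (observation, not an error).
    if not any(int(row[1]) == 1 for row in rows):
        findings.append("no point is labeled as soma (type 1)")
    return findings
```

Applied to the type 18 rows shown earlier in the thread, the first rule passes and only the soma observation is returned.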