BlueBrain / morphoclass

Neuronal morphology preparation and classification using Machine Learning.
https://morphoclass.readthedocs.io
Apache License 2.0

Chance accuracy seems to depend on the neurite #68

Closed FrancescoCasalegno closed 2 years ago

FrancescoCasalegno commented 2 years ago

Context

When looking at the results of the performance table it looks like, for some datasets, the chance accuracy depends on the choice of the neurite!

So either there is a bug in how we compute chance accuracy, or the ground truths used for each experiment are different. After some inspection, it looks like this is indeed due to different ground truths, because running

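from pathlib import Path

import numpy as np
import torch

# `checkpoints` (list of checkpoint directories) and `chance_agreement`
# are assumed to be defined earlier in the session.
for chk in checkpoints:
    chk_f = Path(chk) / "checkpoint.chk"
    print(chk)
    d = torch.load(chk_f)
    # Pool the ground-truth labels of all splits and sort them.
    x = np.sort(np.concatenate([split["ground_truths"] for split in d["splits"]]))
    print(f"{chance_agreement(x)} -- {list(x)}")
    print()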

gives

./checkpoints-lida-alt-neurites/janelia-L5-all-cnn-tmd-stratified-k-fold
0.31285444234404536 -- [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]

./checkpoints-lida-alt-neurites/janelia-L5-axon-cnn-tmd-stratified-k-fold
0.31285444234404536 -- [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]

./checkpoints-lida-alt-neurites/janelia-L5-basal-cnn-tmd-stratified-k-fold
0.3165432098765432 -- [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]

./checkpoints-lida/janelia-L5-cnn-tmd-stratified-k-fold
0.3289795918367347 -- [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3]
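
The values above are consistent with chance agreement being the sum of squared class frequencies, i.e. the expected accuracy of a classifier that guesses labels at random following the empirical class distribution. Here is a minimal sketch under that assumption. The function `chance_agreement_sketch` is a stand-in written for illustration, not necessarily what `chance_agreement` does in morphoclass. It shows why dropping a single cell is enough to shift the number:

```python
import numpy as np


def chance_agreement_sketch(labels):
    """Sum of squared class frequencies (assumed definition, see lead-in)."""
    _, counts = np.unique(labels, return_counts=True)
    freqs = counts / counts.sum()
    return float(np.sum(freqs**2))


# Class counts read off the outputs above.
all_neurites = [0] * 6 + [1] * 8 + [2] * 21 + [3] * 11  # 46 cells
basal = [0] * 6 + [1] * 8 + [2] * 21 + [3] * 10  # 45 cells, one class-3 cell fewer

print(chance_agreement_sketch(all_neurites))  # approx. 0.3129
print(chance_agreement_sketch(basal))  # approx. 0.3165
```

With this definition the chance accuracy depends only on the class counts, so any experiment that silently loses cells will report a different baseline.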


FrancescoCasalegno commented 2 years ago

Update

This issue seems to be related to the fact that different feature extraction methods produce a different number of output cells for the same dataset. That is, feature extraction probably fails for some morphologies.

After fixing some bugs (see in particular #73), only two datasets still show a different number of cells across neurites and methods. Indeed, running the following snippet inside dvc/extract-features (on feed8f456499b0cc1c234e000054a294d23668ef)

from pathlib import Path

print("dataset                     n_cells")
print("----------------------------------------------")
for dataset in sorted(Path.cwd().glob("*")):
    # Distinct cell counts seen across all neurite/method combinations.
    counts = set()
    for neurite in sorted(dataset.glob("*")):
        for method in sorted(neurite.glob("*")):
            if method.is_file():
                continue
            counts.add(len(list(method.glob("*"))))
    summary = ", ".join(str(c) for c in sorted(counts))
    print(f"{dataset.name:<25s}   {summary}")

gives the following output

dataset                     n_cells
----------------------------------------------
in-L1                       105
in-L23                      156
in-L4                       113
in-L5                       111
in-L6                       62
lida-in-merged              423
lida-in-merged-bc-merged    423
lida-janelia-L5             58
pc-L2                       40, 41, 43
pc-L3                       44
pc-L4                       89
pc-L5                       160
pc-L6                       125, 128, 129
FrancescoCasalegno commented 2 years ago

More specifically, here's what we see for pc-L2 and pc-L6:

from pathlib import Path

for dataset in ["pc-L2", "pc-L6"]:
    print("-------", dataset, "-------")
    p = Path("/workdir/dvc/extract-features") / dataset
    for neurite in sorted(p.glob("*")):
        for method in sorted(neurite.glob("*")):
            if method.is_file():
                continue
            # Number of entries in the method directory, reported as n_cells.
            n_cells = len(list(method.glob("*")))
            print(f"[{n_cells}] {method}")
    print()
------- pc-L2 -------
[43] /workdir/dvc/extract-features/pc-L2/all/diagram-deepwalk
[43] /workdir/dvc/extract-features/pc-L2/all/diagram-tmd-proj
[43] /workdir/dvc/extract-features/pc-L2/all/graph-proj
[43] /workdir/dvc/extract-features/pc-L2/all/image-deepwalk
[43] /workdir/dvc/extract-features/pc-L2/all/image-tmd-proj
[43] /workdir/dvc/extract-features/pc-L2/apical/diagram-deepwalk
[43] /workdir/dvc/extract-features/pc-L2/apical/diagram-tmd-proj
[43] /workdir/dvc/extract-features/pc-L2/apical/graph-proj
[43] /workdir/dvc/extract-features/pc-L2/apical/image-deepwalk
[43] /workdir/dvc/extract-features/pc-L2/apical/image-tmd-proj
[41] /workdir/dvc/extract-features/pc-L2/axon/diagram-deepwalk
[41] /workdir/dvc/extract-features/pc-L2/axon/diagram-tmd-proj
[41] /workdir/dvc/extract-features/pc-L2/axon/graph-proj
[41] /workdir/dvc/extract-features/pc-L2/axon/image-deepwalk
[40] /workdir/dvc/extract-features/pc-L2/axon/image-tmd-proj
[43] /workdir/dvc/extract-features/pc-L2/basal/diagram-deepwalk
[43] /workdir/dvc/extract-features/pc-L2/basal/diagram-tmd-proj
[43] /workdir/dvc/extract-features/pc-L2/basal/graph-proj
[43] /workdir/dvc/extract-features/pc-L2/basal/image-deepwalk
[43] /workdir/dvc/extract-features/pc-L2/basal/image-tmd-proj

------- pc-L6 -------
[129] /workdir/dvc/extract-features/pc-L6/all/diagram-deepwalk
[129] /workdir/dvc/extract-features/pc-L6/all/diagram-tmd-proj
[129] /workdir/dvc/extract-features/pc-L6/all/graph-proj
[129] /workdir/dvc/extract-features/pc-L6/all/image-deepwalk
[129] /workdir/dvc/extract-features/pc-L6/all/image-tmd-proj
[129] /workdir/dvc/extract-features/pc-L6/apical/diagram-deepwalk
[129] /workdir/dvc/extract-features/pc-L6/apical/diagram-tmd-proj
[129] /workdir/dvc/extract-features/pc-L6/apical/graph-proj
[129] /workdir/dvc/extract-features/pc-L6/apical/image-deepwalk
[129] /workdir/dvc/extract-features/pc-L6/apical/image-tmd-proj
[128] /workdir/dvc/extract-features/pc-L6/axon/diagram-deepwalk
[128] /workdir/dvc/extract-features/pc-L6/axon/diagram-tmd-proj
[128] /workdir/dvc/extract-features/pc-L6/axon/graph-proj
[128] /workdir/dvc/extract-features/pc-L6/axon/image-deepwalk
[125] /workdir/dvc/extract-features/pc-L6/axon/image-tmd-proj
[129] /workdir/dvc/extract-features/pc-L6/basal/diagram-deepwalk
[129] /workdir/dvc/extract-features/pc-L6/basal/diagram-tmd-proj
[129] /workdir/dvc/extract-features/pc-L6/basal/graph-proj
[129] /workdir/dvc/extract-features/pc-L6/basal/image-deepwalk
[128] /workdir/dvc/extract-features/pc-L6/basal/image-tmd-proj
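
The counts above say how many cells are missing but not which ones. A possible way to find them is to compare the file names in two method directories. This is only a sketch: it assumes each cell is stored as one file whose stem identifies the morphology, which may not match the actual layout. The helper name `missing_cells` and the demo file names are made up for illustration.

```python
import tempfile
from pathlib import Path


def missing_cells(reference_dir, other_dir):
    """Return file stems present in reference_dir but absent from other_dir."""
    ref = {p.stem for p in Path(reference_dir).glob("*")}
    other = {p.stem for p in Path(other_dir).glob("*")}
    return sorted(ref - other)


# Self-contained demo on a throwaway directory tree with made-up cell names.
with tempfile.TemporaryDirectory() as tmp:
    ref_dir = Path(tmp) / "all" / "image-tmd-proj"
    axon_dir = Path(tmp) / "axon" / "image-tmd-proj"
    ref_dir.mkdir(parents=True)
    axon_dir.mkdir(parents=True)
    for name in ["cell_a", "cell_b", "cell_c"]:
        (ref_dir / f"{name}.features").touch()
    for name in ["cell_a", "cell_c"]:
        (axon_dir / f"{name}.features").touch()
    demo_missing = missing_cells(ref_dir, axon_dir)
    print(demo_missing)  # ['cell_b']
```

On the real data one would call it with, e.g., the `all/image-tmd-proj` and `axon/image-tmd-proj` directories of a dataset.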
FrancescoCasalegno commented 2 years ago
Running stage 'features-pc-L2-diagram-deepwalk-axon':
> morphoclass -v extract-features data/final/pyramidal-cells/L2/dataset.csv axon diagram-deepwalk extract-features/pc-L2/axon/diagram-deepwalk
11:46:30 morphoclass.console.main (I) Running them morphoclass entrypoint
11:46:30 morphoclass.console.cmd_extract_features (I) Loading modules and libraries
11:46:34 morphoclass.console.cmd_extract_features (I) Starting feature extraction
11:46:34 morphoclass.console.cmd_extract_features (I) Setting up pre-transforms
11:46:34 morphoclass.console.cmd_extract_features (I) Loading data
11:46:34 morphoclass.console.cmd_extract_features (E) Some morphologies had neurites with a total neurite node count less than 3. This is too little for feature extraction and we'll therefor remove these morphologies from the dataset. Consider inspecting the data to find the cause. The morphologies to remove are:
* data/final/pyramidal-cells/L2/IPC/mtC110800E_idA.h5
* data/final/pyramidal-cells/L2/TPC_B/C090905B.h5
11:46:34 morphoclass.console.cmd_extract_features (I) Extracting features
11:46:39 morphoclass.console.cmd_extract_features (I) Setting the path attributes
11:46:39 morphoclass.console.cmd_extract_features (I) Saving extracted features to disk
11:46:39 morphoclass.console.cmd_extract_features (I) Done.
Updating lock file 'dvc.lock'

To track the changes with git, run:

    git add dvc.lock

To enable auto staging, run:

--
Running stage 'features-pc-L2-diagram-tmd-proj-axon':
> morphoclass -v extract-features data/final/pyramidal-cells/L2/dataset.csv axon diagram-tmd-proj extract-features/pc-L2/axon/diagram-tmd-proj
11:48:07 morphoclass.console.main (I) Running them morphoclass entrypoint
11:48:07 morphoclass.console.cmd_extract_features (I) Loading modules and libraries
11:48:11 morphoclass.console.cmd_extract_features (I) Starting feature extraction
11:48:11 morphoclass.console.cmd_extract_features (I) Setting up pre-transforms
11:48:11 morphoclass.console.cmd_extract_features (I) Loading data
11:48:12 morphoclass.console.cmd_extract_features (E) Some morphologies had neurites with a total neurite node count less than 3. This is too little for feature extraction and we'll therefor remove these morphologies from the dataset. Consider inspecting the data to find the cause. The morphologies to remove are:
* data/final/pyramidal-cells/L2/IPC/mtC110800E_idA.h5
* data/final/pyramidal-cells/L2/TPC_B/C090905B.h5
11:48:12 morphoclass.console.cmd_extract_features (I) Extracting features
11:48:12 morphoclass.console.cmd_extract_features (I) Setting the path attributes
11:48:12 morphoclass.console.cmd_extract_features (I) Saving extracted features to disk
11:48:12 morphoclass.console.cmd_extract_features (I) Done.
Updating lock file 'dvc.lock'

To track the changes with git, run:

    git add dvc.lock

To enable auto staging, run:

--
Running stage 'features-pc-L2-graph-proj-axon':
> morphoclass -v extract-features data/final/pyramidal-cells/L2/dataset.csv axon graph-proj extract-features/pc-L2/axon/graph-proj
11:49:37 morphoclass.console.main (I) Running them morphoclass entrypoint
11:49:37 morphoclass.console.cmd_extract_features (I) Loading modules and libraries
11:49:41 morphoclass.console.cmd_extract_features (I) Starting feature extraction
11:49:41 morphoclass.console.cmd_extract_features (I) Setting up pre-transforms
11:49:41 morphoclass.console.cmd_extract_features (I) Loading data
11:49:41 morphoclass.console.cmd_extract_features (E) Some morphologies had neurites with a total neurite node count less than 3. This is too little for feature extraction and we'll therefor remove these morphologies from the dataset. Consider inspecting the data to find the cause. The morphologies to remove are:
* data/final/pyramidal-cells/L2/IPC/mtC110800E_idA.h5
* data/final/pyramidal-cells/L2/TPC_B/C090905B.h5
11:49:41 morphoclass.console.cmd_extract_features (I) Extracting features
11:49:41 morphoclass.console.cmd_extract_features (I) Setting the path attributes
11:49:41 morphoclass.console.cmd_extract_features (I) Saving extracted features to disk
11:49:41 morphoclass.console.cmd_extract_features (I) Done.
Updating lock file 'dvc.lock'

To track the changes with git, run:

    git add dvc.lock

To enable auto staging, run:

--
Running stage 'features-pc-L2-image-deepwalk-axon':
> morphoclass -v extract-features data/final/pyramidal-cells/L2/dataset.csv axon image-deepwalk extract-features/pc-L2/axon/image-deepwalk
11:51:22 morphoclass.console.main (I) Running them morphoclass entrypoint
11:51:22 morphoclass.console.cmd_extract_features (I) Loading modules and libraries
11:51:26 morphoclass.console.cmd_extract_features (I) Starting feature extraction
11:51:26 morphoclass.console.cmd_extract_features (I) Setting up pre-transforms
11:51:26 morphoclass.console.cmd_extract_features (I) Loading data
11:51:26 morphoclass.console.cmd_extract_features (E) Some morphologies had neurites with a total neurite node count less than 3. This is too little for feature extraction and we'll therefor remove these morphologies from the dataset. Consider inspecting the data to find the cause. The morphologies to remove are:
* data/final/pyramidal-cells/L2/IPC/mtC110800E_idA.h5
* data/final/pyramidal-cells/L2/TPC_B/C090905B.h5
11:51:26 morphoclass.console.cmd_extract_features (I) Extracting features
11:51:31 morphoclass.console.cmd_extract_features (I) Converting diagrams to images
11:51:31 morphoclass.console.cmd_extract_features (I) Setting the path attributes
11:51:31 morphoclass.console.cmd_extract_features (I) Saving extracted features to disk
11:51:31 morphoclass.console.cmd_extract_features (I) Done.
Updating lock file 'dvc.lock'

To track the changes with git, run:

    git add dvc.lock

To enable auto staging, run:
--
Running stage 'features-pc-L2-image-tmd-proj-axon':
> morphoclass -v extract-features data/final/pyramidal-cells/L2/dataset.csv axon image-tmd-proj extract-features/pc-L2/axon/image-tmd-proj
11:53:01 morphoclass.console.main (I) Running them morphoclass entrypoint
11:53:01 morphoclass.console.cmd_extract_features (I) Loading modules and libraries
11:53:05 morphoclass.console.cmd_extract_features (I) Starting feature extraction
11:53:05 morphoclass.console.cmd_extract_features (I) Setting up pre-transforms
11:53:05 morphoclass.console.cmd_extract_features (I) Loading data
11:53:05 morphoclass.console.cmd_extract_features (E) Some morphologies had neurites with a total neurite node count less than 3. This is too little for feature extraction and we'll therefor remove these morphologies from the dataset. Consider inspecting the data to find the cause. The morphologies to remove are:
* data/final/pyramidal-cells/L2/IPC/mtC110800E_idA.h5
* data/final/pyramidal-cells/L2/TPC_B/C090905B.h5
11:53:05 morphoclass.console.cmd_extract_features (I) Extracting features
11:53:05 morphoclass.console.cmd_extract_features (I) Converting diagrams to images
11:53:05 morphoclass.console.cmd_extract_features (E) Some diagrams had fewer than 3 points. This is too few toto compute persistence images and we'll therefore remove these morphologies from the dataset. Consider inspecting the data to find the cause. The morphologies to remove are:
* data/final/pyramidal-cells/L2/TPC_B/sm100617a1-4_idC.h5
11:53:06 morphoclass.console.cmd_extract_features (I) Setting the path attributes
11:53:06 morphoclass.console.cmd_extract_features (I) Saving extracted features to disk
11:53:06 morphoclass.console.cmd_extract_features (I) Done.
Updating lock file 'dvc.lock'

To track the changes with git, run:

    git add dvc.lock
FrancescoCasalegno commented 2 years ago
Running stage 'features-pc-L6-diagram-deepwalk-axon':
> morphoclass -v extract-features data/final/pyramidal-cells/L6/dataset.csv axon diagram-deepwalk extract-features/pc-L6/axon/diagram-deepwalk
12:26:33 morphoclass.console.main (I) Running them morphoclass entrypoint
12:26:33 morphoclass.console.cmd_extract_features (I) Loading modules and libraries
12:26:37 morphoclass.console.cmd_extract_features (I) Starting feature extraction
12:26:37 morphoclass.console.cmd_extract_features (I) Setting up pre-transforms
12:26:37 morphoclass.console.cmd_extract_features (I) Loading data
12:26:37 morphoclass.console.cmd_extract_features (E) Some morphologies had neurites with a total neurite node count less than 3. This is too little for feature extraction and we'll therefor remove these morphologies from the dataset. Consider inspecting the data to find the cause. The morphologies to remove are:
* data/final/pyramidal-cells/L6/TPC_A/Fluo58_right.h5
12:26:37 morphoclass.console.cmd_extract_features (I) Extracting features
12:26:49 morphoclass.console.cmd_extract_features (I) Setting the path attributes
12:26:49 morphoclass.console.cmd_extract_features (I) Saving extracted features to disk
12:26:49 morphoclass.console.cmd_extract_features (I) Done.
Updating lock file 'dvc.lock'

To track the changes with git, run:
--
Running stage 'features-pc-L6-diagram-tmd-proj-axon':
> morphoclass -v extract-features data/final/pyramidal-cells/L6/dataset.csv axon diagram-tmd-proj extract-features/pc-L6/axon/diagram-tmd-proj
12:28:28 morphoclass.console.main (I) Running them morphoclass entrypoint
12:28:28 morphoclass.console.cmd_extract_features (I) Loading modules and libraries
12:28:32 morphoclass.console.cmd_extract_features (I) Starting feature extraction
12:28:32 morphoclass.console.cmd_extract_features (I) Setting up pre-transforms
12:28:32 morphoclass.console.cmd_extract_features (I) Loading data
12:28:33 morphoclass.console.cmd_extract_features (E) Some morphologies had neurites with a total neurite node count less than 3. This is too little for feature extraction and we'll therefor remove these morphologies from the dataset. Consider inspecting the data to find the cause. The morphologies to remove are:
* data/final/pyramidal-cells/L6/TPC_A/Fluo58_right.h5
12:28:33 morphoclass.console.cmd_extract_features (I) Extracting features
12:28:33 morphoclass.console.cmd_extract_features (I) Setting the path attributes
12:28:33 morphoclass.console.cmd_extract_features (I) Saving extracted features to disk
12:28:33 morphoclass.console.cmd_extract_features (I) Done.
Updating lock file 'dvc.lock'

To track the changes with git, run:
--
Running stage 'features-pc-L6-graph-proj-axon':
> morphoclass -v extract-features data/final/pyramidal-cells/L6/dataset.csv axon graph-proj extract-features/pc-L6/axon/graph-proj
12:30:05 morphoclass.console.main (I) Running them morphoclass entrypoint
12:30:05 morphoclass.console.cmd_extract_features (I) Loading modules and libraries
12:30:09 morphoclass.console.cmd_extract_features (I) Starting feature extraction
12:30:09 morphoclass.console.cmd_extract_features (I) Setting up pre-transforms
12:30:09 morphoclass.console.cmd_extract_features (I) Loading data
12:30:10 morphoclass.console.cmd_extract_features (E) Some morphologies had neurites with a total neurite node count less than 3. This is too little for feature extraction and we'll therefor remove these morphologies from the dataset. Consider inspecting the data to find the cause. The morphologies to remove are:
* data/final/pyramidal-cells/L6/TPC_A/Fluo58_right.h5
12:30:10 morphoclass.console.cmd_extract_features (I) Extracting features
12:30:10 morphoclass.console.cmd_extract_features (I) Setting the path attributes
12:30:10 morphoclass.console.cmd_extract_features (I) Saving extracted features to disk
12:30:10 morphoclass.console.cmd_extract_features (I) Done.
Updating lock file 'dvc.lock'

To track the changes with git, run:
--
Running stage 'features-pc-L6-image-deepwalk-axon':
> morphoclass -v extract-features data/final/pyramidal-cells/L6/dataset.csv axon image-deepwalk extract-features/pc-L6/axon/image-deepwalk
12:31:33 morphoclass.console.main (I) Running them morphoclass entrypoint
12:31:33 morphoclass.console.cmd_extract_features (I) Loading modules and libraries
12:31:37 morphoclass.console.cmd_extract_features (I) Starting feature extraction
12:31:37 morphoclass.console.cmd_extract_features (I) Setting up pre-transforms
12:31:37 morphoclass.console.cmd_extract_features (I) Loading data
12:31:38 morphoclass.console.cmd_extract_features (E) Some morphologies had neurites with a total neurite node count less than 3. This is too little for feature extraction and we'll therefor remove these morphologies from the dataset. Consider inspecting the data to find the cause. The morphologies to remove are:
* data/final/pyramidal-cells/L6/TPC_A/Fluo58_right.h5
12:31:38 morphoclass.console.cmd_extract_features (I) Extracting features
12:31:48 morphoclass.console.cmd_extract_features (I) Converting diagrams to images
12:31:50 morphoclass.console.cmd_extract_features (I) Setting the path attributes
12:31:50 morphoclass.console.cmd_extract_features (I) Saving extracted features to disk
12:31:50 morphoclass.console.cmd_extract_features (I) Done.
Updating lock file 'dvc.lock'

--
Running stage 'features-pc-L6-image-tmd-proj-axon':
> morphoclass -v extract-features data/final/pyramidal-cells/L6/dataset.csv axon image-tmd-proj extract-features/pc-L6/axon/image-tmd-proj
14:48:42 morphoclass.console.main (I) Running them morphoclass entrypoint
14:48:42 morphoclass.console.cmd_extract_features (I) Loading modules and libraries
14:48:46 morphoclass.console.cmd_extract_features (I) Starting feature extraction
14:48:46 morphoclass.console.cmd_extract_features (I) Setting up pre-transforms
14:48:46 morphoclass.console.cmd_extract_features (I) Loading data
14:48:47 morphoclass.console.cmd_extract_features (E) Some morphologies had neurites with a total neurite node count less than 3. This is too little for feature extraction and we'll therefor remove these morphologies from the dataset. Consider inspecting the data to find the cause. The morphologies to remove are:
* data/final/pyramidal-cells/L6/TPC_A/Fluo58_right.h5
14:48:47 morphoclass.console.cmd_extract_features (I) Extracting features
14:48:47 morphoclass.console.cmd_extract_features (I) Converting diagrams to images
14:48:47 morphoclass.console.cmd_extract_features (E) Some diagrams had fewer than 3 points. This is too few toto compute persistence images and we'll therefore remove these morphologies from the dataset. Consider inspecting the data to find the cause. The morphologies to remove are:
* data/final/pyramidal-cells/L6/UPC/tkb060128_a1-a2_idD.h5
* data/final/pyramidal-cells/L6/TPC_A/tkb060510b2_ch5_ct_n_db_100x_1.h5
* data/final/pyramidal-cells/L6/IPC/C291101C2.h5
14:48:48 morphoclass.console.cmd_extract_features (I) Setting the path attributes
14:48:48 morphoclass.console.cmd_extract_features (I) Saving extracted features to disk
14:48:48 morphoclass.console.cmd_extract_features (I) Done.
Updating lock file 'dvc.lock'

To track the changes with git, run:

--
Running stage 'features-pc-L6-image-tmd-proj-basal':
> morphoclass -v extract-features data/final/pyramidal-cells/L6/dataset.csv basal image-tmd-proj extract-features/pc-L6/basal/image-tmd-proj
15:30:26 morphoclass.console.main (I) Running them morphoclass entrypoint
15:30:26 morphoclass.console.cmd_extract_features (I) Loading modules and libraries
15:30:30 morphoclass.console.cmd_extract_features (I) Starting feature extraction
15:30:30 morphoclass.console.cmd_extract_features (I) Setting up pre-transforms
15:30:30 morphoclass.console.cmd_extract_features (I) Loading data
15:30:31 morphoclass.console.cmd_extract_features (I) Extracting features
15:30:32 morphoclass.console.cmd_extract_features (I) Converting diagrams to images
15:30:32 morphoclass.console.cmd_extract_features (E) Some diagrams had fewer than 3 points. This is too few toto compute persistence images and we'll therefore remove these morphologies from the dataset. Consider inspecting the data to find the cause. The morphologies to remove are:
* data/final/pyramidal-cells/L6/BPC/rp101228_L5-1_idA.h5
15:30:32 morphoclass.console.cmd_extract_features (I) Setting the path attributes
15:30:32 morphoclass.console.cmd_extract_features (I) Saving extracted features to disk
15:30:32 morphoclass.console.cmd_extract_features (I) Done.
Updating lock file 'dvc.lock'
FrancescoCasalegno commented 2 years ago

Here are the problematic morphologies. Notice that we are displaying the final morphologies, but the raw ones look exactly the same. Plotted using Morphology Viewer.

Pyramidal Cells - L2

Axon, All Feature Extraction Methods

Axon, Image TMD Proj

Pyramidal Cells - L6

Axon, All Feature Extraction Methods

Fluo58_right

Axon, Image TMD Proj

tkb060128_a1-a2_idD tkb060510b2_ch5_ct_n_db_100x_1 C291101C2

Basal, Image TMD Proj

rp101228_L5-1_idA
FrancescoCasalegno commented 2 years ago

Conclusions

Based on the results above (https://github.com/BlueBrain/morphoclass/issues/68#issuecomment-1172346264), we can say the following.

  1. The issues that cause feature extraction to fail for some morphologies are not due to the data pre-processing (e.g. MCAR curation). Indeed, the abnormalities are visible before (raw) as well as after (final) the data preparation.
  2. The issue that causes all feature extraction methods to fail is the absence of any bifurcation in the neurite. This is observed in the following cases:
    • data/final/pyramidal-cells/L2/IPC/mtC110800E_idA.h5: axon has no bifurcations
    • data/final/pyramidal-cells/L2/TPC_B/C090905B.h5: axon has no bifurcations
    • data/final/pyramidal-cells/L6/TPC_A/Fluo58_right.h5: axon has no bifurcations
  3. The issue that causes only Image TMD Proj to fail is the presence of only one bifurcation, so that the TMD diagram has < 3 points, and this apparently "is too few to compute persistence images"[^1]. This is observed in the following cases:
    • data/final/pyramidal-cells/L2/TPC_B/sm100617a1-4_idC.h5: axon has only 1 bifurcation
    • data/final/pyramidal-cells/L6/UPC/tkb060128_a1-a2_idD.h5: axon has only 1 bifurcation
    • data/final/pyramidal-cells/L6/TPC_A/tkb060510b2_ch5_ct_n_db_100x_1.h5: axon has only 1 bifurcation
    • data/final/pyramidal-cells/L6/IPC/C291101C2.h5: axon has only 1 bifurcation
    • data/final/pyramidal-cells/L6/BPC/rp101228_L5-1_idA.h5: basal has only 1 bifurcation

[^1]: On the other hand, no error is raised when computing the TMD diagram! So this check is performed, e.g., for image-tmd-proj but not for diagram-tmd-proj, see here: https://github.com/BlueBrain/morphoclass/blob/ebd177df20ac49a482b1dda6466f82401534c669/src/morphoclass/console/cmd_extract_features.py#L235
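
To tell the two failure modes apart before running feature extraction, one could count bifurcations and terminations directly on the neurite tree. The sketch below represents a tree as a plain list of parent indices, which is a simplification for illustration and not the morphoclass or TMD data structure. It also assumes that a TMD diagram has one point per termination, so an unbranched neurite would give 1 point and a neurite with a single two-way bifurcation would give 2 points, both below the threshold of 3 quoted in the error message.

```python
from collections import Counter


def count_bifurcations(parents):
    """Count nodes with two or more children.

    parents[i] is the index of node i's parent, with -1 for the root.
    """
    n_children = Counter(p for p in parents if p >= 0)
    return sum(1 for n in n_children.values() if n >= 2)


def count_terminations(parents):
    """Count leaf nodes, i.e. nodes that are nobody's parent."""
    has_child = {p for p in parents if p >= 0}
    return sum(1 for i in range(len(parents)) if i not in has_child)


# Unbranched neurite: root -> 1 -> 2 (no bifurcation, one termination).
unbranched = [-1, 0, 1]
# One bifurcation: root -> 1, then node 1 splits into nodes 2 and 3.
single_split = [-1, 0, 1, 1]

for name, tree in [("unbranched", unbranched), ("single_split", single_split)]:
    print(name, count_bifurcations(tree), count_terminations(tree))
```

Under these assumptions, the cells listed in item 2 would look like `unbranched` and those in item 3 like `single_split`.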

FrancescoCasalegno commented 2 years ago

@lidakanari

  1. Can we even solve this issue? It seems to be intrinsic to the morphologies, and if there are 0 or 1 bifurcations, does it really make sense to perform feature extraction?
  2. Any idea why the check "Some diagrams had fewer than 3 points ..." (here) is performed only for TMD images but not for TMD diagrams?
FrancescoCasalegno commented 2 years ago

2022-07-05 Meeting with @lidakanari

[^1]: This function is used by the TMD util tmd.Topology.analysis.get_persistence_image_data() to compute the TMD Image from the TMD Diagram. And this TMD util is being called by our morphoclass extract-features command.

[^2]: Because it's what works best. As a reminder, the default neurite=apical is also used for janelia (which are also pyramidal cells!) while neurite=axon is used for interneurons.