janelia-flyem / gala

Automatic segmentation of electron microscopy volumes
BSD 3-Clause "New" or "Revised" License

Request Docker image or updated version of GALA #97

Open Khoa-NT opened 3 years ago

Khoa-NT commented 3 years ago

Dear Gala team,

Thank you for creating GALA. Currently, the library has dependency problems.

I tried to solve them by changing the source code to work with current versions of libraries such as scikit-image, but I still cannot fix it.

Do you have any plans to maintain it? Or do you have a Docker image for GALA?

jni commented 3 years ago

Hi @Khoa-NT!

A blast from the past! =) Can you tell me what you would like to use from gala specifically? Indeed I have not been maintaining it for a while, to the point that `pip install gala` is currently something else. I gave away the name to an astronomy package with a growing user community.

But anyway, I would be happy to help you get this going. We don't have a docker image, no. I think the fastest path to getting things working would be to update the code for recent versions of libraries like you were doing. Perhaps you can tell me where you are getting stuck?

Khoa-NT commented 3 years ago

Hi @jni ,

Thank you for your reply.

> Can you tell me what you would like to use from gala specifically?

I would like to get the oversegmentation result from GALA for an EM image dataset.

> But anyway, I would be happy to help you get this going. We don't have a docker image, no. I think the fastest path to getting things working would be to update the code for recent versions of libraries like you were doing. Perhaps you can tell me where you are getting stuck?

Here is what I did when I tried to run `python test_agglo.py` following this guide:

ImportError: cannot import name 'comb' (https://stackoverflow.com/questions/47151453/sklearn-import-error-importerror-cannot-import-name-comb). Replace `from scipy.misc import comb as nchoosek` with `from scipy.special import comb as nchoosek`.

ImportError: cannot import name 'joblib' (https://stackoverflow.com/questions/61893719/importerror-cannot-import-name-joblib-from-sklearn-externals). Replace `from sklearn.externals import joblib` with `import joblib`.

ImportError: cannot import name 'factorial' (https://stackoverflow.com/questions/56283294/importerror-cannot-import-name-factorial). Replace `from scipy.misc import factorial` with `from scipy.special import factorial`.
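
A version-tolerant alternative to editing each import by hand is to try the new location first and fall back to the old one. The sketch below is only an illustration: `import_first` is a hypothetical helper, not part of gala, and the final `math` fallbacks are an assumption added here so the snippet runs even without SciPy. Note that `math.comb` and `math.factorial` accept only scalars and return Python ints, unlike the SciPy functions.

```python
import importlib


def import_first(*candidates):
    """Return the first importable attribute from (module, attribute) pairs."""
    for module_name, attr in candidates:
        try:
            return getattr(importlib.import_module(module_name), attr)
        except (ImportError, AttributeError):
            continue  # try the next candidate location
    raise ImportError(f"none of {candidates!r} could be imported")


# New SciPy location first, then the old one, then a stdlib stand-in.
nchoosek = import_first(
    ("scipy.special", "comb"),
    ("scipy.misc", "comb"),
    ("math", "comb"),  # assumption: scalar-only fallback, Python 3.8+
)
factorial = import_first(
    ("scipy.special", "factorial"),
    ("scipy.misc", "factorial"),
    ("math", "factorial"),  # assumption: scalar-only fallback
)

print(nchoosek(5, 2), factorial(4))
```

The same pattern would not help with `joblib`, which has to be installed as its own package once `sklearn.externals.joblib` is gone.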

The next error is

python test_agglo.py
 25%|█████████████████████▊                                                                 | 1/4 [00:00<?, ?it/s]
 67%|██████████████████████████████████████████████████████████                             | 2/3 [00:00<?, ?it/s]
 25%|█████████████████████▊                                                                 | 1/4 [00:00<?, ?it/s]
 67%|██████████████████████████████████████████████████████████                             | 2/3 [00:00<?, ?it/s]
EEE.EEEEE
======================================================================
ERROR: test_agglo.test_agglomeration
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "A:\Segmentation\GALA\gala\tests\test_agglo.py", line 53, in test_agglomeration
    normalize_probabilities=True)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 500, in __init__
    self.build_graph_from_watershed()
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 640, in build_graph_from_watershed
    self.build_edges_fast()
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 678, in build_edges_fast
    edge_map, self.boundaries = agglo2.sparse_boundaries(edges_coo)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo2.py", lin
e 67, in sparse_boundaries
    bounds = sparselol.extents(edge_labels, input_indices=coo_boundaries.data)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\sparselol.py",
line 44, in extents
    extents_count(labels.ravel(), indptr.copy(), input_indices, out=indices)
  File "gala\sparselol_cy.pyx", line 8, in gala.sparselol_cy.extents_count
ValueError: Buffer dtype mismatch, expected 'Py_ssize_t' but got 'long'

======================================================================
ERROR: test_agglo.test_ladder_agglomeration
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "A:\Segmentation\GALA\gala\tests\test_agglo.py", line 66, in test_ladder_agglomeration
    assert_allclose(ev.vi(g.get_segmentation(), results[i]), 0.0,
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\evaluate.py", l
ine 746, in vi
    return np.dot(weights, split_vi(x, y, ignore_x, ignore_y))
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\evaluate.py", l
ine 781, in split_vi
    _, _, _ , hxgy, hygx, _, _ = vi_tables(x, y, ignore_x, ignore_y)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\evaluate.py", l
ine 1166, in vi_tables
    pxy = contingency_table(x, y, ignore_seg=ignore_x, ignore_gt=ignore_y)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\evaluate.py", l
ine 296, in contingency_table
    cont = sparse.coo_matrix((data, (segr, gtr))).tocsr()
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\scipy\sparse\coo.py", line 149, in __init__
    N = operator.index(np.max(col)) + 1
TypeError: 'numpy.float64' object cannot be interpreted as an integer

======================================================================
ERROR: test_agglo.test_no_dam_agglomeration
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "A:\Segmentation\GALA\gala\tests\test_agglo.py", line 72, in test_no_dam_agglomeration
    normalize_probabilities=True)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 500, in __init__
    self.build_graph_from_watershed()
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 640, in build_graph_from_watershed
    self.build_edges_fast()
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 678, in build_edges_fast
    edge_map, self.boundaries = agglo2.sparse_boundaries(edges_coo)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo2.py", lin
e 67, in sparse_boundaries
    bounds = sparselol.extents(edge_labels, input_indices=coo_boundaries.data)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\sparselol.py",
line 44, in extents
    extents_count(labels.ravel(), indptr.copy(), input_indices, out=indices)
  File "gala\sparselol_cy.pyx", line 8, in gala.sparselol_cy.extents_count
ValueError: Buffer dtype mismatch, expected 'Py_ssize_t' but got 'long'

======================================================================
ERROR: test_agglo.test_mito
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "A:\Segmentation\GALA\gala\tests\test_agglo.py", line 90, in test_mito
    assert_allclose(ev.vi(g.get_segmentation(), results[i]), 0.0,
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\evaluate.py", l
ine 746, in vi
    return np.dot(weights, split_vi(x, y, ignore_x, ignore_y))
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\evaluate.py", l
ine 781, in split_vi
    _, _, _ , hxgy, hygx, _, _ = vi_tables(x, y, ignore_x, ignore_y)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\evaluate.py", l
ine 1166, in vi_tables
    pxy = contingency_table(x, y, ignore_seg=ignore_x, ignore_gt=ignore_y)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\evaluate.py", l
ine 296, in contingency_table
    cont = sparse.coo_matrix((data, (segr, gtr))).tocsr()
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\scipy\sparse\coo.py", line 149, in __init__
    N = operator.index(np.max(col)) + 1
TypeError: 'numpy.float64' object cannot be interpreted as an integer

======================================================================
ERROR: test_agglo.test_mask
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "A:\Segmentation\GALA\gala\tests\test_agglo.py", line 101, in test_mask
    assert 3 not in g
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 844, in __contains__
    new_value = self.forward_map(value)
TypeError: 'dict' object is not callable

======================================================================
ERROR: test_agglo.test_traverse
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "A:\Segmentation\GALA\gala\tests\test_agglo.py", line 111, in test_traverse
    g = agglo.Rag(np.array(labels))
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 500, in __init__
    self.build_graph_from_watershed()
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 640, in build_graph_from_watershed
    self.build_edges_fast()
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 678, in build_edges_fast
    edge_map, self.boundaries = agglo2.sparse_boundaries(edges_coo)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo2.py", lin
e 67, in sparse_boundaries
    bounds = sparselol.extents(edge_labels, input_indices=coo_boundaries.data)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\sparselol.py",
line 44, in extents
    extents_count(labels.ravel(), indptr.copy(), input_indices, out=indices)
  File "gala\sparselol_cy.pyx", line 8, in gala.sparselol_cy.extents_count
ValueError: Buffer dtype mismatch, expected 'Py_ssize_t' but got 'long'

======================================================================
ERROR: test_agglo.test_best_possible_segmentation
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "A:\Segmentation\GALA\gala\tests\test_agglo.py", line 125, in test_best_possible_segmentation
    best = agglo.best_possible_segmentation(ws, gt)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 2112, in best_possible_segmentation
    ws = Rag(ws)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 500, in __init__
    self.build_graph_from_watershed()
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 640, in build_graph_from_watershed
    self.build_edges_fast()
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 678, in build_edges_fast
    edge_map, self.boundaries = agglo2.sparse_boundaries(edges_coo)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo2.py", lin
e 67, in sparse_boundaries
    bounds = sparselol.extents(edge_labels, input_indices=coo_boundaries.data)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\sparselol.py",
line 44, in extents
    extents_count(labels.ravel(), indptr.copy(), input_indices, out=indices)
  File "gala\sparselol_cy.pyx", line 8, in gala.sparselol_cy.extents_count
ValueError: Buffer dtype mismatch, expected 'Py_ssize_t' but got 'long'

======================================================================
ERROR: test_agglo.test_set_ground_truth
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "A:\Segmentation\GALA\gala\tests\test_agglo.py", line 133, in test_set_ground_truth
    g = agglo.Rag(np.array(labels))
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 500, in __init__
    self.build_graph_from_watershed()
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 640, in build_graph_from_watershed
    self.build_edges_fast()
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 678, in build_edges_fast
    edge_map, self.boundaries = agglo2.sparse_boundaries(edges_coo)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo2.py", lin
e 67, in sparse_boundaries
    bounds = sparselol.extents(edge_labels, input_indices=coo_boundaries.data)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\sparselol.py",
line 44, in extents
    extents_count(labels.ravel(), indptr.copy(), input_indices, out=indices)
  File "gala\sparselol_cy.pyx", line 8, in gala.sparselol_cy.extents_count
ValueError: Buffer dtype mismatch, expected 'Py_ssize_t' but got 'long'

======================================================================
ERROR: test_agglo.test_split_vi
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "A:\Segmentation\GALA\gala\tests\test_agglo.py", line 141, in test_split_vi
    g = agglo.Rag(np.array(labels))
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 500, in __init__
    self.build_graph_from_watershed()
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 640, in build_graph_from_watershed
    self.build_edges_fast()
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line
 678, in build_edges_fast
    edge_map, self.boundaries = agglo2.sparse_boundaries(edges_coo)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo2.py", lin
e 67, in sparse_boundaries
    bounds = sparselol.extents(edge_labels, input_indices=coo_boundaries.data)
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\sparselol.py",
line 44, in extents
    extents_count(labels.ravel(), indptr.copy(), input_indices, out=indices)
  File "gala\sparselol_cy.pyx", line 8, in gala.sparselol_cy.extents_count
ValueError: Buffer dtype mismatch, expected 'Py_ssize_t' but got 'long'

======================================================================
ERROR: test_agglo.test_manual_agglo_fast_rag
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
TypeError: test_manual_agglo_fast_rag() missing 1 required positional argument: 'dummy_data'

======================================================================
ERROR: test_agglo.test_mean_agglo_fast_rag
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\shaol\Anaconda3\envs\gala\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
TypeError: test_mean_agglo_fast_rag() missing 1 required positional argument: 'dummy_data'

----------------------------------------------------------------------
Ran 15 tests in 0.037s

FAILED (errors=11)
(gala)
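
Two of the error messages in this log can be illustrated without gala. `Buffer dtype mismatch, expected 'Py_ssize_t' but got 'long'` is typical of 64-bit Windows, where C `long` is 4 bytes but `Py_ssize_t` is 8; on 64-bit Linux both are 8 bytes, which would explain why the same code behaves differently there. `'numpy.float64' object cannot be interpreted as an integer` is raised when a float is passed where an index is required. The standard-library sketch below shows both effects. The remedies mentioned in the comments (casting to `np.intp`, or to an integer before indexing) are assumptions about what gala would need, not a tested patch, and `as_index` is a hypothetical helper.

```python
import ctypes
import operator

# Size in bytes of C `long` and of `ssize_t` (the C type behind Py_ssize_t).
long_size = ctypes.sizeof(ctypes.c_long)
ssize_size = ctypes.sizeof(ctypes.c_ssize_t)
print(f"long: {long_size} bytes, Py_ssize_t: {ssize_size} bytes")
if long_size != ssize_size:
    # Typical of 64-bit Windows. A Cython buffer declared as Py_ssize_t will
    # reject an array of C longs; casting with arr.astype(np.intp) is the
    # usual remedy (assumption: not verified against gala itself).
    print("C long and Py_ssize_t differ on this platform")


def as_index(value):
    """Convert an integral float such as 3.0 to an int usable as an index."""
    if float(value) != int(value):
        raise ValueError(f"{value!r} is not an integral value")
    return operator.index(int(value))


try:
    operator.index(3.0)  # floats are rejected, like the float64 in the log
except TypeError as exc:
    print("TypeError:", exc)

print(as_index(3.0))
```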
Khoa-NT commented 3 years ago

Hi @jni, Do you have any plans for updating GALA?

jni commented 3 years ago

Yes, sorry, I started on this before the Christmas break but didn't manage to finish it. Having a bit of trouble getting tests to pass with networkx 2.x... I think things should be fixable though!

jni commented 3 years ago

@Khoa-NT the new version from master should work with modern libraries, see #98. Could you try to install it and see if it works?

jni commented 3 years ago

(I will warn you that gala is slow and inefficient — I do not expect it to be able to segment large volumes. How large is your dataset?)

Khoa-NT commented 3 years ago

Hi @jni I'm sorry for the late reply.

> (I will warn you that gala is slow and inefficient — I do not expect it to be able to segment large volumes. How large is your dataset?)

I'm using the CREMI dataset (1250x1250 sections) to generate an oversegmentation from a boundary prediction (binary segmentation) result. I didn't use affinities.

> @Khoa-NT the new version from master should work with modern libraries, see #98. Could you try to install it and see if it works?

I installed the dependencies with `pip install -r requirement.txt`. I didn't use the conda environment.yml.

I tested with example.py and test_agglo.py, but I still got errors in both tests on Windows and on Linux. The errors also differ between the two, as shown in the collapsed sections below.

example.py on Windows, conda, Python 3.6:

```python
(new_gala) A:\Segmentation\GALA\gala\tests\example-data>python example.py
Traceback (most recent call last):
  File "example.py", line 15, in <module>
    g_train = agglo.Rag(ws_train, pr_train, feature_manager=fc)
  File "C:\Users\shaol\Anaconda3\envs\new_gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line 495, in __init__
    self.build_graph_from_watershed()
  File "C:\Users\shaol\Anaconda3\envs\new_gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line 648, in build_graph_from_watershed
    self.build_edges_fast()
  File "C:\Users\shaol\Anaconda3\envs\new_gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo.py", line 686, in build_edges_fast
    edge_map, self.boundaries = agglo2.sparse_boundaries(edges_coo)
  File "C:\Users\shaol\Anaconda3\envs\new_gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\agglo2.py", line 67, in sparse_boundaries
    bounds = sparselol.extents(edge_labels, input_indices=coo_boundaries.data)
  File "C:\Users\shaol\Anaconda3\envs\new_gala\lib\site-packages\gala-0.5.dev0-py3.6-win-amd64.egg\gala\sparselol.py", line 43, in extents
    extents_count(labels.ravel(), indptr.copy(), input_indices, out=indices)
  File "gala\sparselol_cy.pyx", line 8, in gala.sparselol_cy.extents_count
ValueError: Buffer dtype mismatch, expected 'Py_ssize_t' but got 'long'
```

example.py on Linux, conda, Python 3.6. It fails when run with the multi-channel probability map. Is it OK for p4_train (50, 100, 200, 4) and ws_train (50, 100, 200) to have different shapes?

```python
(new_gala_36) root@granada:/workspace/Segmentation/GALA/gala/tests/example-data# python example.py
((1109, 33), (1109,))
 76%|###################################################################################################################################################5 | 162/213 [00:18<00:05, 8.65it/s]
Traceback (most recent call last):
  File "example.py", line 36, in <module>
    (X4, y4, w4, merges4) = g_train4.learn_agglomerate(gt_train, fc)[0]
  File "/opt/conda/envs/new_gala_36/lib/python3.6/site-packages/gala-0.5.dev0-py3.6-linux-x86_64.egg/gala/agglo.py", line 1258, in learn_agglomerate
    g.rebuild_merge_queue()
  File "/opt/conda/envs/new_gala_36/lib/python3.6/site-packages/gala-0.5.dev0-py3.6-linux-x86_64.egg/gala/agglo.py", line 963, in rebuild_merge_queue
    self.merge_queue = self.build_merge_queue()
  File "/opt/conda/envs/new_gala_36/lib/python3.6/site-packages/gala-0.5.dev0-py3.6-linux-x86_64.egg/gala/agglo.py", line 944, in build_merge_queue
    weights = self.merge_priority_function(self, edges)
  File "/opt/conda/envs/new_gala_36/lib/python3.6/site-packages/gala-0.5.dev0-py3.6-linux-x86_64.egg/gala/agglo.py", line 289, in predict
    prediction = classifier.predict_proba(features)[:, 1]
  File "/opt/conda/envs/new_gala_36/lib/python3.6/site-packages/sklearn/ensemble/_forest.py", line 674, in predict_proba
    X = self._validate_X_predict(X)
  File "/opt/conda/envs/new_gala_36/lib/python3.6/site-packages/sklearn/ensemble/_forest.py", line 422, in _validate_X_predict
    return self.estimators_[0]._validate_X_predict(X, check_input=True)
  File "/opt/conda/envs/new_gala_36/lib/python3.6/site-packages/sklearn/tree/_classes.py", line 403, in _validate_X_predict
    reset=False)
  File "/opt/conda/envs/new_gala_36/lib/python3.6/site-packages/sklearn/base.py", line 437, in _validate_data
    self._check_n_features(X, reset=reset)
  File "/opt/conda/envs/new_gala_36/lib/python3.6/site-packages/sklearn/base.py", line 366, in _check_n_features
    f"X has {n_features} features, but {self.__class__.__name__} "
ValueError: X has 62 features, but DecisionTreeClassifier is expecting 120 features as input.
```

test_agglo.py on Windows, conda, Python 3.6:

```python
(new_gala) A:\Segmentation\GALA\gala\tests>python test_agglo.py
 25%|███████████████████████ | 1/4 [00:00
```

test_agglo.py on Linux, conda, Python 3.6:

```python
(new_gala_36) root@granada:/workspace/Segmentation/GALA/gala/tests# python test_agglo.py
 67%|##################################################################################################################################6 | 2/3 [00:00<00:00, 1938.67it/s]
 25%|################################################# | 1/4 [00:00<00:00, 3238.84it/s]
 67%|##################################################################################################################################6 | 2/3 [00:00<00:00, 2928.98it/s]
 50%|################################################################################################## | 1/2 [00:00<00:00, 2920.82it/s]
 25%|################################################# | 1/4 [00:00<00:00, 4009.85it/s]
 67%|##################################################################################################################################6 | 2/3 [00:00<00:00, 3557.51it/s]
.......EE
======================================================================
ERROR: test_agglo.test_manual_agglo_fast_rag
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/conda/envs/new_gala_36/lib/python3.6/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
TypeError: test_manual_agglo_fast_rag() missing 1 required positional argument: 'dummy_data'

======================================================================
ERROR: test_agglo.test_mean_agglo_fast_rag
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/conda/envs/new_gala_36/lib/python3.6/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
TypeError: test_mean_agglo_fast_rag() missing 1 required positional argument: 'dummy_data'

----------------------------------------------------------------------
Ran 15 tests in 0.059s

FAILED (errors=2)
```
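
The two remaining errors, `missing 1 required positional argument: 'dummy_data'`, look like what happens when a test that expects an injected fixture is called with no arguments, as the nose runner in the traceback does. Whether `dummy_data` is a pytest fixture in gala's test suite is an assumption here; if it is, running the file with `pytest` rather than `python test_agglo.py` would supply the argument. The toy runner below (all names hypothetical, and far simpler than pytest's real mechanism) shows the difference between the two calling styles.

```python
import inspect

# Hypothetical fixture registry: maps a parameter name to a factory function.
FIXTURES = {"dummy_data": lambda: [1, 2, 3]}


def check_sum(dummy_data):
    """A test-like function that only works if the runner supplies `dummy_data`."""
    assert sum(dummy_data) == 6


def run_plain(test):
    """Call the test with no arguments, as a fixture-unaware runner would."""
    try:
        test()
    except TypeError as exc:
        return f"ERROR: {exc}"
    return "ok"


def run_with_fixtures(test):
    """Look up each parameter name in FIXTURES and pass the built values."""
    params = inspect.signature(test).parameters
    kwargs = {name: FIXTURES[name]() for name in params}
    test(**kwargs)
    return "ok"


print(run_plain(check_sum))
print(run_with_fixtures(check_sum))
```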

Would you mind checking again?

Update 1: Can I ask some more questions?

1. According to example.py, does the ground truth need to include a background label for gala to run?
2. Since the code runs with single-channel data, I tried it on the predicted boundaries for CREMI, following example.py, but the agglomeration result is still the same as the input. I therefore tried the following settings, but they are very slow:

(X, y, w, merges) = g_train.learn_agglomerate(train_val_gt_vol, fc,
                                              max_num_epochs=100,
                                              min_num_epochs=50,
                                             )[0]
...
rf = classify.DefaultRandomForest(
    n_estimators=1000,
    max_depth=None,
).fit(X, y)

What do you think?

Update 2: The agglomeration result from the above setup is still the same as the input. Can you help me?

# examine how well we did with either learning approach, or mean agglomeration
import numpy as np
results = np.vstack((
    ev.split_vi(test_ws_vol, test_gt_vol),
    ev.split_vi(seg_testm, test_gt_vol),
    ev.split_vi(seg_test1, test_gt_vol),
    ))

print(results)
[[2.75917572 4.17499905]
 [5.81819676 3.78630661]
 [2.76895469 4.17240544]]
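
For reading the table above: each row is a pair of conditional entropies between a segmentation and the ground truth. As far as gala's documentation describes it, `split_vi(x, y)` returns H(Y|X) and H(X|Y), so with the ground truth as `y` the first column measures under-segmentation (false merges) and the second over-segmentation (false splits); treat that column order as an assumption to check against the installed version. Under that reading, the third row (learned agglomeration) is almost identical to the first (watershed input), which matches the observation that the result is the same as the input. The standard-library sketch below computes the same two quantities for flat label lists. It is an illustration, not gala's implementation, which builds a sparse contingency table, and the base-2 logarithm is a choice made here.

```python
from collections import Counter
from math import log2


def split_vi(seg, gt):
    """Return (H(gt|seg), H(seg|gt)) in bits for two equal-length label lists."""
    n = len(seg)
    joint = Counter(zip(seg, gt))  # co-occurrence counts of (seg, gt) labels
    seg_counts = Counter(seg)
    gt_counts = Counter(gt)
    # H(Y|X) = -sum_xy p(x, y) * log2(p(x, y) / p(x)); the 1/n factors cancel
    # inside the logarithm, leaving count ratios.
    h_gt_given_seg = -sum(
        c / n * log2(c / seg_counts[s]) for (s, _), c in joint.items()
    )
    h_seg_given_gt = -sum(
        c / n * log2(c / gt_counts[g]) for (_, g), c in joint.items()
    )
    return h_gt_given_seg, h_seg_given_gt


gt = [1, 1, 2, 2]
merged = [1, 1, 1, 1]      # everything merged: pure under-segmentation
shattered = [1, 2, 3, 4]   # every pixel its own segment: pure over-segmentation

print(split_vi(merged, gt))
print(split_vi(shattered, gt))
```

In this toy case the merged labeling puts all of its error in the first component and the shattered labeling puts all of it in the second.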
Khoa-NT commented 3 years ago

Hi @jni. Would you mind helping me?

jni commented 3 years ago

@Khoa-NT apologies for the long delay. I haven't had a chance to investigate yet but the first thing I notice is that you're using Python 3.6, while the current versions of most of our dependencies only support 3.7+. Perhaps I was testing with different versions. Could you try in a 3.7 env?

Regarding performance, I already expected that it would be very slow. 50 epochs is more than I've ever done — if I remember correctly, in the original paper I already was seeing saturation after ~10 epochs of learning.

Anyway, this is still on my pile, I am just really really busy these past few weeks... Sorry about the delay...!

jni commented 3 years ago

@Khoa-NT were you able to try Py3.7? Could you make sure you update all software versions? And then make sure to compile from scratch? The Cython problems will probably go away after recompiling...

Khoa-NT commented 3 years ago

Hi @jni, I built gala from source with `python setup.py install` and installed the packages in requirement.txt on Python 3.8.5.

The example for multiple channels still has an error.

```python
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in <module>
      1 # note: the feature manager works transparently with multiple channels!
      2 g_train4 = agglo.Rag(ws_train, p4_train, feature_manager=fc)
----> 3 (X4, y4, w4, merges4) = g_train4.learn_agglomerate(gt_train, fc)[0]
      4 y4 = y4[:, 0]
      5 print((X4.shape, y4.shape))

~/.local/lib/python3.8/site-packages/gala-0.5.dev0-py3.8-linux-x86_64.egg/gala/agglo.py in learn_agglomerate(self, gts, feature_map, min_num_samples, learn_flat, learning_mode, labeling_mode, priority_mode, memory, unique, random_state, max_num_epochs, min_num_epochs, max_num_samples, classifier, active_function, mpf)
   1256             g.show_progress = False # bug in MergeQueue usage causes
   1257                                     # progressbar crash.
-> 1258             g.rebuild_merge_queue()
   1259             alldata.append(g.learn_epoch(ctables, feature_map,
   1260                                          learning_mode=learning_mode,

~/.local/lib/python3.8/site-packages/gala-0.5.dev0-py3.8-linux-x86_64.egg/gala/agglo.py in rebuild_merge_queue(self)
    961         build_merge_queue
    962         """
--> 963         self.merge_queue = self.build_merge_queue()
    964
    965

~/.local/lib/python3.8/site-packages/gala-0.5.dev0-py3.8-linux-x86_64.egg/gala/agglo.py in build_merge_queue(self)
    942         edges = self.real_edges()
    943         if edges:
--> 944             weights = self.merge_priority_function(self, edges)
    945         else:
    946             weights = []

~/.local/lib/python3.8/site-packages/gala-0.5.dev0-py3.8-linux-x86_64.egg/gala/agglo.py in predict(g, edges)
    287                                  for n1, n2 in edges[~boundary]])
    288         if features.size > 0:
--> 289             prediction = classifier.predict_proba(features)[:, 1]
    290         else:
    291             prediction = np.array([])

/opt/conda/lib/python3.8/site-packages/sklearn/ensemble/_forest.py in predict_proba(self, X)
    672         check_is_fitted(self)
    673         # Check data
--> 674         X = self._validate_X_predict(X)
    675
    676         # Assign chunk of trees to jobs

/opt/conda/lib/python3.8/site-packages/sklearn/ensemble/_forest.py in _validate_X_predict(self, X)
    420         check_is_fitted(self)
    421
--> 422         return self.estimators_[0]._validate_X_predict(X, check_input=True)
    423
    424     @property

/opt/conda/lib/python3.8/site-packages/sklearn/tree/_classes.py in _validate_X_predict(self, X, check_input)
    400         """Validate the training data on predict (probabilities)."""
    401         if check_input:
--> 402             X = self._validate_data(X, dtype=DTYPE, accept_sparse="csr",
    403                                     reset=False)
    404             if issparse(X) and (X.indices.dtype != np.intc or

/opt/conda/lib/python3.8/site-packages/sklearn/base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
    435
    436         if check_params.get('ensure_2d', True):
--> 437             self._check_n_features(X, reset=reset)
    438
    439         return out

/opt/conda/lib/python3.8/site-packages/sklearn/base.py in _check_n_features(self, X, reset)
    363
    364         if n_features != self.n_features_in_:
--> 365             raise ValueError(
    366                 f"X has {n_features} features, but {self.__class__.__name__} "
    367                 f"is expecting {self.n_features_in_} features as input.")

ValueError: X has 62 features, but DecisionTreeClassifier is expecting 120 features as input.
```

From the 1250x1250 boundary probability maps of the CREMI dataset, I generated an over-segmentation by watershed. Then I followed the example to train on the watershed over-segmentation. After training, I ran inference on the test set. The score did not improve.

cjhKA commented 2 years ago

I have the same problem too. Has it been solved yet? @Khoa-NT @jni

jni commented 2 years ago

Hi both,

Unfortunately I don't have time to further update this software. I would very much welcome pull requests to get things working again, but otherwise I guess it should be considered "archived" software, that may or may not work.

I presume you are wanting to use gala as a baseline for a more modern method. But segmentation has truly undergone a revolution in the time since gala was published, and I don't think that it is a good baseline any more. Therefore I would suggest that you use a more modern method such as flood-filling networks as the baseline.

Sorry that I couldn't be more help.

jni commented 2 years ago

At any rate it looks like I don't even have write access to this repo anymore! :astonished:

jni commented 2 years ago

> it looks like I don't even have write access to this repo anymore!

Update: thanks to @DocSavage this is now fixed. :sweat_smile: