hammerlab / cytokit

Microscopy Image Cytometry Toolkit
Apache License 2.0

Error during analysis step #15

Closed mkeays closed 4 years ago

mkeays commented 4 years ago

Hello,

I am running Cytokit using the example CODEX BALBc-1 mouse spleen dataset, with the config file set up as in this YAML file, with line 130 uncommented to try to produce the CellProfiler exports. I am using Cytokit via a Singularity container created from the Docker image at docker://eczech/cytokit:latest. I ran the first two steps (with "processor run_all" and "operator run_all") set out in this script, without any problems. But then I ran into problems during the "analysis" step.

First, I got this error:

python /lab/repos/cytokit/python/pipeline/cytokit/cli/main.py analysis run_all --config-path=/users/keays/cytokit_testing/Goltsev_mouse_spleen/experiment.yaml --data-dir=/users/keays/cytokit_testing/Goltsev_mouse_spleen/output
2019-12-02 10:26:00,281:INFO:39536:root: Running cytometry statistics aggregation
2019-12-02 10:26:30,798:INFO:39536:cytokit.function.core: Saved cytometry aggregation results to csv at "/users/keays/cytokit_testing/Goltsev_mouse_spleen/output/cytometry/data.csv"
2019-12-02 10:26:31,949:INFO:39536:cytokit.function.core: Saved cytometry aggregation results to fcs at "/users/keays/cytokit_testing/Goltsev_mouse_spleen/output/cytometry/data.fcs"
Traceback (most recent call last):
  File "/lab/repos/cytokit/python/pipeline/cytokit/cli/main.py", line 32, in <module>
    main()
  File "/lab/repos/cytokit/python/pipeline/cytokit/cli/main.py", line 28, in main
    fire.Fire(Cytokit)
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "/lab/repos/cytokit/python/pipeline/cytokit/cli/__init__.py", line 167, in run_all
    fn(**{**config[op], **params})
TypeError: cellprofiler_quantification() got an unexpected keyword argument 'export_db_objects_separately'

I removed the export_db_objects_separately: true from line 130, and then I got this error instead:

python /lab/repos/cytokit/python/pipeline/cytokit/cli/main.py analysis run_all --config-path=/users/keays/cytokit_testing/Goltsev_mouse_spleen/experiment.yaml --data-dir=/users/keays/cytokit_testing/Goltsev_mouse_spleen/output
2019-12-02 10:31:37,242:INFO:39625:root: Running cytometry statistics aggregation
2019-12-02 10:32:07,818:INFO:39625:cytokit.function.core: Saved cytometry aggregation results to csv at "/users/keays/cytokit_testing/Goltsev_mouse_spleen/output/cytometry/data.csv"
2019-12-02 10:32:08,988:INFO:39625:cytokit.function.core: Saved cytometry aggregation results to fcs at "/users/keays/cytokit_testing/Goltsev_mouse_spleen/output/cytometry/data.fcs"
2019-12-02 10:32:08,989:INFO:39625:root: Running CellProfiler image quantification pipeline
INFO:__main__:Loading experiment configuration from file "/users/keays/cytokit_testing/Goltsev_mouse_spleen/experiment.yaml"
INFO:__main__:Extracting expression channel images
INFO:__main__:Extracting object images
Traceback (most recent call last):
  File "/lab/repos/cytokit/python/external/cellprofiler/cpcli.py", line 450, in <module>
    sys.exit(main())
  File "/lab/repos/cytokit/python/external/cellprofiler/cpcli.py", line 442, in main
    do_extraction=options.do_extraction == 'true'
  File "/lab/repos/cytokit/python/external/cellprofiler/cpcli.py", line 307, in run_quantification
    run_extraction(output_dir, cp_input_dir, channels)
  File "/lab/repos/cytokit/python/external/cellprofiler/cpcli.py", line 275, in run_extraction
    for channel_images in extract(filters, cytometry_image_dir):
  File "/lab/repos/cytokit/python/external/cellprofiler/cpcli.py", line 229, in extract
    raise ValueError('Expecting 5D tile image, got shape {}'.format(img.shape))
ValueError: Expecting 5D tile image, got shape (15, 4, 1008, 1344)
Traceback (most recent call last):
  File "/lab/repos/cytokit/python/pipeline/cytokit/cli/main.py", line 32, in <module>
    main()
  File "/lab/repos/cytokit/python/pipeline/cytokit/cli/main.py", line 28, in main
    fire.Fire(Cytokit)
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "/lab/repos/cytokit/python/pipeline/cytokit/cli/__init__.py", line 167, in run_all
    fn(**{**config[op], **params})
  File "/lab/repos/cytokit/python/pipeline/cytokit/cli/analysis.py", line 39, in cellprofiler_quantification
    log_level=self.py_log_level
  File "/lab/repos/cytokit/python/pipeline/cytokit/exec/cellprofiler.py", line 19, in run_quantification
    raise ValueError('CellProfiler cli command returned code {}; Command:\n{}'.format(rc.returncode, cmd))
ValueError: CellProfiler cli command returned code 1; Command:
/opt/conda/envs/cellprofiler/bin/python /lab/repos/cytokit/python/external/cellprofiler/cpcli.py --do-extraction=true --export-csv=true --config-path=/users/keays/cytokit_testing/Goltsev_mouse_spleen/experiment.yaml --output-dir=/users/keays/cytokit_testing/Goltsev_mouse_spleen/output --log-level=20 --export-db=true

I looked at the images under output/cytometry/tile, generated during earlier processing steps, and they are indeed 4D, with 4 "channels" and 15 "slices" according to tiffinfo, but no mention of cycles/frames. Is this expected?
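For anyone checking their own tiles, the dimensionality test that fails above can be reproduced in a few lines. This is only a sketch: the axis order (cycles, z, channels, height, width) is an assumption inferred from the shapes printed in the logs in this thread, and describe_tile is a hypothetical helper, not part of Cytokit.

```python
# Assumed axis order, inferred from shapes such as (18, 15, 3, 1008, 1344)
# that appear in the logs in this thread; not confirmed against the Cytokit source.
EXPECTED_AXES = ("cycles", "z", "channels", "height", "width")


def describe_tile(shape):
    """Map axis names to sizes, or raise if the shape is not 5D.

    Hypothetical helper that mirrors the wording of the error raised in cpcli.py.
    """
    shape = tuple(shape)
    if len(shape) != len(EXPECTED_AXES):
        raise ValueError("Expecting 5D tile image, got shape {}".format(shape))
    return dict(zip(EXPECTED_AXES, shape))


if __name__ == "__main__":
    # A 5D shape like the ones logged by the processor step
    print(describe_tile((18, 15, 3, 1008, 1344)))
    # The 4D shape reported in the error above
    try:
        describe_tile((15, 4, 1008, 1344))
    except ValueError as err:
        print(err)
```

With a real file, the shape would come from whatever TIFF reader is available in the environment; passing the shape tuple keeps the sketch free of extra dependencies.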

eric-czech commented 4 years ago

Hi @mkeays, it looks like build caching was enabled in the Docker auto-builds, which left the cytokit package code out of date in your container. I fixed that and rebuilt the public image, so could you run docker pull eczech/cytokit:latest and try again with export_db_objects_separately: true uncommented? You definitely shouldn't get that "unexpected keyword argument 'export_db_objects_separately'" error.

Let me know if that doesn't work and I'll take a look at the image dimension error (but it too is likely related to the code not having been updated in the public container for a long time).

mkeays commented 4 years ago

Hi Eric,

Thanks very much for the update, I've pulled the new container. However now I'm getting a different error that I haven't seen before, during the processor run_all step (below) -- would you have any idea what's causing this?

2019-12-05 05:31:37,521:INFO:10458:root: Execution arguments and environment saved to "/users/keays/cytokit_testing/Goltsev_mouse_spleen/output/processor/execution/201912051031.json"
2019-12-05 05:31:55,202:INFO:10458:cytokit.exec.pipeline: Starting Pre-processing pipeline for 2 tasks (2 workers)
/lab/repos/cytokit/python/pipeline/cytokit/io.py:137: UserWarning: ImageJ tags do not contain "axes" property (file = /users/keays/cytokit_testing/Goltsev_mouse_spleen/output/processor/tile/R01_X05_Y05.tif, tags = {'frames': 18, 'images': 810, 'slices': 15, 'Ranges': (787.0, 28535.0, 787.0, 28535.0, 787.0, 28535.0), 'max': 28535.0, 'min': 787.0, 'loop': False, 'ImageJ': '1.50g', 'mode': 'composite', 'channels': 3, 'hyperstack': True})
  warnings.warn('ImageJ tags do not contain "axes" property (file = {}, tags = {})'.format(file, tags))
/lab/repos/cytokit/python/pipeline/cytokit/io.py:137: UserWarning: ImageJ tags do not contain "axes" property (file = /users/keays/cytokit_testing/Goltsev_mouse_spleen/output/processor/tile/R01_X01_Y01.tif, tags = {'frames': 18, 'images': 810, 'slices': 15, 'Ranges': (0.0, 19986.0, 0.0, 19986.0, 0.0, 19986.0), 'max': 19986.0, 'min': 0.0, 'loop': False, 'ImageJ': '1.50g', 'mode': 'composite', 'channels': 3, 'hyperstack': True})
  warnings.warn('ImageJ tags do not contain "axes" property (file = {}, tags = {})'.format(file, tags))
Using TensorFlow backend.
Using TensorFlow backend.
2019-12-05 05:31:56,844:INFO:10548:cytokit.exec.pipeline: Loaded tile 33 for region 1 [shape = (18, 15, 3, 1008, 1344)]
2019-12-05 05:31:56,879:INFO:10547:cytokit.exec.pipeline: Loaded tile 1 for region 1 [shape = (18, 15, 3, 1008, 1344)]
/lab/repos/cytokit/python/pipeline/cytokit/io.py:137: UserWarning: ImageJ tags do not contain "axes" property (file = /users/keays/cytokit_testing/Goltsev_mouse_spleen/output/processor/tile/R01_X02_Y01.tif, tags = {'frames': 18, 'images': 810, 'slices': 15, 'Ranges': (482.0, 44088.0, 482.0, 44088.0, 482.0, 44088.0), 'max': 44088.0, 'min': 482.0, 'loop': False, 'ImageJ': '1.50g', 'mode': 'composite', 'channels': 3, 'hyperstack': True})
  warnings.warn('ImageJ tags do not contain "axes" property (file = {}, tags = {})'.format(file, tags))
/lab/repos/cytokit/python/pipeline/cytokit/io.py:137: UserWarning: ImageJ tags do not contain "axes" property (file = /users/keays/cytokit_testing/Goltsev_mouse_spleen/output/processor/tile/R01_X06_Y05.tif, tags = {'frames': 18, 'images': 810, 'slices': 15, 'Ranges': (1181.0, 21540.0, 1181.0, 21540.0, 1181.0, 21540.0), 'max': 21540.0, 'min': 1181.0, 'loop': False, 'ImageJ': '1.50g', 'mode': 'composite', 'channels': 3, 'hyperstack': True})
  warnings.warn('ImageJ tags do not contain "axes" property (file = {}, tags = {})'.format(file, tags))
2019-12-05 05:32:06,491:INFO:10547:cytokit.exec.pipeline: Focal plane selection complete [tile 1 of 32 (3.12%) | reg/x/y = 1/1/1 | shape (18, 1, 3, 1008, 1344) / dtype uint16]
2019-12-05 05:32:07,155:INFO:10548:cytokit.exec.pipeline: Focal plane selection complete [tile 1 of 31 (3.23%) | reg/x/y = 1/5/5 | shape (18, 1, 3, 1008, 1344) / dtype uint16]
2019-12-05 05:32:10,751:INFO:10547:cytokit.exec.pipeline: Loaded tile 2 for region 1 [shape = (18, 15, 3, 1008, 1344)]
/lab/repos/cytokit/python/pipeline/cytokit/io.py:137: UserWarning: ImageJ tags do not contain "axes" property (file = /users/keays/cytokit_testing/Goltsev_mouse_spleen/output/processor/tile/R01_X03_Y01.tif, tags = {'frames': 18, 'images': 810, 'slices': 15, 'Ranges': (0.0, 24773.0, 0.0, 24773.0, 0.0, 24773.0), 'max': 24773.0, 'min': 0.0, 'loop': False, 'ImageJ': '1.50g', 'mode': 'composite', 'channels': 3, 'hyperstack': True})
  warnings.warn('ImageJ tags do not contain "axes" property (file = {}, tags = {})'.format(file, tags))
distributed.worker - WARNING - Compute Failed
Function: run_preprocess_task
args: ({'tile_prefetch_capacity': 1, 'op_flags': <cytokit.exec.pipeline.OpFlags object at 0x7fedf768ec88>, 'data_dir': '/users/keays/cytokit_testing/Goltsev_mouse_spleen/output', 'region_indexes': array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), 'gpu': 0, 'tile_indexes': array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]), 'output_dir': '/users/keays/cytokit_testing/Goltsev_mouse_spleen/output'})
kwargs: {}
Exception: AttributeError("'RAG' object has no attribute 'node'",)
Traceback (most recent call last):
  File "/lab/repos/cytokit/python/pipeline/cytokit/cli/main.py", line 32, in <module>
    main()
  File "/lab/repos/cytokit/python/pipeline/cytokit/cli/main.py", line 28, in main
    fire.Fire(Cytokit)
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "/lab/repos/cytokit/python/pipeline/cytokit/cli/__init__.py", line 167, in run_all
    fn(**{**config[op], **params})
  File "/lab/repos/cytokit/python/pipeline/cytokit/cli/processor.py", line 131, in run
    pipeline.run(pl_config, logging_init_fn=self._logging_init_fn)
  File "/lab/repos/cytokit/python/pipeline/cytokit/exec/pipeline.py", line 458, in run
    run_tasks(pl_conf, 'Pre-processing', run_preprocess_task, logging_init_fn)
  File "/lab/repos/cytokit/python/pipeline/cytokit/exec/pipeline.py", line 421, in run_tasks
    res = [r.result() for r in res]
  File "/lab/repos/cytokit/python/pipeline/cytokit/exec/pipeline.py", line 421, in <listcomp>
    res = [r.result() for r in res]
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/distributed/client.py", line 227, in result
    six.reraise(*result)
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/six.py", line 695, in reraise
    raise value.with_traceback(tb)
  File "/lab/repos/cytokit/python/pipeline/cytokit/exec/pipeline.py", line 441, in run_preprocess_task
    return run_task(task, ops, preprocess_tile)
  File "/lab/repos/cytokit/python/pipeline/cytokit/exec/pipeline.py", line 375, in run_task
    log_fn('Processing complete')
  File "/lab/repos/cytokit/python/pipeline/cytokit/ops/op.py", line 205, in __exit__
    v.__exit__(type, value, traceback)
  File "/lab/repos/cytokit/python/pipeline/cytokit/ops/op.py", line 157, in __exit__
    raise value
  File "/lab/repos/cytokit/python/pipeline/cytokit/exec/pipeline.py", line 375, in run_task
    log_fn('Processing complete')
  File "/lab/repos/cytokit/python/pipeline/cytokit/ops/op.py", line 53, in __exit__
    raise value
  File "/lab/repos/cytokit/python/pipeline/cytokit/exec/pipeline.py", line 370, in run_task
    process_fn(tile, tile_indices, ops, log_fn, task_config)
  File "/lab/repos/cytokit/python/pipeline/cytokit/exec/pipeline.py", line 235, in preprocess_tile
    tile, cyto_data = ops.cytometry_op.run(tile, best_focus_z_plane=best_focus_z_plane, tile_indices=tile_indices)
  File "/lab/repos/cytokit/python/pipeline/cytokit/ops/op.py", line 178, in run
    res = self._run(*args, **kwargs)
  File "/lab/repos/cytokit/python/pipeline/cytokit/ops/cytometry.py", line 228, in _run
    tile_indices=tile_indices, **self.quantification_params
  File "/lab/repos/cytokit/python/pipeline/cytokit/cytometry/cytometer.py", line 1052, in quantify
    return CytometerBase.quantify(tile, segments, **kwargs)
  File "/lab/repos/cytokit/python/pipeline/cytokit/cytometry/cytometer.py", line 846, in quantify
    feature_values = fn(tile, img_seg, nz, feature_calculators, cell_graph=cell_graph, **kwargs)
  File "/lab/repos/cytokit/python/pipeline/cytokit/cytometry/cytometer.py", line 737, in _quantify_2d
    graph = label_graph.rag_boundary(labels, np.ones(labels.shape))
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/skimage/future/graph/rag.py", line 445, in rag_boundary
    rag.node[n].update({'labels': [n]})
AttributeError: 'RAG' object has no attribute 'node'

eric-czech commented 4 years ago

Bah, yes you can fix it by running pip install networkx==2.0 (you should have 2.4 installed now, since it's the latest one, but you want 2.0). That library is a transitive dependency for skimage and it looks like they introduced some breaking changes this fall. I bounded that dependency already but I'm not sure why it didn't make it to the docker container yet after working fine in the build -- I'll investigate that and leave this open until I figure it out but that should fix it in the meantime.
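As a quick way to tell which side of the breaking change an environment is on, a version check along these lines may help. It is a sketch only: it relies on the Graph.node attribute having been removed in networkx 2.4 (Graph.nodes is the replacement), and both function names are hypothetical helpers rather than anything from Cytokit or networkx.

```python
def parse_version(version):
    """Turn a release string such as '2.0' or '2.4.1' into a tuple of ints.

    Only plain numeric release versions are handled; a non-numeric part
    (for example '4rc1') is skipped.
    """
    return tuple(int(part) for part in version.split(".")[:3] if part.isdigit())


def has_legacy_node_attribute(networkx_version):
    """Return True if this networkx release still provides Graph.node.

    Graph.node was removed in networkx 2.4, which is what breaks the
    rag.node[n] call in the traceback above.
    """
    return parse_version(networkx_version) < (2, 4)


if __name__ == "__main__":
    for candidate in ("2.0", "2.3", "2.4"):
        print(candidate, has_legacy_node_attribute(candidate))
```

Inside the container, the installed version string can be read from networkx.__version__ and passed to the check.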

mkeays commented 4 years ago

Hi Eric, thanks for the explanation -- I will try this and let you know. I'm using a machine I don't have root on though, so it's a bit fiddly to install things -- I'm currently mapping /opt to a writable directory when I start up the Singularity container and trying to install modules into there to see if that works. I'm fairly new to Docker/Singularity though, perhaps I'm just missing something, but when I start it up everything is owned by root and I don't have permission to make any changes.

eric-czech commented 4 years ago

Ah, well I should clarify that you just need to do the pip install in the container itself, not the host, and you should be root in there already. If you launch a terminal from jupyterlab, the whole session should look something like this:

[Screenshot: Screen Shot 2019-12-05 at 9 08 25 AM]

Let me know if you're not root in the container, but that would be surprising.

mkeays commented 4 years ago

So, when I first tried to singularity run the container I got an error that chmod didn't have permission to change some file permissions, so I ended up using singularity exec instead and starting a bash shell inside the container -- I haven't been using Jupyterlab, instead just running things in a terminal. My end goal is to have an automatic pipeline running, so I hopefully won't need a Jupyter instance in the end, though it would indeed be useful for testing. I'll look again at running the container with singularity run instead of using exec and see if I can get it to work as intended.

So no, I'm not root when I start the container, but I'm probably doing it in a weird way and I guess that's why!

eric-czech commented 4 years ago

Ah I see, well in that case I found why it was working in the build but not in the container so if you do another docker pull eczech/cytokit:latest, that "'RAG' object has no attribute 'node'" error should go away (I tried it and it worked for me). I'm not familiar with singularity but that's strange that it's able to add users on top of an existing container image -- which sounds precarious given that users are handled so differently by different docker image developers. Not that it's crucial but out of curiosity, what user do you end up as when you use the bash entrypoint instead?

mkeays commented 4 years ago

Hi Eric, I've pulled the new container and can confirm I no longer get the error about "'RAG' object has no attribute 'node'". I'm now getting an error at the very end of the analysis step from MySQL:

INFO:__main__:Extracting object images
INFO:__main__:Saving pipeline to path "/users/keays/cytokit/mouse-spleen/output/cytometry/cellprofiler/pipeline.cppipe"
INFO:__main__:Running CP pipeline
ERROR:root:Failed to prepare run for module ExportToDatabase
Traceback (most recent call last):
  File "/lab/repos/CellProfiler/cellprofiler/pipeline.py", line 2097, in prepare_run
    if ((not module.prepare_run(workspace)) or
  File "/lab/repos/CellProfiler/cellprofiler/modules/exporttodatabase.py", line 2037, in prepare_run
    raise RuntimeError(message)
RuntimeError: MySQL Error: maximum columns reached. Try exporting a single object per table. Problematic table: Exp_Per_Object
INFO:__main__:CP pipeline run complete; results at: /users/keays/cytokit/mouse-spleen/output/cytometry/cellprofiler/results
2019-12-10 04:07:34,508:INFO:17411:root: CellProfiler image quantification pipeline complete

I don't think this part is super critical for me right now, we are most interested in the data in the CSV/FCS files generated before this error occurs.

In general I use singularity exec to start the container like this:

singularity exec --nv -B /users/keays/cytokit:/users/keays/cytokit -B /users/keays/cytokit/lab/data/:/lab/data /users/keays/singularity/cytokit_latest.sif bash

I always end up as my own user (keays in this case) when I do this. In order to run the pipeline I need to then activate the cytokit conda env inside the container and then run python /lab/repos/cytokit/python/pipeline/cytokit/cli/main.py plus arguments, since the chmod and adding to $PATH didn't happen I guess.
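For an automated pipeline, the same invocation could be assembled programmatically, roughly as sketched below. The function name and the config path are hypothetical. Calling /opt/conda/envs/cytokit/bin/python directly instead of activating the conda env is an assumption: the path is inferred from the site-packages paths in the tracebacks, and skipping activation may leave environment variables unset.

```python
import shlex


def singularity_exec_argv(image, binds, command, gpu=True):
    """Build an argv list for `singularity exec` (hypothetical helper).

    binds is a list of (host_path, container_path) pairs, each passed with -B.
    Only flags that already appear in this thread (--nv and -B) are used.
    """
    argv = ["singularity", "exec"]
    if gpu:
        argv.append("--nv")
    for host_path, container_path in binds:
        argv += ["-B", "{}:{}".format(host_path, container_path)]
    argv.append(image)
    argv += list(command)
    return argv


if __name__ == "__main__":
    argv = singularity_exec_argv(
        image="/users/keays/singularity/cytokit_latest.sif",
        binds=[
            ("/users/keays/cytokit", "/users/keays/cytokit"),
            ("/users/keays/cytokit/lab/data/", "/lab/data"),
        ],
        command=[
            # Assumed interpreter path, inferred from the tracebacks
            "/opt/conda/envs/cytokit/bin/python",
            "/lab/repos/cytokit/python/pipeline/cytokit/cli/main.py",
            "analysis",
            "run_all",
            # Hypothetical config location, for illustration only
            "--config-path=/users/keays/cytokit/mouse-spleen/experiment.yaml",
            "--data-dir=/users/keays/cytokit/mouse-spleen/output",
        ],
    )
    # Print a copy-pasteable command line; subprocess.run(argv) would execute it
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Building a list rather than a single string avoids shell quoting problems if a path ever contains spaces.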

I have tried to singularity run the container with a couple of different options but it is still failing. When I use the --writable option I get the error FATAL: no SIF writable overlay partition found in /pylon5/mc5pi4p/keays/singularity/cytokit_latest.sif. If I don't use that option, I get chmod: changing permissions of '/lab/repos/cytokit/python/pipeline/cytokit/cli/main.py': Read-only file system.

eric-czech commented 4 years ago

Do you have export_db_objects_separately set to true? It should fail on the CODEX data without that because CP is trying to create a table with (num objects * num features * num channels) columns, which is not the best design for multiplexed imaging, but I don't think they ever planned on having 20+ channels. With separate object tables it should come in under the threshold, though.
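To make the arithmetic concrete, here is a rough sketch. The object and feature counts are hypothetical, the channel count assumes the 18 cycles x 3 channels seen in the processor logs are all quantified, and 4096 is MySQL's documented hard limit on columns per table; the threshold CellProfiler actually checks is not stated in this thread.

```python
# MySQL's documented hard limit on columns per table, used here only for
# illustration; the limit CellProfiler enforces is not stated in this thread.
MYSQL_MAX_COLUMNS = 4096


def columns_per_table(n_objects, n_features, n_channels):
    """Column count when every object/feature/channel combination gets a column."""
    return n_objects * n_features * n_channels


def fits(n_columns, limit=MYSQL_MAX_COLUMNS):
    """Return True if a table with this many columns stays within the limit."""
    return n_columns <= limit


if __name__ == "__main__":
    n_channels = 18 * 3  # cycles x channels per cycle, taken from the logs
    n_features = 50      # hypothetical number of features per channel
    n_objects = 2        # hypothetical, e.g. one cell and one nucleus object

    combined = columns_per_table(n_objects, n_features, n_channels)
    separate = columns_per_table(1, n_features, n_channels)
    print("single combined table:", combined, "columns, fits:", fits(combined))
    print("one table per object: ", separate, "columns, fits:", fits(separate))
```

With these made-up counts the combined table needs 5400 columns while each per-object table needs 2700, which is the effect of export_db_objects_separately described above.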

Seems like you tried this, but you could also flip off export_db: true here to only get the csv if you don't care about using the CP Analyst GUI.

Although, if you go this route then you might as well comment out that whole "cellprofiler_quantification" line in the configuration because if you just want csv/fcs files, then you should use the ones exported from Cytokit instead since it will be much faster (and they are equivalent modulo naming conventions and some advanced image features CP can compute but cytokit cannot).

As far as singularity goes, it looks like there really is no way to avoid forcing a new user into the container definition (singularity security model) 🤷🏻‍♂️. I'm still not clear on how that works but regardless, it's only the entrypoint for the docker container that's failing so to fix it I moved the part that requires root perms into the build of the container instead. It'll take an hour or so, but you should see a new build finish here and after that, you could pull it and try singularity run again.

Failing that, it looks like you could also do:

singularity exec --nv -B /users/keays/cytokit:/users/keays/cytokit -B /users/keays/cytokit/lab/data/:/lab/data /users/keays/singularity/cytokit_latest.sif jupyter lab --ip=0.0.0.0 --port=8888

That may be useful if you need to change the port jupyter runs on -- not sure how port mapping works with Singularity. I can't say for certain everything else will work running as a non-root user, but that would at least get you in.

mkeays commented 4 years ago

> Do you have export_db_objects_separately set to true? It should fail on the CODEX data without that because CP is trying to create a table with (num objects * num features * num channels) columns, which is not the best design for multiplexed imaging, but I don't think they ever planned on having 20+ channels. With separate object tables it should come in under the threshold, though.

Ahh I see, that makes sense then, thanks for explaining! I have tried without the DB exports and it works great.

Thanks also for building yet another new container, I will try that and let you know if it works.

eric-czech commented 4 years ago

Sure thing! FYI the container build is done.

mkeays commented 4 years ago

Great -- I've pulled it and can confirm that singularity run works correctly, and fires up the Jupyter server. Thanks!

mkeays commented 4 years ago

I'll close this now as the errors are gone -- thanks :)