hammerlab / cytokit

Microscopy Image Cytometry Toolkit
Apache License 2.0

Issue Executing marker_profiling_example.ipynb #35

Status: Closed (closed by r0f1 3 years ago)

r0f1 commented 3 years ago

I started the Docker image and am trying to execute the example notebook marker_profiling_example.ipynb. I get an error in the 8th cell. My guess is that the h5 file is not downloaded correctly in the background.

variant_dir = osp.join(out_dir, 'v00')
!cytokit processor run_all --config-path=$variant_dir/config --data-dir=$raw_dir --output-dir=$variant_dir

Error message:

2021-04-20 11:05:20,412:INFO:381:root: Execution arguments and environment saved to "/tmp/cytokit-example/cellular-marker/20181116-d40-r1-20x-5by5/output/v00/processor/execution/202104201105.json"
2021-04-20 11:05:34,536:INFO:381:cytokit.exec.pipeline: Starting Pre-processing pipeline for 1 tasks (1 workers)
Using TensorFlow backend.
distributed.worker - WARNING -  Compute Failed
Function:  run_preprocess_task
args:      ({'output_dir': '/tmp/cytokit-example/cellular-marker/20181116-d40-r1-20x-5by5/output/v00', 'op_flags': <cytokit.exec.pipeline.OpFlags object at 0x7f2202d777b8>, 'data_dir': '/tmp/cytokit-example/cellular-marker/20181116-d40-r1-20x-5by5/raw', 'tile_indexes': array([0]), 'gpu': 0, 'tile_prefetch_capacity': 1, 'region_indexes': array([0])})
kwargs:    {}
Exception: OSError('Unable to open file (file signature not found)',)

Traceback (most recent call last):
  File "/usr/local/bin/cytokit", line 32, in <module>
    main()
  File "/usr/local/bin/cytokit", line 28, in main
    fire.Fire(Cytokit)
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "/lab/repos/cytokit/python/pipeline/cytokit/cli/__init__.py", line 167, in run_all
    fn(**{**config[op], **params})
  File "/lab/repos/cytokit/python/pipeline/cytokit/cli/processor.py", line 131, in run
    pipeline.run(pl_config, logging_init_fn=self._logging_init_fn)
  File "/lab/repos/cytokit/python/pipeline/cytokit/exec/pipeline.py", line 458, in run
    run_tasks(pl_conf, 'Pre-processing', run_preprocess_task, logging_init_fn)
  File "/lab/repos/cytokit/python/pipeline/cytokit/exec/pipeline.py", line 421, in run_tasks
    res = [r.result() for r in res]
  File "/lab/repos/cytokit/python/pipeline/cytokit/exec/pipeline.py", line 421, in <listcomp>
    res = [r.result() for r in res]
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/distributed/client.py", line 227, in result
    six.reraise(*result)
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/six.py", line 702, in reraise
    raise value.with_traceback(tb)
  File "/lab/repos/cytokit/python/pipeline/cytokit/exec/pipeline.py", line 441, in run_preprocess_task
    return run_task(task, ops, preprocess_tile)
  File "/lab/repos/cytokit/python/pipeline/cytokit/exec/pipeline.py", line 355, in run_task
    with ops:
  File "/lab/repos/cytokit/python/pipeline/cytokit/ops/op.py", line 200, in __enter__
    v.__enter__()
  File "/lab/repos/cytokit/python/pipeline/cytokit/ops/op.py", line 152, in __enter__
    self.initialize()
  File "/lab/repos/cytokit/python/pipeline/cytokit/ops/cytometry.py", line 136, in initialize
    self.cytometer.initialize()
  File "/lab/repos/cytokit/python/pipeline/cytokit/cytometry/cytometer.py", line 610, in initialize
    self.model.load_weights(self.weights_path or self._get_weights_path())
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/keras/engine/network.py", line 1157, in load_weights
    with h5py.File(filepath, mode='r') as f:
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/h5py/_hl/files.py", line 408, in __init__
    swmr=swmr)
  File "/opt/conda/envs/cytokit/lib/python3.5/site-packages/h5py/_hl/files.py", line 173, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (file signature not found)

Any ideas on how to solve this?

eric-czech commented 3 years ago

That looks likely. You could check under /lab/data/.cytokit/cache to see if an empty h5 file was downloaded. I don't know why, but that does happen sometimes, e.g. https://github.com/hammerlab/cytokit/issues/21.

If you delete that file, or the whole cache directory, and run it again, that usually fixes it.
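The "file signature not found" error means h5py did not find the HDF5 signature in the file it was given. A minimal sketch for spotting such files in the cache follows. The helper names `has_hdf5_signature` and `find_bad_h5_files` are made up for illustration and are not part of cytokit, and the check only looks at byte offset 0, which is where the signature normally sits for files written by h5py or Keras:

```python
import os

# The 8-byte signature that begins an HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def has_hdf5_signature(path):
    """Return True if the file at `path` starts with the HDF5 signature.

    Assumption: the superblock is at offset 0. The HDF5 format also allows
    offsets 512, 1024, ... when a user block is present; those are not checked.
    """
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


def find_bad_h5_files(cache_dir):
    """Walk `cache_dir` and return the .h5 paths that lack the signature."""
    bad = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            if name.endswith(".h5") and not has_hdf5_signature(path):
                bad.append(path)
    return sorted(bad)
```

Calling `find_bad_h5_files("/lab/data/.cytokit/cache")` inside the container would list the candidates to delete before re-running the pipeline. If h5py is importable, `h5py.is_hdf5(path)` performs a more thorough version of the same check.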

r0f1 commented 3 years ago

Unfortunately, it is still not working. I copied the following code into the Jupyter cell above:

def _save_response_content(response, destination):
    chunk_size = 32768
    with open(destination, "wb") as f:
        for chunk in response.iter_content(chunk_size):
            if chunk:  # filter out keep-alive new chunks
                f.write(chunk)

def _get_confirm_token(response):
    for key, value in response.cookies.items():
        if key.startswith('download_warning'):
            return value
    return None

import requests
path = "./test.h5"
url = "https://docs.google.com/uc?export=download"
session = requests.Session()
response = session.get(url, params={'id': "1I9j4oABbcV8EnvO_ufACXP9e4KyfHMtE"}, stream=True)
token = _get_confirm_token(response)
if token:
    params = {'id': "1I9j4oABbcV8EnvO_ufACXP9e4KyfHMtE", 'confirm': token}
    response = session.get(url, params=params, stream=True)
_save_response_content(response, path)

I then computed the checksum and the file size:

!md5sum ./*.h5
!ls -halt ./*.h5

b6cd7e93bc7a96c2dc33f819aa3ac651  ./test.h5
-rw-r--r-- 1 root root 141 Apr 20 13:52 ./test.h5

Both values match those of the file located at /lab/data/.cytokit/cache/cytometry/model/unet_v2_weights.h5 exactly.

However, I suspect that both .h5 files are corrupted or empty. 141 bytes seems far too small to hold all the weights of the U-Net. Could it be that the original file in Google Drive is no longer available?
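One way to test that suspicion is to look at what the 141 bytes actually contain, since a failed download is often an error page or a short message rather than binary data. The sketch below is only illustrative: `describe_download` is a hypothetical helper, and the 1 MiB `min_expected_bytes` threshold is an assumed lower bound, not the real size of the weights file:

```python
import os


def describe_download(path, min_expected_bytes=1024 * 1024):
    """Roughly classify a downloaded file by its size and first bytes.

    `min_expected_bytes` is an assumed threshold for "plausibly a weights
    file"; adjust it if the real size is known.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(64)

    if head.startswith(b"\x89HDF\r\n\x1a\n"):
        kind = "hdf5"      # starts with the HDF5 signature
    elif head.lstrip().lower().startswith((b"<!doctype", b"<html")):
        kind = "html"      # probably an error or redirect page
    elif size == 0:
        kind = "empty"
    else:
        kind = "unknown"

    return {
        "size": size,
        "kind": kind,
        "too_small": size < min_expected_bytes,
        "head": head,      # raw leading bytes, for manual inspection
    }
```

Running `describe_download("./test.h5")` and printing the `head` entry would show whether the server returned a message in place of the weights.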

eric-czech commented 3 years ago

"Could it be that the original file in Google Drive is no longer available?"

Hmm, it looks like that's the case. Sadly, I think the only way to get the weights back would be to submit an issue to CellProfiler-plugins. The code that downloads the weights is roughly taken from https://github.com/CellProfiler/CellProfiler-plugins/blob/0a63e2dc71dc6a99b6112c3be70f8c2dc9301d2a/CellProfiler4_AutoConvert/classifypixelsunet.py#L149-L151. They could probably tell you where the file was moved to.

r0f1 commented 3 years ago

Thanks, I raised an issue over there. They are working on it. Therefore I am closing this issue :)

eric-czech commented 3 years ago

Nice, I subscribed to https://github.com/CellProfiler/CellProfiler-plugins/issues/121 and will try to post an update here depending on the solution.