QI2lab / merfish3d-analysis

3D MERFISH data processing
BSD 3-Clause "New" or "Revised" License

Issue with post-processing #3

Closed mabbasi6 closed 6 months ago

mabbasi6 commented 6 months ago

Hi,

I have been trying to post-process two mouse tissue and cell datasets, but I'm getting these errors during the registrations. The options I have selected for postprocessing are hotpixel correction and registration across rounds and tiles.

...
Richardson Lucy Started
0 10 20 30 
Richardson Lucy Finished
Traceback (most recent call last):
File "/home/mabbasi6/.conda/envs/spots3d/lib/python3.10/site-packages/superqt/utils/_qthreading.py", line 617, in reraise
raise e
File "/home/mabbasi6/.conda/envs/spots3d/lib/python3.10/site-packages/superqt/utils/_qthreading.py", line 178, in run
result = self.work()
File "/home/mabbasi6/.conda/envs/spots3d/lib/python3.10/site-packages/superqt/utils/_qthreading.py", line 444, in work
output = self._gen.send(_input)
File "/data/bioprotean/repos/momo/wf-merfish/src/wf_merfish/postprocess/postprocess.py", line 414, in postprocess
data_register_factory.apply_registration_to_bits()
File "/data/bioprotean/repos/momo/wf-merfish/src/wf_merfish/postprocess/DataRegistration.py", line 503, in apply_registration_to_bits
rigid_xform_xyz_um = np.asarray(current_polyDT_channel.attrs['rigid_xform_xyz_um'],dtype=np.float32)
File "/home/mabbasi6/.conda/envs/spots3d/lib/python3.10/site-packages/zarr/attrs.py", line 74, in __getitem__
return self.asdict()[item]
KeyError: 'rigid_xform_xyz_um'
Aborted (core dumped)

@dpshepherd

dpshepherd commented 6 months ago

Can you please post the code in your GitHub directory at the line that generated the error? Pasting about 10 lines before and after it will be helpful.

dpshepherd commented 6 months ago

And a list of packages in your conda environment?

mabbasi6 commented 6 months ago

I'm using post-processing by running python run_postprocessing.py, without any change to the code and just inputting the files, so I'm not sure what code I can/should provide here. The txt file containing the list of packages in the Conda env is attached. spots3d-package-list (2).txt

dpshepherd commented 6 months ago

It's pretty clear from this issue and the previous one that there is a problem with the code in your environment. This code runs on multiple computers in our lab and on our server.

Given that the code in your environment was not correctly pulled last time, I am trying to figure out if that is the problem again.

Can you please compress the directory you pulled and send it to me? I'll run a diff.

dpshepherd commented 6 months ago

Also, have you inspected the directory structure to make sure it matches what the code is trying to create? That error indicates the registration was not run.

Additionally, are the codebook and bit order files correct? For example, no empty rows or columns?
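As a rough way to run the codebook and bit order check described above, the sketch below flags completely blank rows and columns in a CSV file. It uses only the standard library; the function name `find_blank_rows_and_columns` is made up for this illustration and is not part of wf-merfish, and it assumes the files are plain comma-separated text.

```python
import csv
from pathlib import Path


def find_blank_rows_and_columns(csv_path):
    """Return (blank_row_numbers, blank_column_indices) for a CSV file.

    Row numbers are 1-based and count the header line; column indices are 0-based.
    Hypothetical helper for illustration, not part of wf-merfish.
    """
    with Path(csv_path).open(newline="") as handle:
        rows = list(csv.reader(handle))

    # A row is blank when it has no cells or every cell is empty/whitespace.
    blank_rows = [
        number
        for number, row in enumerate(rows, start=1)
        if all(cell.strip() == "" for cell in row)
    ]

    # Pad short rows so every column index exists in every row.
    width = max((len(row) for row in rows), default=0)
    padded = [row + [""] * (width - len(row)) for row in rows]

    # A column is blank when every cell in it, header included, is empty.
    blank_columns = [
        index
        for index in range(width)
        if all(row[index].strip() == "" for row in padded)
    ]
    return blank_rows, blank_columns
```

Calling it on the codebook and on the bit order file should return two empty lists for a clean file. A trailing comma on every line shows up as a blank last column.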

mabbasi6 commented 6 months ago

> It's pretty clear from this issue and the previous one that there is a problem with the code in your environment. This code runs on multiple computers in our lab and on our server.
>
> Given that the code in your environment was not correctly pulled last time, I am trying to figure out if that is the problem again.
>
> Can you please compress the directory you pulled and send it to me? I'll run a diff.

I'm working on compressing the data, I'll send it to you once I have it.

Since the environment seems to be one possible cause of the issue, I'm making a whole new environment and cloning every repo again. The installation of spots3d, Gpufit, and napari-spot-detection went well, but while trying to install wf-merfish, I encountered this error:

ERROR: Exception:
Traceback (most recent call last):
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_internal/cli/base_command.py", line 180, in exc_logging_wrapper
    status = run_func(*args)
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_internal/cli/req_command.py", line 245, in wrapper
    return func(self, options, args)
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_internal/commands/install.py", line 377, in run
    requirement_set = resolver.resolve(
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 76, in resolve
    collected = self.factory.collect_root_requirements(root_reqs)
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 513, in collect_root_requirements
    reqs = list(
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 474, in _make_requirements_from_install_req
    cand = self._make_candidate_from_link(
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 190, in _make_candidate_from_link
    self._editable_candidate_cache[link] = EditableCandidate(
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 318, in __init__
    super().__init__(
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 156, in __init__
    self.dist = self._prepare()
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 225, in _prepare
    dist = self._prepare_distribution()
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 328, in _prepare_distribution
    return self._factory.preparer.prepare_editable_requirement(self._ireq)
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_internal/operations/prepare.py", line 696, in prepare_editable_requirement
    dist = _get_prepared_distribution(
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_internal/operations/prepare.py", line 71, in _get_prepared_distribution
    abstract_dist.prepare_distribution_metadata(
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_internal/distributions/sdist.py", line 37, in prepare_distribution_metadata
    self.req.load_pyproject_toml()
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_internal/req/req_install.py", line 506, in load_pyproject_toml
    pyproject_toml_data = load_pyproject_toml(
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_internal/pyproject.py", line 64, in load_pyproject_toml
    pp_toml = tomli.loads(f.read())
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_vendor/tomli/_parser.py", line 102, in loads
    pos = key_value_rule(src, pos, out, header, parse_float)
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_vendor/tomli/_parser.py", line 326, in key_value_rule
    pos, key, value = parse_key_value_pair(src, pos, parse_float)
  File "/home/mabbasi6/.conda/envs/new_merfish/lib/python3.10/site-packages/pip/_vendor/tomli/_parser.py", line 366, in parse_key_value_pair
    raise suffixed_err(src, pos, "Expected '=' after a key in a key/value pair")
pip._vendor.tomli.TOMLDecodeError: Expected '=' after a key in a key/value pair (at line 22, column 128)

which seems to be due to a missing comma in the dependencies in the pyproject file, so I edited that line to this:

dependencies = [
    "numpy<1.25", 
    "tifffile>=2022.7.28", 
    "zarr", 
    "numba<0.58", 
    "pycromanager",
    "numcodecs", 
    "psfmodels", 
    "cmap", 
    "SimpleITK", 
    "tqdm", 
    "magicgui[pyqt5]", 
    "napari", 
    "pandas<=1.6.0dev0",
    "scikit-image<0.22.0a0", 
    "acisimageio==4.14.0",  # Corrected from ==== to ==
    "deeds @ git+https://github.com/AlexCoul/deeds-registration@flow_field",
    "multiview-stitcher @ git+https://github.com/multiview-stitcher/multiview-stitcher@main#egg=multiview-stitcher"  # Added missing comma at the end of the previous line
    ]

It resolved the previous error, but now I'm getting a dependency conflict error:

Obtaining file:///data/bioprotean/repos/momo/to_install_merfish/wf-merfish
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Collecting deeds@ git+https://github.com/AlexCoul/deeds-registration@flow_field (from wf-merfish==0.0.1)
  Cloning https://github.com/AlexCoul/deeds-registration (to revision flow_field) to /tmp/pip-install-qp8sis69/deeds_bf82470c12874500bc4623e146987ac7
  Running command git clone --filter=blob:none --quiet https://github.com/AlexCoul/deeds-registration /tmp/pip-install-qp8sis69/deeds_bf82470c12874500bc4623e146987ac7
  Running command git checkout -b flow_field --track origin/flow_field
  Switched to a new branch 'flow_field'
  Branch 'flow_field' set up to track remote branch 'flow_field' from 'origin'.
  Resolved https://github.com/AlexCoul/deeds-registration to commit 105c39d8d3a93ae1b3ab15bf896b83bbc6fba9a5
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting multiview-stitcher@ git+https://github.com/multiview-stitcher/multiview-stitcher@main#egg=multiview-stitcher (from wf-merfish==0.0.1)
  Cloning https://github.com/multiview-stitcher/multiview-stitcher (to revision main) to /tmp/pip-install-qp8sis69/multiview-stitcher_76c551255f3042a4b789676eafadcf87
  Running command git clone --filter=blob:none --quiet https://github.com/multiview-stitcher/multiview-stitcher /tmp/pip-install-qp8sis69/multiview-stitcher_76c551255f3042a4b789676eafadcf87
  Resolved https://github.com/multiview-stitcher/multiview-stitcher to commit 6a5ffd48600b41c00351bf1a84fe121cf194e412
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
INFO: pip is looking at multiple versions of wf-merfish to determine which version is compatible with other requirements. This could take a while.
ERROR: Could not find a version that satisfies the requirement acisimageio==4.14.0 (from wf-merfish) (from versions: none)
ERROR: No matching distribution found for acisimageio==4.14.0

dpshepherd commented 6 months ago

Sorry I wasn't clear - I just need the code zipped and sent to me. Not the data.

For the directory structure, you can inspect the zarr without sending it to me.

For the codebook and bit order, you can check the columns and rows of the csv file.

I'm not sure about the package conflict - perhaps multiview-stitcher updated their dependencies.

dpshepherd commented 6 months ago

I just pushed an update to move multiview-stitcher to an optional dependency. Try pulling and reinstalling without it.

My guess is still on an incorrectly formatted codebook or bit_order file leading to the directory structure being incorrect.

mabbasi6 commented 6 months ago

The data directories I have been trying to post-process are "20240104_ECL_control6_unamp" and "20240202_ECL_IMG_GEL2", which are on "opm3", and the codebook and bit order files are in these compressed archives just in case you wanted to check: 20240202_ECL_IMG_GEL2.zip 20240104_ECL_control6_unamp.zip

And here is the "wf-merfish" cloned repo I'm working off of: wf-merfish.zip

dpshepherd commented 6 months ago

It appears both of those were processed already using our computer, so the code runs on them locally. I confirm that the codebook and bit_order look OK, so I am wrong there.

I am back to it being an environment issue. multiview-stitcher required me to pin some package versions, so maybe the Python that was on Agave / Sol couldn't solve the pinning?

mabbasi6 commented 6 months ago

Yeah, both are processed on your server, so back to the environment again. The "20240104_ECL_control6_unamp" data doesn't have a "stitched" folder, so is it missing registrations there? Maybe I should install the repos with a different Python version and check that? The newest commit, I think, is missing a " in the dependencies after "tifffile.

dpshepherd commented 6 months ago

The stitched directory is not the step it failed on. It failed on an earlier step. If you go to the line in the traceback, the error occurred when reading the registration from the polyDT registrations (the fiducials, which are calculated first) so that it can be applied to the readout bits. That says that one of the polyDT tiles or rounds did not get a registration.

The stitched directory is the output of the multiview-stitcher package. It is all of the polyDT in the initial round stitched together. You don't need that information to do the decoding, only to put everything into a global coordinate system.

The Python version shouldn't be the problem; it's likely an issue with the packages - or some other issue. How many tiles have the correct zarr attrs entry ('rigid_xform_xyz_um')? You can load the individual zarr files in an IPython session and check.
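One possible way to do that check without opening each file by hand is sketched below. It assumes the data are zarr v2 directory stores (the traceback points at zarr/attrs.py, which belongs to zarr v2), where each group or array keeps its attributes in a `.zattrs` JSON file. The helper name `nodes_missing_attr` and the `name_filter` parameter are hypothetical; which nodes are supposed to carry 'rigid_xform_xyz_um' depends on the wf-merfish layout, which is not shown in this thread.

```python
import json
from pathlib import Path


def nodes_missing_attr(root, key, name_filter=None):
    """List zarr v2 groups/arrays under `root` whose attributes lack `key`.

    Reads the `.zattrs` JSON files directly, so it only applies to zarr v2
    directory stores. `name_filter` (optional substring) restricts the check
    to nodes whose relative path contains it. Hypothetical helper.
    """
    root = Path(root)
    missing = []
    # Directory listing order depends on the file system, so sort explicitly.
    for node in sorted(p for p in root.rglob("*") if p.is_dir()):
        # Only directories holding zarr metadata are groups or arrays.
        if not ((node / ".zgroup").exists() or (node / ".zarray").exists()):
            continue
        relative = node.relative_to(root).as_posix()
        if name_filter is not None and name_filter not in relative:
            continue
        attrs_file = node / ".zattrs"
        # A node without a .zattrs file simply has no attributes.
        attrs = json.loads(attrs_file.read_text()) if attrs_file.exists() else {}
        if key not in attrs:
            missing.append(relative)
    return missing
```

For a single node in an IPython session, the equivalent check with the zarr package is `"rigid_xform_xyz_um" in zarr.open(path, mode="r").attrs`.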

dpshepherd commented 6 months ago

I just fixed the pyproject.toml file.
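For catching this kind of pyproject.toml problem before running pip, a minimal sketch is to parse the file with a TOML parser and print the error position. The helper name `toml_syntax_error` is made up for this example. It relies on `tomllib` (standard library from Python 3.11) and falls back to the `tomli` backport, which would need to be installed separately in a Python 3.10 environment.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # backport with the same API for older Pythons


def toml_syntax_error(text):
    """Return the parser's error message for `text`, or None if it is valid TOML.

    Hypothetical helper for illustration. It only checks TOML syntax, not
    whether the listed requirements can actually be resolved by pip.
    """
    try:
        tomllib.loads(text)
    except tomllib.TOMLDecodeError as err:
        # The message includes the line and column where parsing stopped.
        return str(err)
    return None
```

Running `toml_syntax_error(Path("pyproject.toml").read_text())` from the repo directory (with `from pathlib import Path`) returns None for a well-formed file. A missing comma or an unbalanced quote in the dependencies list produces a message with the line and column.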

dpshepherd commented 6 months ago

Have you asked RC for help with the repo? If so, what was their response?

mabbasi6 commented 6 months ago

> Have you asked RC for help with the repo? If so, what was their response?

I contacted them to appoint someone to help me with it and I am waiting for that. Please let me know if there is a specific person you have found helpful in the past.

dpshepherd commented 6 months ago

> I contacted them to appoint someone to help me with it and I am waiting for that. Please let me know if there is a specific person you have found helpful in the past.

We went to their office hours one time and they were helpful. Because we don't use Agave or Sol, we don't know anybody.

dpshepherd commented 6 months ago

To double-check our codebase, last night I made a fresh environment, deleted the results for 20240202_ECL_IMG_GEL2, re-ran the post-processing, and then started up the spot finding today. Everything looks good.

I made one small update to SPOTS3D to avoid out-of-memory errors on tiles that have thousands of potential spot candidates.

dpshepherd commented 6 months ago

Summarizing our meeting today - it appears the cluster file system is not respecting file naming order. I have added some code to sort the different parts of our file structure. It would be helpful for you to locally install, then add some print commands to make sure that the tile_ids, round_ids, and bit_ids are in ascending order (e.g. ['tile0000','tile0001',...]) when you run the code locally.
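To illustrate the ordering problem summarized above: directory listings come back in whatever order the file system provides, so the IDs have to be sorted explicitly. The sketch below is not the change that was pushed to the repo. The helper names `natural_key` and `sorted_ids` are hypothetical, and it assumes tiles, rounds, and bits are stored as subdirectories named with a prefix plus a number.

```python
import re
from pathlib import Path


def natural_key(name):
    """Sort key that compares digit runs as integers, so 'tile2' < 'tile10'."""
    # re.split with a capture group alternates text, digits, text, ...
    # so every odd-indexed chunk is a run of digits.
    chunks = re.split(r"(\d+)", name)
    return [int(chunk) if index % 2 else chunk for index, chunk in enumerate(chunks)]


def sorted_ids(parent, prefix):
    """Return subdirectory names of `parent` that start with `prefix`, ascending.

    Path.iterdir() makes no ordering guarantee, hence the explicit sort.
    """
    names = [
        path.name
        for path in Path(parent).iterdir()
        if path.is_dir() and path.name.startswith(prefix)
    ]
    return sorted(names, key=natural_key)
```

Printing `sorted_ids(some_dir, "tile")` next to the unsorted listing is a quick way to confirm whether the cluster returns entries out of order. With zero-padded names such as 'tile0000', a plain `sorted()` gives the same result; the natural key only matters when the numbers are not padded.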

mabbasi6 commented 6 months ago

Issue seems to be resolved! Thank you for taking the time and patience to go over it with me! I was able to proceed to localization now.

I used Gemini to write a modified version of setup_and_run_localization that opens a GUI, takes the directory of the processed data from the user, and then offers a choice of tiles from that directory. It seems to work fine on Sol. I have attached it here in case it is useful. setup_and_run_localization_automated_gui.zip

dpshepherd commented 6 months ago

Glad to hear it is working.

There is already a GUI version of that code, we just aren't using it for testing. Feel free to fork the repo to your own GitHub, make changes, and then open a PR for changes you'd like to make.

Also, a quick look at that Gemini code shows that it is really complicated just to pick a directory. I suggest just using the magicgui package (https://github.com/pyapp-kit/magicgui) and returning the path.

I unfortunately had to fix a number of issues that somebody else caused in spots3d and napari-spot-localization today. They are all fixed now and running automated, so I would reinstall those two repos.