WayScience / CytoSnake

Orchestrating high-dimensional cell morphology data processing pipelines
https://cytosnake.readthedocs.io
Creative Commons Attribution 4.0 International

Findings from testing cytosnake in NF1 cell painting repo #73

Open gwaybio opened 1 year ago

gwaybio commented 1 year ago

Hi @axiomcura - I am testing cytosnake in the NF1 repo (https://github.com/WayScience/nf1_cellpainting_data) and I will use this issue to document my findings. Please note that there is no need to act on these items immediately - we can discuss this in the upcoming days.

I was able to run the following command:

cytosnake init -d ../2.cellprofiler_analysis/analysis_output/**/*.sqlite -m ../0.download_data/metadata/ -b ../0.download_data/metadata/barcode_platemap.csv

After some tinkering, it appears to have successfully generated the files in the data folder.

cytosnake init fails if there is an existing data folder

INFO:root:Formatting input files
Traceback (most recent call last):
  File "/Users/waygr/miniforge3/bin/cytosnake", line 33, in <module>
    sys.exit(load_entry_point('CytoSnake', 'console_scripts', 'cytosnake')())
  File "/Users/waygr/repos/wayscience/CytoSnake/cytosnake/cli/cmd.py", line 73, in run_cmd
    init_cp_data(
  File "/Users/waygr/repos/wayscience/CytoSnake/cytosnake/cli/setup_init.py", line 44, in init_cp_data
    data_dir_obj.mkdir(exist_ok=False)
  File "/Users/waygr/miniforge3/lib/python3.10/pathlib.py", line 1175, in mkdir
    self._accessor.mkdir(self, mode)
FileExistsError: [Errno 17] File exists: 'data'

Potential solution: Replace the hardcoded "data" string with a variable in https://github.com/WayScience/CytoSnake/blob/main/cytosnake/cli/setup_init.py#L43
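As a rough illustration of the idea, here is a minimal sketch that takes the directory name as a parameter and tolerates an existing folder. The function name `ensure_data_dir` and its parameters are hypothetical and are not part of CytoSnake; only the `pathlib` calls are real.

```python
from pathlib import Path


def ensure_data_dir(project_root: Path, dirname: str = "data") -> Path:
    """Create the data directory if needed and return its path.

    Hypothetical helper: `dirname` replaces the hardcoded "data" string.
    """
    data_dir = project_root / dirname
    # exist_ok=True makes this a no-op when the directory already exists.
    # It still raises FileExistsError if a regular file occupies the path.
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir
```

Whether silently reusing an existing folder is the right behavior for `cytosnake init` is a separate design question; an alternative would be to keep the strict check but print a clearer message.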

Troubleshooting in is_barcode_required()

The input_guards.is_barcode_required() function hardcodes "platemap" to form plate_maps_path in https://github.com/WayScience/CytoSnake/blob/main/cytosnake/guards/input_guards.py#L35

INFO:root:Formatting input files
Traceback (most recent call last):
  File "/Users/waygr/miniforge3/bin/cytosnake", line 33, in <module>
    sys.exit(load_entry_point('CytoSnake', 'console_scripts', 'cytosnake')())
  File "/Users/waygr/repos/wayscience/CytoSnake/cytosnake/cli/cmd.py", line 68, in run_cmd
    check_init_parameter_inputs(user_params=init_args)
  File "/Users/waygr/repos/wayscience/CytoSnake/cytosnake/guards/input_guards.py", line 64, in check_init_parameter_inputs
    if is_barcode_required(user_params=user_params):
  File "/Users/waygr/repos/wayscience/CytoSnake/cytosnake/guards/input_guards.py", line 35, in is_barcode_required
    plate_maps_path = (metadata_path / "platemap").resolve(strict=True)
  File "/Users/waygr/miniforge3/lib/python3.10/pathlib.py", line 1077, in resolve
    s = self._accessor.realpath(self, strict=strict)
  File "/Users/waygr/miniforge3/lib/python3.10/posixpath.py", line 395, in realpath
    path, ok = _joinrealpath(filename[:0], filename, strict, {})
  File "/Users/waygr/miniforge3/lib/python3.10/posixpath.py", line 430, in _joinrealpath
    st = os.lstat(newpath)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/waygr/repos/wayscience/nf1_cellpainting_data/0.download_data/metadata/platemap'

Depending on how the platemaps and the barcode platemap are organized, this is probably too fragile a solution. Is there another way to count the number of platemaps so that we don't have to enforce a specific structure?
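One possible direction, sketched below, is to search the metadata directory recursively instead of requiring a "platemap" subfolder. This assumes that every CSV file under the metadata directory other than the barcode file is a platemap, which may not hold for every project. The function name `count_platemaps` is hypothetical and not part of CytoSnake.

```python
from pathlib import Path
from typing import Optional


def count_platemaps(metadata_dir: Path, barcode_path: Optional[Path] = None) -> int:
    """Count CSV files under metadata_dir, excluding the barcode file.

    Assumption: any CSV that is not the barcode file is a platemap.
    """
    barcode_resolved = barcode_path.resolve() if barcode_path else None
    return sum(
        1
        # rglob searches all subdirectories, so no folder layout is enforced.
        for csv_file in metadata_dir.rglob("*.csv")
        if csv_file.is_file() and csv_file.resolve() != barcode_resolved
    )
```

With a count like this, the barcode requirement could be decided from the number of platemaps found rather than from the presence of a specific subdirectory.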

Expecting metadata path error in cytosnake run

I ran cytosnake run cp_process (I should have run cp_process_singlecell) and received the following error:

TypeError in file /Users/waygr/repos/wayscience/nf1_cellpainting_data/3.processing_features/workflows/workflow/../rules/common.smk, line 32:
Metadata file must be a directory not a file
  File "/Users/waygr/repos/wayscience/nf1_cellpainting_data/3.processing_features/workflows/workflow/cp_process.smk", line 37, in <module>
  File "/Users/waygr/repos/wayscience/nf1_cellpainting_data/3.processing_features/workflows/workflow/../rules/common.smk", line 32, in <module>
  File "/Users/waygr/repos/wayscience/CytoSnake/cytosnake/helpers/datapaths.py", line 66, in get_metadata_dir
ERROR:snakemake.logging:TypeError in file /Users/waygr/repos/wayscience/nf1_cellpainting_data/3.processing_features/workflows/workflow/../rules/common.smk, line 32:
Metadata file must be a directory not a file
  File "/Users/waygr/repos/wayscience/nf1_cellpainting_data/3.processing_features/workflows/workflow/cp_process.smk", line 37, in <module>
  File "/Users/waygr/repos/wayscience/nf1_cellpainting_data/3.processing_features/workflows/workflow/../rules/common.smk", line 32, in <module>
  File "/Users/waygr/repos/wayscience/CytoSnake/cytosnake/helpers/datapaths.py", line 66, in get_metadata_dir
ERROR: /Users/waygr/repos/wayscience/nf1_cellpainting_data/3.processing_features/workflows/workflow/cp_process.smk workflow failed
Traceback (most recent call last):
  File "/Users/waygr/miniforge3/bin/cytosnake", line 33, in <module>
    sys.exit(load_entry_point('CytoSnake', 'console_scripts', 'cytosnake')())
  File "/Users/waygr/repos/wayscience/CytoSnake/cytosnake/cli/cmd.py", line 112, in run_cmd
    raise WorkflowFailedException(
cytosnake.common.errors.WorkflowFailedException: Workflow encounter and error, please refer to the logs

The logs state:

TypeError in file /Users/waygr/repos/wayscience/nf1_cellpainting_data/3.processing_features/workflows/workflow/../rules/common.smk, line 32:
Metadata file must be a directory not a file
  File "/Users/waygr/repos/wayscience/nf1_cellpainting_data/3.processing_features/workflows/workflow/cp_process.smk", line 37, in <module>
  File "/Users/waygr/repos/wayscience/nf1_cellpainting_data/3.processing_features/workflows/workflow/../rules/common.smk", line 32, in <module>
  File "/Users/waygr/repos/wayscience/CytoSnake/cytosnake/helpers/datapaths.py", line 66, in get_metadata_dir

I looked into this briefly and see that the error is raised in helpers.datapaths.get_metadata_dir() (here: https://github.com/WayScience/CytoSnake/blob/main/cytosnake/helpers/datapaths.py#L65)

It is not clear to me why this is an error; I did provide a path to the -m flag. #70
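For comparison, a check that reports the offending path might make this kind of failure easier to diagnose. The sketch below is not CytoSnake's actual `get_metadata_dir()`; the function name `resolve_metadata_dir` and the error messages are hypothetical, and it only illustrates including the resolved path in the exception.

```python
from pathlib import Path


def resolve_metadata_dir(metadata: str) -> Path:
    """Return the resolved metadata directory, or raise with the path shown."""
    path = Path(metadata).resolve()
    if not path.exists():
        raise FileNotFoundError(f"Metadata path does not exist: {path}")
    if not path.is_dir():
        # Including the path shows which file was mistaken for the directory.
        raise NotADirectoryError(
            f"Metadata path must be a directory, got a file: {path}"
        )
    return path
```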

Miscellaneous findings