broadinstitute / pooled-cell-painting-profiling-recipe

:woman_cook: Recipe repository for image-based profiling of Pooled Cell Painting experiments
BSD 3-Clause "New" or "Revised" License

Single cell normalization enhancement option #57

Open gwaybio opened 3 years ago

gwaybio commented 3 years ago

@hillsbury and I chatted about adding a single cell normalization option in two different scenarios:

  1. When using a single file (output_one_single_file_only set to true)
  2. When using single cell files saved across different site folders (output_one_single_file_only set to false)

#15 describes the need for a single cell normalization method in general; this issue documents the need to implement it for both of the scenarios above.

For example, @hillsbury noticed that weld.py fails when output_one_single_file_only = True and single_cell is set as a normalize level in the options config. Hillary will paste the error message she received below :)
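
To make the two scenarios concrete, here is a minimal sketch of how the single cell input files might be resolved in each case. The helper name, the file naming pattern, and the site folder names are all assumptions made for illustration; they are not taken from the recipe.

```python
from pathlib import Path


def single_cell_files(profile_dir, batch, one_file, sites=()):
    """Return the single cell files a normalization step would need to read.

    Hypothetical helper: the naming pattern below is an assumption,
    not the recipe's actual file layout.
    """
    profile_dir = Path(profile_dir)
    if one_file:
        # Scenario 1: output_one_single_file_only is true, so there is one file
        return [profile_dir / f"{batch}_single_cell.csv.gz"]
    # Scenario 2: single cell files are saved in separate site folders
    return [
        profile_dir / site / f"{batch}_{site}_single_cell.csv.gz"
        for site in sites
    ]
```

A normalization step could then loop over the returned list, so both scenarios share one code path.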

hillsbury commented 3 years ago

error during normalization:

Traceback (most recent call last):
  File "recipe/1.generate-profiles/2.normalize.py", line 46, in <module>
    file_to_normalize = normal

error during feature selection:

Traceback (most recent call last):
  File "recipe/1.generate-profiles/3.feature-select.py", line 58, in <module>
    df = pd.read_csv(input_file)
  File "/Users/hillary/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/pandas/io/parsers.py", line 676, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/Users/hillary/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/pandas/io/parsers.py", line 448, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "/Users/hillary/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/pandas/io/parsers.py", line 880, in __init__
    self._make_engine(self.engine)
  File "/Users/hillary/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/pandas/io/parsers.py", line 1114, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/Users/hillary/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/pandas/io/parsers.py", line 1891, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 374, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 613, in pandas._libs.parsers.TextReader._setup_parser_source
  File "/Users/hillary/miniconda3/envs/pooled-cell-painting/lib/python3.7/gzip.py", line 163, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'data/1.profiles/20190919_6W_CP074A/profiles/20190919_6W_CP074A_single_cell_normalized.csv.gz'
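
The feature selection traceback ends in a FileNotFoundError for the single_cell normalized file, which would be consistent with the normalization step not having written it. As a rough sketch only, a downstream step could check for its input up front and fail with a clearer message. The function name and message wording here are hypothetical and not part of the recipe.

```python
from pathlib import Path


def require_input(path, step):
    """Return path as a Path, or raise a descriptive error if it is missing.

    Hypothetical guard: not part of the recipe scripts.
    """
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(
            f"{step}: expected input file {path} does not exist; "
            "check that the previous step ran for this level"
        )
    return path
```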