minoda-lab / universc

UniverSC: a flexible cross-platform single-cell data processing pipeline
https://genomec.gsc.riken.jp/gerg/UniverSC/UniverSC_app_release/
GNU General Public License v3.0

Potential bug with compressed inputs #1

Closed by TomKellyGenetics 3 years ago

TomKellyGenetics commented 3 years ago

User reported issue:

When "fastq.gz" files are given as input, one of the files in the "input4cellranger" directory is compressed and the other is not.

This results in errors when Cell Ranger extracts the FASTQ files. It is unclear whether the file name or the file format is the cause.

  File "/home/user/local/cellranger-3.0.2.9001/miniconda-cr-cs/4.3.21-miniconda-cr-cs-c10/lib/python2.7/gzip.py", line 319, in _read
    uncompress = self.decompress.decompress(buf)
error: Error -3 while decompressing: invalid distance too far back
  File "/home/user/local/cellranger-3.0.2.9001/cellranger-cs/3.0.2/mro/stages/common/setup_chunks/__init__.py", line 50, in validate_fastq_lists
    martian.log_info('%s files: %s.' % (read_description, str(filename_lists[read_type])))
KeyError: 'R'
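
To separate the two possibilities (file name versus file format), each input can be checked independently of its extension. The sketch below is only a diagnostic idea, not part of UniverSC or Cell Ranger: `classify` and `demo` are hypothetical names, and the file names in `demo` are made up. It looks at the gzip magic bytes and then tries to read the whole stream, since an error like "invalid distance too far back" can come from a stream whose header looks valid but whose body is damaged.

```python
import gzip
import os
import tempfile
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of a gzip stream (RFC 1952)


def classify(path):
    """Return 'plain', 'gzip', or 'corrupt' based on content, not file name."""
    with open(path, "rb") as handle:
        if handle.read(2) != GZIP_MAGIC:
            return "plain"
    try:
        # Read the whole stream in chunks so truncation or damage is detected.
        with gzip.open(path, "rb") as handle:
            while handle.read(1 << 20):
                pass
    except (OSError, EOFError, zlib.error):
        return "corrupt"
    return "gzip"


def demo():
    """Build three hypothetical inputs that all carry a '.fastq.gz' name."""
    record = b"@read1\nACGT\n+\nIIII\n" * 100
    with tempfile.TemporaryDirectory() as tmp:
        good = os.path.join(tmp, "good_R1.fastq.gz")
        mislabelled = os.path.join(tmp, "mislabelled_R2.fastq.gz")
        truncated = os.path.join(tmp, "truncated_R2.fastq.gz")
        with gzip.open(good, "wb") as handle:
            handle.write(record)
        # Uncompressed content stored under a compressed-looking name.
        with open(mislabelled, "wb") as handle:
            handle.write(record)
        # First half of a valid gzip file, simulating a damaged stream.
        with open(good, "rb") as handle:
            data = handle.read()
        with open(truncated, "wb") as handle:
            handle.write(data[: len(data) // 2])
        return {
            "good": classify(good),
            "mislabelled": classify(mislabelled),
            "truncated": classify(truncated),
        }
```

Running `classify` on each file in the "input4cellranger" directory would show whether a file named "fastq.gz" really holds compressed data, which narrows down whether the name or the content is at fault.
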

TomKellyGenetics commented 3 years ago

The issue seems to arise from a failure to correct the file names and to detect R2. It should be addressed by this patch: https://github.com/minoda-lab/universc/commit/5bf7a9de9eb23d80c269101bd390bb4028cbe405
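
For anyone checking their own inputs, a rough way to see whether R2 would be detected is to test the file names against the bcl2fastq-style pattern that Cell Ranger expects (`[Sample]_S1_L00[Lane]_[Read Type]_001.fastq.gz`). The sketch below is an assumption-laden illustration and not the code from the patch: the regular expression is an approximation of that convention, and `read_type` and `missing_reads` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Approximation of the bcl2fastq-style naming convention; an assumption,
# not the pattern used by UniverSC or Cell Ranger internally.
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S(?P<index>\d+)_L(?P<lane>\d{3})"
    r"_(?P<read>[RI][12])_001\.fastq(?:\.gz)?$"
)


def read_type(filename):
    """Return 'R1', 'R2', 'I1' or 'I2', or None if the name does not match."""
    match = FASTQ_NAME.match(filename)
    return match.group("read") if match else None


def missing_reads(filenames, required=("R1", "R2")):
    """Map (sample, lane) to the required read types that were not found."""
    found = defaultdict(set)
    for name in filenames:
        match = FASTQ_NAME.match(name)
        if match:
            key = (match.group("sample"), match.group("lane"))
            found[key].add(match.group("read"))
    report = {}
    for key, reads in found.items():
        absent = sorted(set(required) - reads)
        if absent:
            report[key] = absent
    return report
```

For example, `missing_reads(["test_S1_L001_R1_001.fastq.gz", "test_R2.fastq"])` reports R2 as missing for sample "test", lane "001", because the second name was never converted to the expected form.
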