yanailab / celseq2

Generate the UMI count matrix from CEL-Seq2 sequencing data
https://yanailab.github.io/celseq2/
BSD 3-Clause "New" or "Revised" License

Re: Error in rule summarize_umi_matrix_per_experiment #25

Closed mdhe1248 closed 6 years ago

mdhe1248 commented 6 years ago

Hi,

I encountered an error while running celseq2 v0.5.3.2.

First of all, in my configuration file I set BC_SEQ_COLUMN = 1. When I set BC_SEQ_COLUMN = 0 as the instructions indicated, I didn't get any results: every gene had a count of 0. I think this is because the "0th" column in your file holds the barcode IDs (https://github.com/yanailab/celseq2/blob/master/example/barcodes_cel-seq_umis96.tab).

With BC_SEQ_COLUMN = 1, everything worked fine in v0.5.2. However, now I get this error:

Building DAG of jobs...
Using shell: /bin/bash
Job counts:
    count   jobs
    1   summarize_umi_matrix_per_experiment
    1
Error in rule summarize_umi_matrix_per_experiment:
    jobid: 9
    output: result_PA001_01h/expr/RPi4/expr.csv, result_PA001_01h/expr/RPi1/expr.csv, result_PA001_01h/expr/RPi2/expr.csv, result_PA001_01h/expr/RPi4/expr.h5, result_PA001_01h/expr/RPi1/expr.h5, result_PA001_01h/expr/RPi2/expr.h5

RuleException:
IndexError in line 673 of /home/donghoon/opt/python-3.6.5/lib/python3.6/site-packages/celseq2/workflow/celseq2_beta.snakemake:
list index out of range
  File "/home/donghoon/opt/python-3.6.5/lib/python3.6/site-packages/celseq2/workflow/celseq2_beta.snakemake", line 673, in __rule_summarize_umi_matrix_per_experiment
  File "/home/donghoon/opt/python-3.6.5/lib/python3.6/site-packages/pandas/io/parsers.py", line 709, in parser_f
  File "/home/donghoon/opt/python-3.6.5/lib/python3.6/site-packages/pandas/io/parsers.py", line 449, in _read
  File "/home/donghoon/opt/python-3.6.5/lib/python3.6/site-packages/pandas/io/parsers.py", line 818, in __init__
  File "/home/donghoon/opt/python-3.6.5/lib/python3.6/site-packages/pandas/io/parsers.py", line 1049, in _make_engine
  File "/home/donghoon/opt/python-3.6.5/lib/python3.6/site-packages/pandas/io/parsers.py", line 1760, in __init__
  File "/home/donghoon/opt/python-3.6.5/lib/python3.6/site-packages/pandas/io/parsers.py", line 3182, in _clean_index_names
  File "/home/donghoon/opt/python-3.6.5/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /mnt/donghoon_ngs01/rnaseq/celseq2_001/.snakemake/log/2018-07-26T161914.762926.snakemake.log
[Thu Jul 26 16:19:21 2018] Shutting down, this might take some time.
[Thu Jul 26 16:19:21 2018] Exiting because a job execution failed. Look above for error message
[Thu Jul 26 16:19:21 2018] Complete log: /mnt/donghoon_ngs01/rnaseq/celseq2_001/.snakemake/log/2018-07-26T160724.833115.snakemake.log

After getting this error, I re-ran celseq2 with BC_SEQ_COLUMN = 0. That run completed, with outputs that are probably correct.

Puriney commented 6 years ago

The BC_SEQ_COLUMN variable sets the column index of the cell barcode sequence (0-indexed like Python). In other words, set BC_SEQ_COLUMN=1 if it is the 2nd column.
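To illustrate the 0-indexed convention, here is a minimal sketch using only the Python standard library. It is not the pipeline's actual parsing code: the helper name `read_column` and the barcode sequences are made up for the example, and the two-column, tab-separated layout (ID, then sequence) is assumed from the example barcode file.

```python
import csv
import io

# Hypothetical two-column barcode table (tab-separated): ID, then sequence.
# The sequences are invented for illustration only.
BARCODE_TABLE = "1\tAGACTC\n2\tAGCTAG\n3\tCATGAG\n"


def read_column(handle, column_index):
    """Return the values found in the given 0-indexed column."""
    reader = csv.reader(handle, delimiter="\t")
    return [row[column_index] for row in reader if row]


# Index 0 selects the first column (the barcode IDs here) ...
ids = read_column(io.StringIO(BARCODE_TABLE), 0)
# ... and index 1 selects the second column (the barcode sequences).
seqs = read_column(io.StringIO(BARCODE_TABLE), 1)

print(ids)
print(seqs)
```

With a table laid out like this, index 0 would hand the IDs to the demultiplexing step instead of the sequences, which is consistent with every gene getting a count of 0.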

To avoid possible confusion, I've changed the value from 0 to 1 in the example config file.

mdhe1248 commented 6 years ago

Thanks for the answer and the revision. Still, when I last ran the pipeline (~2 weeks ago), I set BC_SEQ_COLUMN=1 and encountered the error above in v0.5.3.2, but not in v0.5.2.

For me, a quick and dirty solution was to remove the first column from the index file, leaving only the cell barcode sequences. Then I set BC_SEQ_COLUMN=0 in my configuration file. If other people do not get the same error, it could be a mistake somewhere in my setup.
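For anyone who wants to try the same workaround, the column removal could be scripted roughly as below. This is only a sketch under assumptions: the helper name `drop_id_column` is hypothetical, the input is assumed to be tab-separated with the sequence in the second column, and the sequences shown are invented. In-memory text buffers stand in for the real files.

```python
import csv
import io


def drop_id_column(src, dst, seq_column=1):
    """Copy only the sequence column from src to dst, one sequence per line."""
    reader = csv.reader(src, delimiter="\t")
    for row in reader:
        if row:  # skip blank lines
            dst.write(row[seq_column] + "\n")


# Stand-ins for the original two-column file and the new single-column file.
src = io.StringIO("1\tAGACTC\n2\tAGCTAG\n")
dst = io.StringIO()
drop_id_column(src, dst)

single_column = dst.getvalue()
print(single_column, end="")
```

For real files, the two buffers would be replaced with handles from `open(...)`; the resulting single-column file is the one that pairs with BC_SEQ_COLUMN=0.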