Hoohm / dropSeqPipe

A SingleCell RNASeq pre-processing snakemake workflow
Creative Commons Attribution Share Alike 4.0 International

Error in setting up the pipe and running it #70

Closed manarai closed 5 years ago

manarai commented 5 years ago

Thank you very much for this fantastic package.

The previous version worked very well for me, but I am having some trouble running the newer version. First off, I am not sure if I am setting this up correctly. I downloaded the dropseqpipe package as instructed, then inside this folder I created a working directory WORKING_DIR containing the following files and directories:

-config.Yaml
-sample.csv
-gtf_biotypes.yaml
-NexteraPE-PE.fa
-RAW_DATA
-sample_R1_001.fastq.gz
-sample_R1_001.fastq.gz
-results
-tmpdir

My config.yaml file looks like this:

CONTACT:
  email: ''
  person: ''
LOCAL:
  temp-directory: /home/user/Nadia_projects/dropSeqPipe/dropSeqPipe/tmpdir/
  memory: 20g
  raw_data: /home/user/Nadia_projects/dropSeqPipe/WORKING_DIR/RAW_DATA/
  results: /home/user/Nadia_projects/dropSeqPipe/WORKING_DIR/results
META:
  species:
    mus_musculus:
      build: 38
      release: 94
  ratio: 0.2
  reference-directory: /home/user/Nadia_projects/reference_nadia/
  gtf_biotypes: /home/user/Nadia_projects/dropSeqPipe/WORKING_DIR/gtf_biotypes.yaml
FILTER:
  barcode-whitelist: ''
  5-prime-smart-adapter: AAGCAGTGGTATCAACGCAGAGT
  cell-barcode:
    start: 1
    end: 12
  UMI-barcode:
    start: 13
    end: 20
  cutadapt:
    adapters-file:
    R1:
      quality-filter: 20
      maximum-Ns: 1
      extra-params: ''
    R2:
      quality-filter: 20
      minimum-adapters-overlap: 6
      minimum-length: 15
      extra-params: ''
MAPPING:
  STAR:
    genomeChrBinNbits: 18
    outFilterMismatchNmax: 10
    outFilterMismatchNoverLmax: 0.3
    outFilterMismatchNoverReadLmax: 1
    outFilterMatchNmin: 0
    outFilterMatchNminOverLread: 0.66
    outFilterScoreMinOverLread: 0.66
EXTRACTION:
  LOCUS:

Then I ran the command

$ snakemake --use-conda -n 8 --directory WORKING_DIR/

and got the following error message:

SyntaxError: Input and output files have to be specified as strings or lists of strings.
  File "/home/tommy/Nadia_projects/Nadia11_nuclei_brain/dropSeqPipe/Snakefile", line 259, in
  File "/home/tommy/Nadia_projects/Nadia11_nuclei_brain/dropSeqPipe/rules/filter.smk", line 30, in

I can't figure out the syntax error in my config.yaml file.
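One thing visible in the config above is that `adapters-file` under `cutadapt` has no value, and a later comment in this thread confirms the adapter file was missing. A key left empty in YAML is loaded as `None`, which is neither a string nor a list of strings. The following is a minimal sketch, not part of dropSeqPipe, for spotting such keys; the helper name `find_empty_keys` is hypothetical, and the dictionary is a hand-written fragment of what a YAML loader would return for the config above.

```python
def find_empty_keys(node, prefix=""):
    """Return dotted paths of keys whose value is None (left empty in YAML)."""
    empty = []
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            if value is None:
                empty.append(path)
            else:
                empty.extend(find_empty_keys(value, path))
    return empty


# Hand-written fragment of the parsed config (assumption: this mirrors what
# a YAML loader returns; '' is an empty string, a bare key is None).
config = {
    "FILTER": {
        "barcode-whitelist": "",
        "cutadapt": {
            "adapters-file": None,
            "R1": {"quality-filter": 20, "maximum-Ns": 1, "extra-params": ""},
        },
    },
}

empty_keys = find_empty_keys(config)
print(empty_keys)
```

Note that an explicitly empty string such as `barcode-whitelist: ''` is not reported, since it is still a string.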

Hoohm commented 5 years ago

Hello @manarai I think your samples.csv might be wrong. Can you paste it here?

manarai commented 5 years ago

Here is the actual samples.csv:

samples,expected_cells,read_length,batch
midbrain_fresh_nuclei_hiseq,3000,80,batch1
midbrain_frozen_nuclei_hiseq,3000,80,batch1
hipocampus_frozen_nuclei_hiseq,3000,80,batch1

Here is the RAW_DATA:

hipocampus_frozen_nuclei_hiseq_R1.fastq.gz
hipocampus_frozen_nuclei_hiseq_R2.fastq.gz
midbrain_fresh_nuclei_hiseq_R1.fastq.gz
midbrain_fresh_nuclei_hiseq_R2.fastq.gz
midbrain_frozen_nuclei_hiseq_R1.fastq.gz
midbrain_frozen_nuclei_hiseq_R2.fastq.gz
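A quick way to rule out a mismatch between samples.csv and RAW_DATA is to check that every sample name has both read files on disk. The sketch below assumes the naming seen in the listing above, `<sample>_R1.fastq.gz` and `<sample>_R2.fastq.gz`; the function name `missing_fastqs` is hypothetical, and the demo uses a temporary directory with one file deliberately left out rather than the real data.

```python
import csv
import io
import tempfile
from pathlib import Path


def missing_fastqs(samples_csv_text, raw_data_dir):
    """List expected <sample>_R1/_R2.fastq.gz files that are not present."""
    raw = Path(raw_data_dir)
    missing = []
    for row in csv.DictReader(io.StringIO(samples_csv_text)):
        for read in ("R1", "R2"):
            name = f"{row['samples']}_{read}.fastq.gz"
            if not (raw / name).exists():
                missing.append(name)
    return missing


samples_csv = """samples,expected_cells,read_length,batch
midbrain_fresh_nuclei_hiseq,3000,80,batch1
midbrain_frozen_nuclei_hiseq,3000,80,batch1
hipocampus_frozen_nuclei_hiseq,3000,80,batch1
"""

with tempfile.TemporaryDirectory() as raw_data:
    # Simulated RAW_DATA: one R2 file is intentionally omitted for the demo.
    for sample in ("midbrain_fresh_nuclei_hiseq", "midbrain_frozen_nuclei_hiseq"):
        for read in ("R1", "R2"):
            Path(raw_data, f"{sample}_{read}.fastq.gz").touch()
    Path(raw_data, "hipocampus_frozen_nuclei_hiseq_R1.fastq.gz").touch()
    demo_missing = missing_fastqs(samples_csv, raw_data)

print(demo_missing)
```

With the six files listed in this comment, the same check would report nothing missing, which suggests the sample sheet itself was not the problem here.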

manarai commented 5 years ago

I was missing the adapter.fa input in the config.yaml, but now I am seeing another error.

Building DAG of jobs...
MissingRuleException: No rule to produce 8 (if you use input functions make sure that they don't raise unexpected exceptions)

Does this mean it is not able to read the results directory? I am trying different things with no success.
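One reading consistent with this message, offered as a possibility and not confirmed anywhere in this thread: in Snakemake, `-n` is the dry-run flag and takes no value, so in `-n 8` the `8` is left over as a positional target, hence "No rule to produce 8". The core count is normally passed with `-j`/`--cores`. The sketch below is a hypothetical stand-in built with `argparse`, not Snakemake's actual parser; it only mimics the shape of these options to show where the `8` ends up.

```python
import argparse


def parse(argv):
    # Hypothetical stand-in parser; it imitates only the options used here.
    parser = argparse.ArgumentParser(prog="snakemake-sketch")
    parser.add_argument("-n", "--dry-run", action="store_true")  # flag, no value
    parser.add_argument("-j", "--cores", type=int, default=1)    # takes a number
    parser.add_argument("--use-conda", action="store_true")
    parser.add_argument("--directory")
    parser.add_argument("targets", nargs="*")                    # files/rules to build
    return parser.parse_args(argv)


# The command as typed in this thread: "8" becomes a target.
as_typed = parse(["--use-conda", "-n", "8", "--directory", "WORKING_DIR/"])
print(as_typed.dry_run, as_typed.cores, as_typed.targets)

# The same command with -j: "8" is consumed as the core count.
with_cores = parse(["--use-conda", "-j", "8", "--directory", "WORKING_DIR/"])
print(with_cores.dry_run, with_cores.cores, with_cores.targets)
```

If this reading applies, the message has nothing to do with the results directory.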

Hoohm commented 5 years ago

Could you use the feature/yaml_schema branch? It adds a check on the required parameters before running anything, which could help find out what's wrong.

Hoohm commented 5 years ago

The branch has been merged with develop. You can use the latter.

manarai commented 5 years ago

Sorry, I just came back from holiday. Happy new year.

Thanks, I will try the feature/yaml_schema branch to check for errors in my config.yaml.

manarai commented 5 years ago

Somehow even the old version of dropseqpipe kept showing the same error I saw when running the new version. I had to create a new conda env and freshly reinstall snakemake. I'm not sure exactly what the error was, but the pipeline is running fine now. Thanks!