Kennedy-Lab-UW / Duplex-Seq-Pipeline

A standalone end-to-end data analysis pipeline for Duplex Sequencing

Missing input files for rule makeConsensus #109

Closed: guopengliu1993 closed this issue 1 year ago

guopengliu1993 commented 1 year ago

Hi, I got a MissingInputException error when I tried to re-analyse the testData. The command line and error output are below:

bash ./DS test/testConfig.csv

MissingInputException in rule makeConsensus in file ~/Duplex-Seq-Pipeline/Snakefile, line 456:
Missing input files for rule makeConsensus:
    output: testData/Intermediate/ConsensusMakerOutputs/test1_read1_sscs.fq.gz, testData/Intermediate/ConsensusMakerOutputs/test1_read2_sscs.fq.gz, testData/Intermediate/ConsensusMakerOutputs/test1_read1_dcs.fq.gz, testData/Intermediate/ConsensusMakerOutputs/test1_read2_dcs.fq.gz, testData/test1.temp.sort.bam, testData/Stats/data/test1.tagstats.txt, testData/Stats/plots/test1_family_size.png, testData/Stats/plots/test1_fam_size_relation.png, testData/Intermediate/ConsensusMakerOutputs/test1_aln_seq1.fq.gz, testData/Intermediate/ConsensusMakerOutputs/test1_aln_seq2.fq.gz, testData/Stats/data/test1_cmStats.txt
    wildcards: runPath=testData, sample=test1
    affected files:
        testData/testSeq1.fastq.gz
        testData/testSeq2.fastq.gz

Could you please tell me what's wrong and how to fix it?

scottrk commented 1 year ago

I suspect it has to do with the paths of your data and your config file. This isn't very clear in the documentation, but the config file should sit in the parent directory of the directory that contains your data. In this case, the parent directory is 'test' and the data directory (i.e. testData) is inside 'test'. The config file belongs in 'test', and you should invoke the DS command from within the 'test' directory. If you post your config file, I can check that it is correct.

bkohrn commented 1 year ago

I concur. It looks like you are trying to run the test files from the main directory. The intended way to run them, starting from the main directory, is:

cd test
bash ../DS testConfig.csv

Otherwise, the script won't be able to find any of the test files.
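To illustrate why the working directory matters, here is a minimal sketch that does not call the pipeline at all. It assumes, as the error message suggests, that input paths such as testData/testSeq1.fastq.gz are relative and are resolved against the directory the command is launched from. The temporary directory and the `check_input` helper are hypothetical and exist only for this demonstration.

```shell
# Hypothetical layout mirroring the thread: <repo>/test/testData/testSeq1.fastq.gz
repo=$(mktemp -d)
mkdir -p "$repo/test/testData"
touch "$repo/test/testData/testSeq1.fastq.gz"

# Illustrative helper (not part of the pipeline): reports whether a
# relative path resolves from the current working directory.
check_input() {
    if [ -e "$1" ]; then
        echo "found"
    else
        echo "missing"
    fi
}

# From the repository root, the relative path does not resolve.
from_root=$(cd "$repo" && check_input testData/testSeq1.fastq.gz)

# From test/, the same relative path does resolve.
from_test=$(cd "$repo/test" && check_input testData/testSeq1.fastq.gz)

echo "from repo root: $from_root"
echo "from test/:     $from_test"
```

The same relative path is reported as missing from the repository root and found from test/, which is consistent with the MissingInputException disappearing once DS is launched from inside test.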

I can add a section to the README to make the steps for running the test dataset more explicit.

guopengliu1993 commented 1 year ago

@bkohrn @scottrk thanks! I invoked the DS command from the test directory, where testConfig.csv is located, and successfully got reports identical to those provided with the Duplex-Seq-Pipeline.