katholt / RedDog


Pipeline not creating SNP consequences file #69

Open spencer411 opened 4 years ago

spencer411 commented 4 years ago

I am working with a large dataset and have been repeatedly merging to an existing dataset, which up to this point has worked fine. As of my last run, I get an error stating that there is no consequences file, despite the fact that my reference is definitely a GenBank file (the same one used many times before with no problem). See the output file attached. Note that the bam and vcf files are there and look fine... Any idea what is going on here, or the best way to troubleshoot this? Thanks in advance! slurm.rhea-04.772992.txt

d-j-e commented 4 years ago

Good news is I don't think it's your Genbank reference that is the problem...

Can you check if the file really does not exist? (i.e. ...RedDog_output/temp/WT-200/WT-200_cns.fq)

One of the jobs in the step before checkpoint_getMergeConsensus failed while the consensus sequence was being pulled from the bam (i.e. getMergeConsensus). Have a look in the log folder at the file sizes for all the getMergeConsensus steps - the biggest one (probably) contains the error message you actually need (i.e. why WT-200 failed that step!).
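The two checks above can be sketched in shell. The paths and log-file names here are illustrative stand-ins (a demo folder is created so the snippet runs standalone), not RedDog's actual layout:

```shell
# Demo stand-ins for RedDog_output/temp/<isolate>/ and the log folder;
# substitute your real paths when troubleshooting.
mkdir -p demo_output/temp/WT-200 demo_log
printf 'ok\n' > demo_log/getMergeConsensus.WT-100.log
printf 'Traceback (most recent call last): consensus step failed\n' \
    > demo_log/getMergeConsensus.WT-200.log

# 1. Does the consensus fastq actually exist?
f=demo_output/temp/WT-200/WT-200_cns.fq
[ -f "$f" ] && echo "present: $f" || echo "missing: $f"

# 2. The biggest per-step log usually holds the real error message:
biggest=$(ls -S demo_log/getMergeConsensus.* | head -n 1)
echo "largest log: $biggest"
cat "$biggest"
```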

There is also a small bash script in the GitHub repo that you can run to find error messages amongst the log files... though I never use it, so I can't guide you beyond pointing out that it exists - errorcheck.txt

spencer411 commented 4 years ago

Okay... so I went back and did some digging. WT-200 was part of an earlier job that was killed by a power outage, not one of the ones I was currently running (which did not finish and merge). These isolates have been in my "merge to" folder for a while, and I have been able to merge things into that folder before with this isolate (and ones like it) in there. That said, I am not sure why this would halt the pipeline now when it hasn't in the past (and maybe it didn't). Before, I would just get a .txt error message related to isolates like WT-200 in the output folder after every merge. Both of the sets I am trying to merge have the cns.fq files, so something else is keeping it from finishing. Looking at the error out files in the log folder, there are several that read (including getCoverage):

[mpileup] 1 samples in 1 input files
Set max per-file depth to 8000
[mpileup] 1 samples in 1 input files
Set max per-file depth to 8000

So... maybe the cns.fq error has nothing to do with my job not finishing?

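For what it's worth, the [mpileup] and max-depth lines are routine samtools banners, not failures; a quick way to skim past them is to grep the error logs for actual error keywords. A minimal sketch, with hypothetical log names and a demo folder so it runs standalone:

```shell
mkdir -p demo_errlogs
printf '[mpileup] 1 samples in 1 input files\nSet max per-file depth to 8000\n' \
    > demo_errlogs/getCoverage.WT-100.err
printf '[mpileup] 1 samples in 1 input files\nsamtools: error reading input\n' \
    > demo_errlogs/getCoverage.WT-200.err

# List only the logs containing something that looks like a genuine error:
grep -E -il 'error|fail|traceback' demo_errlogs/*.err
```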
d-j-e commented 4 years ago

Add a '--style print' to your reddog command - this will print out all the jobs for each step in the pipeline, including those that have completed and those that need to run...

BTW just noticed in the manual the instructions for errorcheck.txt "To run ‘errorcheck.txt’, first ‘cd’ to the RedDog folder with the log folder you wish to search. Then enter: ./errorcheck.txt and the script will immediate(ly) launch."

Pretty sure WT-220 failed at the consensus step - may be due to a random system error (they happen) or corruption of the bam. You may have to replace it - if need be, do a RedDog run using the appropriate reads and reference, then drop the bam file from that run into your master set, replacing the old one.
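One quick corruption check: an abruptly killed run often leaves a truncated BAM missing its terminating BGZF EOF block. samtools quickcheck is the standard test, but the fixed 28-byte EOF marker (defined in the SAM/BAM spec) can also be inspected directly. A sketch against a demo file (swap in the real bam path for the suspect isolate):

```shell
# A structurally complete BAM ends with this fixed 28-byte BGZF EOF block.
expected='1f8b08040000000000ff0600424302001b0003000000000000000000'

# Demo "bam" consisting of just a valid EOF block; replace demo.bam with
# the suspect file, e.g. the WT-220 bam.
printf '\037\213\010\004\000\000\000\000\000\377\006\000\102\103\002\000\033\000\003\000\000\000\000\000\000\000\000\000' > demo.bam

actual=$(tail -c 28 demo.bam | od -An -tx1 | tr -d ' \n')
if [ "$actual" = "$expected" ]; then
    echo "EOF block intact"
else
    echo "possibly truncated"
fi
```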

spencer411 commented 4 years ago

Okay, so I ran error-check and got the following:

-bash-4.2$ ./errorcheck.txt
Checking log....
You have NO errors! YAYE!

Note that the WT-220 file (and many other problematic files) never shows up in the final .csv because the pipeline was stopped abruptly while running them (using scancel, or due to multiple power outages last year). I was able to rename these files and run them through again with different names, no problem. In the past, I did get a bunch of consensus warning text files after merging with the pipeline from the old problematic files, but it still worked (until now). So... I am not expecting WT-220 to be in my final file outputs; I am expecting to get a consensus warning file in the output folder for it. Maybe what I need to do is remove the problematic isolates from the merge-to folder completely?
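If removing the problem isolates turns out to be the way forward, quarantining rather than deleting them keeps the option of restoring later. A hypothetical sketch (the merge-to layout here is illustrative, not RedDog's documented structure):

```shell
# Demo stand-in for the merge-to folder; use your real paths.
mkdir -p merge_to/temp/WT-220 quarantine
touch merge_to/temp/WT-220/WT-220.bam

# Move the problematic isolate aside instead of deleting it:
mv merge_to/temp/WT-220 quarantine/
ls quarantine
```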

Will try to start the pipeline back up with the --style print option and see what happens...

spencer411 commented 4 years ago

See --style print output here. Looks like it should work fine... slurm.rhea-05.790710.txt

Maybe I just need to increase some walltime?

d-j-e commented 4 years ago

Yes - try doubling it...
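Since the jobs run under SLURM (per the slurm.* log names), doubling the walltime means doubling the scheduler's time limit for the failing step. A small hypothetical helper for computing the new HH:MM:SS value:

```shell
# Hypothetical helper: double an HH:MM:SS walltime string (e.g. for a
# SLURM --time limit or a pipeline config value).
double_walltime() {
    echo "$1" | awk -F: '{ t = ($1*3600 + $2*60 + $3) * 2
        printf "%02d:%02d:%02d\n", t/3600, (t%3600)/60, t%60 }'
}

double_walltime 24:00:00   # doubles a 24 h limit to 48:00:00
double_walltime 02:30:00
```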