benedictpaten / marginAlign

UCSC Nanopore
MIT License

raise RuntimeError("Got failed jobs") RuntimeError: Got failed jobs #43

Closed herrroaa closed 6 years ago

herrroaa commented 6 years ago

Hi, I used marginAlign to align my reads, which was successful. I then used marginCaller to call SNPs. marginCaller took a long time (it ran for more than 12 hours) and ended with the error below. How can I solve this?

./marginAlign BC05.fastq NCBI.GRCh38.fa BC05_noalignchain.sam --minimap2 --jobTree ./jobTree --noChain --noRealign

./marginCaller BC01_noalignchain.sam NCBI.GRCh38.fa BC01.vcf --noMargin --jobTree ./jobTree

/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/src/margin/mappers/last_hmm_20.txt 0.3
The job seems to have left a log file, indicating failure: /projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/jobTree/jobs/job
Reporting file: /projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/jobTree/jobs/log.txt
log.txt: ---JOBTREE SLAVE OUTPUT LOG---
log.txt: Traceback (most recent call last):
log.txt: File "/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/submodules/jobTree/src/jobTreeSlave.py", line 271, in main
log.txt: defaultMemory=defaultMemory, defaultCpu=defaultCpu, depth=depth)
log.txt: File "/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/submodules/jobTree/scriptTree/stack.py", line 153, in execute
log.txt: self.target.run()
log.txt: File "/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/submodules/jobTree/scriptTree/target.py", line 197, in run
log.txt: func(*((self,) + tuple(self.args)), *self.kwargs)
log.txt: File "/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/src/margin/marginCallerLib.py", line 211, in variantCallSamFileTargetFn
log.txt: evolutionarySubstitutionMatrix, errorSubstitutionMatrix)
log.txt: File "/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/src/margin/marginCallerLib.py", line 89, in calcBasePosteriorProbs
log.txt: observedBase))baseObservations[observedBase], BASES)), BASES)
log.txt: File "/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/src/margin/marginCallerLib.py", line 86, in
log.txt: math.log(getProb(evolutionarySubstitionMatrix, refBase.upper(), missingBase)) +
log.txt: File "/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/src/margin/marginCallerLib.py", line 79, in getProb
log.txt: return subMatrix[(start, end)]
log.txt: KeyError: ('N', 'A')
log.txt: Exiting the slave because of a failed job on host qnode5161
log.txt: Due to failure we are reducing the remaining retry count of job /projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/jobTree/jobs/job to 0
log.txt: We have set the default memory of the failed job to 2147483648 bytes
Job: /projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/jobTree/jobs/job is completely failed
Traceback (most recent call last):
File "./src/margin/marginCaller.py", line 63, in main()
File "/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/src/margin/marginCaller.py", line 59, in main
raise RuntimeError("Got failed jobs")
RuntimeError: Got failed jobs

Thanks

mitenjain commented 6 years ago

I think this is due to the reference file containing the "N" character in some sequences (I think that the human chrM contains one). The slowdown comes from running at the scale of the whole human genome. We are in the process of improving how marginAlign scales to larger datasets and larger reference genomes.
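To illustrate the suspected cause, here is a minimal Python sketch, not marginCaller's actual code. It assumes the substitution matrix is a dictionary keyed by pairs of the four bases, which is what the `subMatrix[(start, end)]` line in the traceback suggests. The names `sub_matrix`, `get_prob`, and `count_non_acgt` and the placeholder probabilities are hypothetical. The helper reports how many non-ACGT characters each reference sequence contains, and the last lines show the same kind of `KeyError` for an "N" reference base.

```python
from itertools import product

BASES = "ACGT"

# Hypothetical stand-in for a substitution matrix keyed by (start, end) base pairs.
# The 0.25 values are placeholders, not marginCaller's real parameters.
sub_matrix = {pair: 0.25 for pair in product(BASES, repeat=2)}


def get_prob(matrix, start, end):
    """Look up a substitution probability; fails for bases outside ACGT."""
    return matrix[(start, end)]


def count_non_acgt(fasta_lines):
    """Return {sequence name: count of characters outside ACGT} for FASTA lines."""
    counts = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            counts[name] = 0
        elif name is not None:
            counts[name] += sum(1 for base in line.upper() if base not in BASES)
    return counts


# Made-up records, only to show the output shape.
example = [">chr_demo1", "ACGTNNAC", "gtac", ">chr_demo2", "ACGT"]
print(count_non_acgt(example))  # {'chr_demo1': 2, 'chr_demo2': 0}

try:
    get_prob(sub_matrix, "N", "A")
except KeyError as err:
    print("KeyError:", err)  # KeyError: ('N', 'A')
```

To check a real reference, pass an open file handle instead of the list, for example `count_non_acgt(open("NCBI.GRCh38.fa"))`. Any sequence with a non-zero count is a candidate for triggering the error above.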

Could you try removing Ns from the reference and then running on smaller sets of data, for example a chromosome at a time, as a test?
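One way to try this is sketched below in Python: it strips "N"/"n" characters from each reference sequence and writes one FASTA file per sequence, so that each chromosome can be tested separately. This is an assumption-laden sketch and not part of marginAlign. The function names, the `.noN.fa` suffix, and the output layout are all hypothetical. Note that deleting Ns shifts coordinates, so positions in any resulting VCF refer to the stripped reference, and the reads would need to be realigned against it. Other ambiguity codes (R, Y, and so on) are not handled.

```python
import os


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def strip_ns(sequence):
    """Delete N/n characters. This shortens the sequence and shifts coordinates."""
    return sequence.replace("N", "").replace("n", "")


def format_record(header, sequence, width=60):
    """Format one FASTA record with fixed-width sequence lines."""
    body = "\n".join(sequence[i:i + width] for i in range(0, len(sequence), width))
    return ">{}\n{}\n".format(header, body)


def split_reference(in_path, out_dir):
    """Write one N-free FASTA file per sequence and return the written paths.

    Assumes every header is non-empty and that its first word is safe to use
    as a file name.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    with open(in_path) as handle:
        for header, sequence in read_fasta(handle):
            name = header.split()[0]
            out_path = os.path.join(out_dir, name + ".noN.fa")
            with open(out_path, "w") as out:
                out.write(format_record(header, strip_ns(sequence)))
            written.append(out_path)
    return written
```

For example, `split_reference("NCBI.GRCh38.fa", "per_chrom")` would write one file per sequence into a `per_chrom` directory (a hypothetical name). A single per-chromosome file can then be used as the reference for a small test run.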

mitenjain commented 6 years ago

Were you able to fix this?