Closed herrroaa closed 6 years ago
I think this is due to the reference file containing the "N" character in some sequences (I think the human chrM contains one). The slowdown comes from running at the scale of the whole human genome. We are in the process of improving how marginAlign scales to larger datasets and larger reference genomes.
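To illustrate why an "N" in the reference would produce the KeyError: ('N', 'A') reported in this issue, here is a minimal sketch. It is not the marginCaller code: `sub_matrix`, `get_prob` and `lookup_fails` are hypothetical names, and the probabilities are made up. It only assumes the substitution matrix is a dict keyed by (reference base, observed base) over A, C, G and T, which is what the failing `subMatrix[(start, end)]` lookup in the traceback suggests.

```python
from itertools import product

BASES = "ACGT"

# Hypothetical stand-in for a substitution matrix keyed by
# (reference base, observed base). The values are illustrative only.
sub_matrix = {
    (ref, obs): (0.97 if ref == obs else 0.01)
    for ref, obs in product(BASES, repeat=2)
}

def get_prob(matrix, start, end):
    # Plain dict lookup: any base outside ACGT raises KeyError.
    return matrix[(start, end)]

def lookup_fails(ref_base, observed_base):
    """Return True if the lookup raises KeyError for this pair of bases."""
    try:
        get_prob(sub_matrix, ref_base.upper(), observed_base)
        return False
    except KeyError:
        return True

print(lookup_fails("N", "A"))  # an N in the reference has no matrix entry
print(lookup_fails("a", "A"))  # lower-case reference bases are upper-cased first
```

If the real matrix is built the same way, any reference position holding an N would fail like this, no matter how many reads cover it.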
Could you try removing Ns from the reference and then running on smaller sets of data, for example a chromosome at a time, as a test?
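A rough sketch of that test setup, assuming a plain (uncompressed) FASTA reference: it writes one file per sequence with the N characters deleted and reports how many were removed from each. `read_fasta` and `split_and_strip_n` are hypothetical helpers, not part of marginAlign. Note that deleting Ns shifts every downstream coordinate, so this is only useful for a test run, and the reads would need to be re-aligned against the cleaned file.

```python
import os

def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def split_and_strip_n(fasta_path, out_dir, width=60):
    """Write one N-free FASTA per sequence; return {name: number of Ns removed}."""
    os.makedirs(out_dir, exist_ok=True)
    removed = {}
    for header, seq in read_fasta(fasta_path):
        # Assumes the first word of the header is safe to use as a file name.
        name = header.split()[0]
        cleaned = seq.replace("N", "").replace("n", "")
        removed[name] = len(seq) - len(cleaned)
        with open(os.path.join(out_dir, name + ".fa"), "w") as out:
            out.write(">" + header + "\n")
            for i in range(0, len(cleaned), width):
                out.write(cleaned[i:i + width] + "\n")
    return removed
```

For example, `split_and_strip_n("NCBI.GRCh38.fa", "per_chrom")` would leave one file per chromosome in a `per_chrom` directory (a name chosen here for illustration), and the returned counts show which sequences contained Ns in the first place.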
Were you able to fix this?
Hi, I used marginAlign to align my reads, which was successful. I then used marginCaller to call SNPs, but marginCaller took a long time (it ran for more than 12 hours). How can I solve this?
./marginAlign BC05.fastq NCBI.GRCh38.fa BC05_noalignchain.sam --minimap2 --jobTree ./jobTree --noChain --noRealign
./marginCaller BC01_noalignchain.sam NCBI.GRCh38.fa BC01.vcf --noMargin --jobTree ./jobTree
/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/src/margin/mappers/last_hmm_20.txt 0.3
The job seems to have left a log file, indicating failure: /projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/jobTree/jobs/job
Reporting file: /projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/jobTree/jobs/log.txt
log.txt: ---JOBTREE SLAVE OUTPUT LOG---
log.txt: Traceback (most recent call last):
log.txt: File "/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/submodules/jobTree/src/jobTreeSlave.py", line 271, in main
log.txt: defaultMemory=defaultMemory, defaultCpu=defaultCpu, depth=depth)
log.txt: File "/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/submodules/jobTree/scriptTree/stack.py", line 153, in execute
log.txt: self.target.run()
log.txt: File "/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/submodules/jobTree/scriptTree/target.py", line 197, in run
log.txt: func(*((self,) + tuple(self.args)), *self.kwargs)
log.txt: File "/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/src/margin/marginCallerLib.py", line 211, in variantCallSamFileTargetFn
log.txt: evolutionarySubstitutionMatrix, errorSubstitutionMatrix)
log.txt: File "/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/src/margin/marginCallerLib.py", line 89, in calcBasePosteriorProbs
log.txt: observedBase))baseObservations[observedBase], BASES)), BASES)
log.txt: File "/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/src/margin/marginCallerLib.py", line 86, in
log.txt: math.log(getProb(evolutionarySubstitionMatrix, refBase.upper(), missingBase)) +
log.txt: File "/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/src/margin/marginCallerLib.py", line 79, in getProb
log.txt: return subMatrix[(start, end)]
log.txt: KeyError: ('N', 'A')
log.txt: Exiting the slave because of a failed job on host qnode5161
log.txt: Due to failure we are reducing the remaining retry count of job /projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/jobTree/jobs/job to 0
log.txt: We have set the default memory of the failed job to 2147483648 bytes
Job: /projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/jobTree/jobs/job is completely failed
Traceback (most recent call last):
File "./src/margin/marginCaller.py", line 63, in <module>
main()
File "/projects/b1042/BurridgeLab/slc28a3_resequencing/marginAlign/src/margin/marginCaller.py", line 59, in main
raise RuntimeError("Got failed jobs")
RuntimeError: Got failed jobs
Thanks