iqbal-lab-org / gramtools

Genome inference from a population reference graph
MIT License

Unable to run the gramtools discover command successfully #160

Closed duzezhen closed 2 years ago

duzezhen commented 3 years ago

I am using gramtools discover to find structural variants. It runs normally on data at sequencing depths of 2.5 and 10, but when the sequencing depth goes above 10 the following error occurs. I don't know what went wrong and hope you can help look into it.

These are the commands I ran:

  1. /usr/bin/time -v singularity exec ~/software/gramtools.sif gramtools build -o ./build --ref ../refgenome.fa --vcf an.vcf --force > gramtools.log &
  2. module load Singularity/3.7.3; singularity exec /public/home/software/gramtools.sif gramtools genotype --debug --force --sample_id cn -i /public/home/analyze/gramtools/an/build -o /public/home/analyze/gramtools/cn/genotype/150-500/25 --reads /public/home/analyze/gramtools/cn/genotype/150-500/f25.fq.gz /public/home/analyze/gramtools/cn/genotype/150-500/f25.fq.gz
  3. module load Singularity/3.7.3; singularity exec /public/home/software/gramtools.sif gramtools discover --force -i /public/home/analyze/gramtools/cn/genotype/150-500/25 -o /public/home/analyze/gramtools/cn/discover/150-500/25

'''
2021-09-29 14:33:17,217 gramtools INFO Start process: discover
Error running this command: ['/usr/local/lib/python3.8/dist-packages/cortex/ext/cortex/scripts/calling/', '--first_kmer', '31', '--fastaq_index', '/tmp/tmp4lo4eavs/cortex_reads_in.index', '--auto_cleaning', 'yes', '--bc', 'yes', '--pd', 'no', '--outdir', '/tmp/tmp4lo4eavs/cortex_output', '--outvcf', 'cortex', '--ploidy', '1', '--minimap2_bin', '/usr/local/lib/python3.8/dist-packages/cortex/ext/minimap2', '--list_ref_fasta', '/tmp/tmp4lo4eavs/cortex_in_index_ref.fofn', '--refbindir', '/tmp/tmp4lo4eavs/indexes', '--genome_size', '20705175', '--qthresh', '5', '--mem_height', '22', '--mem_width', '100', '--vcftools_dir', '/usr/local/lib/python3.8/dist-packages/cortex/ext/vcftools', '--do_union', 'yes', '--ref', 'CoordinatesAndInCalling', '--workflow', 'independent', '--logfile', '/tmp/tmp4lo4eavs/cortex.log']
Return code: 2
Output from stdout and stderr:
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_CTYPE = "C.UTF-8",
    LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
Unable to build /tmp/tmp4lo4eavs/cortex_output/binaries/uncleaned/31/sample.unclean.kmer31.q5.ctx at /usr/local/lib/python3.8/dist-packages/cortex/ext/cortex/scripts/calling/ line 2137.

Please refer to cortex log file at /tmp/tmp4lo4eavs/cortex.log for more information.
RuntimeError: Error in system call. Cannot continue
'''

bricoletc commented 3 years ago

Is there any chance your fasta reference is gzipped? cortex (the variant caller used by the discover command) silently fails if the reference is not in plain fasta.

Otherwise, I think I would need access to your dataset (probably the whole directory /public/home/analyze/gramtools/cn/genotype/150-500/25), or some kind of MWE, to debug further.
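
For anyone wanting to rule out the gzipped-reference case quickly, here is a minimal sketch of such a check. It is not part of gramtools or cortex. The helper names `is_gzipped` and `looks_like_plain_fasta` are hypothetical, and the demo uses throwaway temporary files rather than the real reference.

```python
import gzip
import os
import tempfile

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def looks_like_plain_fasta(path):
    """Return True if the first byte is '>', as expected for an uncompressed FASTA."""
    with open(path, "rb") as handle:
        return handle.read(1) == b">"


def demo():
    # Hypothetical toy record, written once as plain text and once gzipped.
    record = b">chr1\nACGT\n"
    with tempfile.TemporaryDirectory() as tmpdir:
        plain = os.path.join(tmpdir, "ref.fa")
        zipped = os.path.join(tmpdir, "ref.fa.gz")
        with open(plain, "wb") as handle:
            handle.write(record)
        with gzip.open(zipped, "wb") as handle:
            handle.write(record)
        for path in (plain, zipped):
            print(os.path.basename(path),
                  "gzipped:", is_gzipped(path),
                  "plain fasta:", looks_like_plain_fasta(path))


if __name__ == "__main__":
    demo()
```

Checking the leading bytes rather than the file extension matters here, since a gzipped file can carry a `.fa` name. In practice one would call `is_gzipped` on the path passed to `--ref` in the build step.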

duzezhen commented 2 years ago

I'm very sorry for the late reply. The server had a problem that was only resolved yesterday. I tried running it a few more times and still got the same error. I don't think the problem is my reference sequence, because it runs normally on 5× or 10× data. I have put my data on OneDrive; please check whether it can be downloaded. If not, I will upload it to another file-sharing service. Thank you!

bricoletc commented 2 years ago

Hi @duzezhen, I wanted to try debugging this with your data, but I'm afraid I won't have time; I'm swamped by the last year of my PhD. Apologies.

duzezhen commented 2 years ago

Hello bricoletc!

That's alright, I wish you the best of luck.