iqbal-lab-org / gramtools

Genome inference from a population reference graph
MIT License

Unable to run the gramtools discover command successfully #160

Closed duzezhen closed 2 years ago

duzezhen commented 3 years ago

I am using gramtools discover to find structural variants. It runs normally on data at sequencing depths of 2.5× and 10×, but at higher depths the following error occurs. I don't know what went wrong and hope you can help me look into it.

These are the commands I ran:

  1. /usr/bin/time -v singularity exec ~/software/gramtools.sif gramtools build -o ./build --ref ../refgenome.fa --vcf an.vcf --force > gramtools.log &
  2. module load Singularity/3.7.3; singularity exec /public/home/software/gramtools.sif gramtools genotype --debug --force --sample_id cn -i /public/home/analyze/gramtools/an/build -o /public/home/analyze/gramtools/cn/genotype/150-500/25 --reads /public/home/analyze/gramtools/cn/genotype/150-500/f25.fq.gz /public/home/analyze/gramtools/cn/genotype/150-500/f25.fq.gz
  3. module load Singularity/3.7.3; singularity exec /public/home/software/gramtools.sif gramtools discover --force -i /public/home/analyze/gramtools/cn/genotype/150-500/25 -o /public/home/analyze/gramtools/cn/discover/150-500/25

'''
2021-09-29 14:33:17,217 gramtools INFO Start process: discover
Error running this command: ['/usr/local/lib/python3.8/dist-packages/cortex/ext/cortex/scripts/calling/run_calls.pl', '--first_kmer', '31', '--fastaq_index', '/tmp/tmp4lo4eavs/cortex_reads_in.index', '--auto_cleaning', 'yes', '--bc', 'yes', '--pd', 'no', '--outdir', '/tmp/tmp4lo4eavs/cortex_output', '--outvcf', 'cortex', '--ploidy', '1', '--minimap2_bin', '/usr/local/lib/python3.8/dist-packages/cortex/ext/minimap2', '--list_ref_fasta', '/tmp/tmp4lo4eavs/cortex_in_index_ref.fofn', '--refbindir', '/tmp/tmp4lo4eavs/indexes', '--genome_size', '20705175', '--qthresh', '5', '--mem_height', '22', '--mem_width', '100', '--vcftools_dir', '/usr/local/lib/python3.8/dist-packages/cortex/ext/vcftools', '--do_union', 'yes', '--ref', 'CoordinatesAndInCalling', '--workflow', 'independent', '--logfile', '/tmp/tmp4lo4eavs/cortex.log']
Return code: 2
Output from stdout and stderr:
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_CTYPE = "C.UTF-8",
    LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
Unable to build /tmp/tmp4lo4eavs/cortex_output/binaries/uncleaned/31/sample.unclean.kmer31.q5.ctx at /usr/local/lib/python3.8/dist-packages/cortex/ext/cortex/scripts/calling/run_calls.pl line 2137.

Please refer to cortex log file at /tmp/tmp4lo4eavs/cortex.log for more information.
RuntimeError: Error in system call. Cannot continue
'''
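Since the failure seems to depend on sequencing depth, it may help to confirm the actual depth of each read file. The sketch below is only an illustration, not part of gramtools: `estimate_depth` is a hypothetical helper that divides the total number of sequenced bases by the genome size (the log above reports `--genome_size 20705175`). It assumes standard four-line FASTQ records with unwrapped sequence lines.

```python
import gzip


def estimate_depth(fastq_paths, genome_size):
    """Estimate mean sequencing depth as total read bases / genome size.

    Hypothetical helper; assumes each FASTQ record spans exactly four lines.
    """
    total_bases = 0
    for path in fastq_paths:
        # Read gzipped files transparently, based on the file extension.
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt") as handle:
            for line_number, line in enumerate(handle):
                # The sequence is the second line of each four-line record.
                if line_number % 4 == 1:
                    total_bases += len(line.strip())
    return total_bases / genome_size
```

For example, `estimate_depth(["f25.fq.gz"], 20705175)` would return the mean depth for that read file, given the genome size from the log.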

bricoletc commented 3 years ago

Hello!

Is there any chance your FASTA reference is gzipped? cortex (the variant caller used by the discover command) silently fails if the reference is not in plain FASTA.

Otherwise, I think I would need access to your dataset (probably the whole directory public/home/analyze/gramtools/cn/genotype/150-500/25), or some kind of MWE, to debug further.
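The gzip question above can be checked quickly without relying on the file extension. This is a minimal sketch, not something gramtools or cortex provides: `is_gzipped` and `looks_like_plain_fasta` are hypothetical helper names. It uses the fact that gzip files start with the two bytes `0x1f 0x8b`, and that a plain FASTA file starts with `>`.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file starts with the gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def looks_like_plain_fasta(path):
    """Return True if the file is not gzipped and its first byte is '>'."""
    if is_gzipped(path):
        return False
    with open(path, "rb") as handle:
        return handle.read(1) == b">"
```

Calling `looks_like_plain_fasta("../refgenome.fa")` on the reference passed to `gramtools build` would show whether it is uncompressed FASTA. It only inspects the first bytes, so it does not validate the rest of the file.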

duzezhen commented 2 years ago

I'm very sorry for the late reply. The server had a problem that was only resolved yesterday. I tried running it a few more times and still got the same error. I don't think the problem is with my reference sequence, because I can run normally on 5× or 10× data. I have put my data on OneDrive; please check whether it can be downloaded. If not, I will upload it to another file host. Thank you!

https://fakedata-my.sharepoint.com/:u:/g/personal/i_oldzhg_com/EX1ewp1knF9IvUGHTjVeEe8B-f7AjYvLxBkFNVWM_6rUeQ?e=tdqzwK

bricoletc commented 2 years ago

Hi @duzezhen, I wanted to try debugging this with your data, but I'm afraid I won't have time, as I'm swamped by the last year of my PhD. Apologies.

duzezhen commented 2 years ago

Hello bricoletc!

That's all right, I wish you the best of luck.