DecodeGenetics / graphtyper

Population-scale genotyping using pangenome graphs
http://dx.doi.org/10.1038/ng.3964
MIT License

genotype_sv crash with v2.4 #35

Closed marqueda closed 4 years ago

marqueda commented 4 years ago

I tried running genotype_sv with the new release 2.4, but it immediately crashes and issues the following error message "cannot create std::vector larger than max_size()". I tried the same command with the older release 2.2 and it started running fine.

Here is the command I used (I renamed the binary graphtyper2.4 on my machine):

graphtyper2.4 genotype_sv ref.fasta manta.SVs.vcf.gz --sams=bams --region_file=region.9.file --threads=1 --log GraphTyperSV.Run.9.log -v --output=allGenomes.SV

Given that I used all the BAM files, it would not be easy to share the data. Let me know if I should try again on a smaller dataset that I could share with you to reproduce the issue.

Best, David

marqueda commented 4 years ago

Just tried the release version v2.3 and get the same segfault and message.

hannespetur commented 4 years ago

Hey David, that's odd. I can't find this error message anywhere in the code. Can you run with --vverbose (very verbose) and --log=log and send me the log file so I can get a better idea of where it might be occurring?

Best, Hannes

marqueda commented 4 years ago

Thanks for the super fast reply! Here is the logfile.

hannespetur commented 4 years ago

Thank you. Looks like it's in the graph construction, which is good since then I won't need any BAMs to reproduce. Are you able to share the VCF? Only the region chr9:1-1200000 should be enough.

Best, Hannes
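
The region mentioned above (chr9:1-1200000) can be cut out of an uncompressed VCF stream with a few lines of Python when no index is at hand. This is only a minimal sketch: the function name `filter_region` is hypothetical, it assumes a standard tab-separated VCF, and it selects records by their POS alone, so an SV that starts before the region but extends into it is not kept.

```python
def filter_region(lines, chrom, start, end):
    """Yield header lines plus records with CHROM == chrom and start <= POS <= end.

    `lines` is any iterable of VCF text lines (1-based, inclusive coordinates).
    """
    for line in lines:
        if line.startswith("#"):
            yield line  # keep all header lines so the output is still a VCF
            continue
        fields = line.split("\t", 2)
        if len(fields) >= 2 and fields[0] == chrom and start <= int(fields[1]) <= end:
            yield line
```

For a compressed file, the lines can come from `gzip.open(path, "rt")`, for example `filter_region(gzip.open(path, "rt"), "chr9", 1, 1200000)`.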

marqueda commented 4 years ago

Sure, that's quite a small file. This file includes svimmer-merged Manta calls and I applied the awk-fix from an earlier issue.

Same result though with a VCF file without the awk-fix. I also tried another version where I merged SNPs / short SVs from graphtyper genotype with the Manta / svimmer merged large SV calls. I merged them with bcftools concat -a -d all and sorted them with bcftools sort. Maybe that would give additional problems in case of duplicate positions / overlapping positions?
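
One way to look into the question about duplicate positions is to scan the merged VCF for records that share the same CHROM and POS. The following is a minimal sketch and not part of graphtyper or bcftools; the function name `find_duplicate_positions` is hypothetical, and it assumes a standard tab-separated VCF. Whether such duplicates actually cause problems for graphtyper is not established in this thread.

```python
from collections import Counter


def find_duplicate_positions(lines):
    """Return {(chrom, pos): count} for positions that occur more than once.

    `lines` is any iterable of VCF text lines; header and blank lines are skipped.
    """
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos = line.split("\t", 2)[:2]
        counts[(chrom, int(pos))] += 1
    return {key: n for key, n in counts.items() if n > 1}
```

An empty dictionary means every (CHROM, POS) pair is unique. For a compressed file, the lines can be read with `gzip.open(path, "rt")`.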

hannespetur commented 4 years ago

Sorry, I forgot that I also need the reference FASTA. Could you upload that as well?

Best, Hannes

marqueda commented 4 years ago

Sure, here it is: Link. Please let me know once you have downloaded it - I will take it down again, as it is still unpublished...

hannespetur commented 4 years ago

Thank you. Download is finished.

hannespetur commented 4 years ago

Seems like I was creating a vector with some invalid iterators in some very rare cases 😕 Looks like it's fixed now. I guess the error message is somewhere within the C++ STL.

Could you try this very unofficial graphtyper binary? I would like to make some more tests before making a proper release but I hope this will do in the meantime.

Best, Hannes

marqueda commented 4 years ago

Great, thanks for fixing this bug! It is working now, or at least there is no immediate segfault anymore. I will let you know whether it finishes calling the whole chromosome.

marqueda commented 4 years ago

Dear Hannes, Sorry for the slow reply, our cluster had an issue and I wasn't able to run jobs until yesterday. I am now getting another crash with the beta version you posted above, with the error message being "std::bad_alloc".

I am running the following code:

./graphtyper_beta genotype_sv ${ref} ${vcf} --sams=bams --region_file=region.${LSB_JOBINDEX}.file \
--threads=34 --log GraphTyperSV.Run.${LSB_JOBINDEX}.log -v --output=allGenomes.SV --vverbose

And I have provided 34 cores that allow hyperthreading and 85 GB memory. The reporting also notes that the job took max. 56 GB memory and ca. 70% of the cores I reserved in the submission system. Do you have an idea where the allocation error might arise from? Here is the --vverbose logfile output by beta version graphtyper: Link.

Thank you for your help! David

hannespetur commented 4 years ago

Hey David. It seems there was another bug in the version I posted previously and I think it might be the cause. But I have now fixed it. Can you try the new version 2.5?

https://github.com/DecodeGenetics/graphtyper/releases/download/v2.5/graphtyper

Thank you very much for all the feedback, it has really been helping me spot my mistakes.

Best, Hannes

marqueda commented 4 years ago

Dear Hannes, You are welcome and thank you back for the quick responses and bugfixes! I have tried running the new release, but it crashes with the error message "vector::_M_range_check: __n (which is 5141) >= this->size() (which is 5141)". Here is the --vverbose log file again: Link. Best, David

hannespetur commented 4 years ago

Ok, let's give this version a shot at it: https://www.dropbox.com/s/5im1ta8t77b3ck2/graphtyper?dl=0

marqueda commented 4 years ago

Thanks! Unfortunately, our cluster login nodes were just shut down due to an alleged cyber-attack on many HPC systems worldwide... I will let you know once I can access everything again and run the next test.

hannespetur commented 4 years ago

Sorry to hear that, I hope the situation will be resolved soon.

marqueda commented 4 years ago

Back in the game. The beta version runs (no segfault) but issues the following error message shortly after starting, after which graphtyper shuts down:

[E::hts_idx_push] Unsorted positions on sequence #1: 8840 followed by 7567
cp: cannot stat ‘/scratch/123193665_9.tmpdir/graphtyper_200515_081919_chr9_000000001.6yLLm3/graphtyper.vcf.gz.tbi’: No such file or directory

The last two logfile lines are the following, full log file here again: Link.

[2020-05-15 08:12:33.027060] <warning> vcf.cpp:1191 Could not build VCF index
[2020-05-15 08:12:36.216295] <error> This command failed 'cp -p /scratch/123192880_9.tmpdir/graphtyper_200515_080150_chr9_000000001.2YIUaV/graphtyper.vcf.gz.tbi allGenomes.SV/chr9/000000001-001000000.vcf.gz.tbi'

I tried to load the samtools / tabix module in the job on our cluster, but the same error message appeared, so I assume it cannot find the dependency?

I also checked the input VCF file with structural variants and position 7567 is actually before 8840, and I have sorted it before with bcftools sort. So I don't think this could be the issue?
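
The htslib message reports a record at position 7567 following one at 8840 on the same sequence. A check of the same kind can be applied to any VCF, whether the input or an intermediate output file, to confirm the sort order. This is a minimal sketch under the assumption of a standard tab-separated VCF; the function name `find_unsorted_records` is hypothetical and not part of htslib or graphtyper.

```python
def find_unsorted_records(lines):
    """Yield (chrom, previous_pos, pos) wherever POS decreases within a contig.

    `lines` is any iterable of VCF text lines; header and blank lines are skipped.
    """
    last_chrom, last_pos = None, None
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos = line.split("\t", 2)[:2]
        pos = int(pos)
        # Only compare against the previous record when it is on the same contig.
        if chrom == last_chrom and pos < last_pos:
            yield chrom, last_pos, pos
        last_chrom, last_pos = chrom, pos
```

If `list(find_unsorted_records(lines))` is empty for the input VCF, the unsorted positions must come from somewhere other than the input ordering.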

hannespetur commented 4 years ago

It won't need to find the dependency, htslib (which has tabix functions) is statically linked with graphtyper so the code is there. It must be something else. Could you send me /scratch/123192880_9.tmpdir/graphtyper_200515_080150_chr9_000000001.2YIUaV/graphtyper.vcf.gz ? (it is enough to only give me the variant site information which is the first 8 columns of the VCF).
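
Reducing a VCF to its variant site information means keeping the eight fixed columns (CHROM through INFO) and dropping FORMAT and the per-sample columns. A minimal sketch of that step, assuming a standard tab-separated VCF; the function name `sites_only` is hypothetical:

```python
def sites_only(lines):
    """Yield VCF lines truncated to the 8 fixed columns (CHROM..INFO).

    `lines` is any iterable of VCF text lines.
    """
    for line in lines:
        if line.startswith("##"):
            yield line  # meta-information lines pass through unchanged
        else:
            # The #CHROM header line and all records are cut to 8 columns.
            fields = line.rstrip("\n").split("\t")
            yield "\t".join(fields[:8]) + "\n"
```

The result is much smaller than the full file and contains no genotype data, which makes it easier to share.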

hannespetur commented 4 years ago

Never mind, no need to send me the VCF. I simulated some reads on your reference and could reproduce your problem. It should now be fixed in the new v2.5.1.

Best, Hannes

marqueda commented 4 years ago

Dear Hannes,

Sorry for the late reply, I was offline since Friday mid-day. Unfortunately our computer cluster has been closed again due to the cyber attack on ~16 European HPC centers. It's unknown when they will open again (days to weeks), but I will let you know once I gave the new version v2.5.1 a try!

Thanks again and best, David

marqueda commented 4 years ago

Dear Hannes, Our cluster is back to life and graphtyper_sv 2.5.1 is running successfully, both on SVs only and on a VCF file with SNPs/indels and Manta SVs. Thanks for fixing the issue! Best, David

hannespetur commented 4 years ago

Great! Thanks.