eblerjana / pangenie

Pangenome-based genome inference
MIT License

Assertion error when running PanGenie with my VCF file #49

Open LeoHongboWANG opened 1 year ago

LeoHongboWANG commented 1 year ago

Hello,

When I run PanGenie with my VCF file, I get the following error message:

GraphBuilder: skip variant at chr1:1000404 since it is contained in a previous one.
PanGenie:/biosoft/pangenie/src/graph.cpp:80: void Graph::add_variant_cluster(std::vector<std::shared_ptr >*, std::vector<std::vector<std::__cxx11::basic_string > >&, bool): Assertion `defined_alleles.size() == (variant_ids[i].size()+1)' failed.
/var/spool/job55890972/slurm_script: line 30: 11280 Aborted

I have confirmed that the reference genome matches the variants in the VCF file, but the error persists.

Would you be able to assist me in understanding what is causing this problem? Could you please let me know if there are any specific requirements or formats for the VCF file that I have overlooked?

Thank you for your help.

Best, Hongbo

eblerjana commented 1 year ago

Hi Hongbo,

Which version of PanGenie are you using? It looks like there are a few issues with your input VCF. Variants must not overlap; overlapping variants need to be merged first (see README). I assume the assertion error comes from missing variant IDs for some variants. If there is an ID field in your input VCF, there needs to be one ID per allele.
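To narrow down which records trigger these two problems, a small pre-check over the VCF can help. The following is only a rough sketch, not part of PanGenie: the function name `check_vcf_records` is hypothetical, and it assumes that the input is sorted by position, that the IDs are stored in the INFO column as a comma-separated `ID=` entry, and that "one ID per allele" means one ID per ALT allele (which would be consistent with the assertion comparing the number of defined alleles to the number of IDs plus one).

```python
from typing import Iterable, List, Tuple


def check_vcf_records(lines: Iterable[str]) -> List[Tuple[str, int, str]]:
    """Report records that overlap an earlier record or whose number of IDs
    differs from the number of ALT alleles.

    Assumptions (not taken from PanGenie itself): records are sorted by
    position within each chromosome, and IDs live in INFO as 'ID=a,b,...'.
    """
    problems: List[Tuple[str, int, str]] = []
    last_end = {}  # chromosome -> furthest reference end seen so far (1-based, inclusive)

    for line in lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])
        ref, alt, info = fields[3], fields[4], fields[7]

        # A record overlaps if it starts inside the span of an earlier REF allele.
        end = pos + len(ref) - 1
        if chrom in last_end and pos <= last_end[chrom]:
            problems.append((chrom, pos, "overlaps a previous variant"))
        last_end[chrom] = max(end, last_end.get(chrom, 0))

        # If an ID entry is present, compare its length to the number of ALT alleles.
        n_alt = len(alt.split(","))
        id_entries = [e[3:] for e in info.split(";") if e.startswith("ID=")]
        if id_entries:
            n_ids = len(id_entries[0].split(","))
            if n_ids != n_alt:
                problems.append((chrom, pos, f"{n_ids} ID(s) for {n_alt} ALT allele(s)"))

    return problems
```

It could be run on an uncompressed file with something like `check_vcf_records(open("input.vcf"))`, where `input.vcf` is a placeholder name. Any record it reports would be worth inspecting before merging the variants as described in the README.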

If you like, you can send me your input VCF to this email: ebler(at)hhu.de, then I can have a closer look.

Best, Jana