DecodeGenetics / graphtyper

Population-scale genotyping using pangenome graphs
http://dx.doi.org/10.1038/ng.3964
MIT License

Long-read assemblies #71

Open z1242099843 opened 3 years ago

z1242099843 commented 3 years ago

I'd like to include long-read assemblies when identifying structural variants. How do I get the SVS_VCF files for long-read assemblies?

z1242099843 commented 3 years ago

More precisely, how do I get a BAM file from the long-read assemblies?

hannespetur commented 3 years ago

Hello, if you have some assemblies in FASTA format, you can generate a VCF (not a BAM) using dipcall, see: https://github.com/lh3/dipcall

This is what we used in our SV paper.

Best, Hannes
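
For reference, a minimal sketch of what a dipcall run looks like, based on the usage shown in the dipcall README (please check the current README before relying on it). The helper `dipcall_commands` and all file names here are hypothetical; the function only builds the two command lines and does not execute anything.

```python
import shlex


def dipcall_commands(prefix, reference, hap1_fasta, hap2_fasta, kit_dir="dipcall.kit"):
    """Build the two shell commands for a dipcall run (nothing is executed).

    Assumption: dipcall is unpacked in `kit_dir` and is invoked as
    `run-dipcall <prefix> <ref.fa> <hap1.fa> <hap2.fa> > <prefix>.mak`,
    followed by `make -f <prefix>.mak`, as in the dipcall README.
    """
    q = shlex.quote
    makefile = f"{prefix}.mak"
    # Step 1: generate a Makefile describing the alignment and calling steps.
    generate = (
        f"{q(kit_dir + '/run-dipcall')} {q(prefix)} {q(reference)} "
        f"{q(hap1_fasta)} {q(hap2_fasta)} > {q(makefile)}"
    )
    # Step 2: run the Makefile; this is where the VCF and BAM files get written.
    execute = f"make -j2 -f {q(makefile)}"
    return [generate, execute]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in dipcall_commands("prefix", "ref.fa", "hap1.fa.gz", "hap2.fa.gz"):
        print(cmd)
```

With the prefix `prefix`, the run is expected to produce files such as prefix.dip.vcf.gz, prefix.pair.vcf.gz and prefix.hap1.bam, which are the files discussed in the rest of this thread.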

z1242099843 commented 3 years ago

Thank you, I got it. I ran dipcall and got two VCF files: prefix.pair.vcf.gz and prefix.dip.vcf.gz. I added prefix.dip.vcf.gz to the VCF list used by svimmer, but it fails with an AssertionError. Here is the error:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/public/home/yao/miniconda3/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/public/home/yao/miniconda3/lib/python3.7/multiprocessing/pool.py", line 47, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/public/home/yao/bin/svimmer", line 75, in append_svs_from_vcf
    svs.append(SV(record, check_type=not args.ignore_types, join_mode=args.join_mode, output_ids=args.ids))
  File "/public/home/yao/software/svimmer-0.1/sv.py", line 75, in __init__
    assert False
AssertionError
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/public/home/yao/bin/svimmer", line 156, in <module>
    results = pool.starmap(append_svs_from_vcf, zip(lines, itertools.repeat(chrom)), chunksize=1)
  File "/public/home/yao/miniconda3/lib/python3.7/multiprocessing/pool.py", line 276, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "/public/home/yao/miniconda3/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
AssertionError

Should I use a different VCF file?

z1242099843 commented 3 years ago

I also passed prefix.pair.vcf.gz to svimmer, but it fails with the same error.
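
When a merger trips over records like this, it can help to look at what kinds of records the VCF actually contains. Below is a rough sketch that counts records with an explicit SVTYPE INFO key, records with a symbolic ALT allele, and sequence-resolved alleles that look like long insertions or deletions. The categories, the 50 bp threshold, and the function names are assumptions made for illustration; this is not how svimmer itself classifies records.

```python
import gzip
from collections import Counter


def open_vcf(path):
    """Open a plain or gzip-compressed VCF as text."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path)


def classify(ref, alt, info, min_len=50):
    """Rough, hypothetical classification of one REF/ALT pair."""
    info_keys = {field.split("=", 1)[0] for field in info.split(";")}
    if "SVTYPE" in info_keys:
        return "has_svtype"
    if alt.startswith("<") or "[" in alt or "]" in alt:
        return "symbolic_or_breakend"
    diff = len(alt) - len(ref)
    if diff >= min_len:
        return "insertion_like"
    if -diff >= min_len:
        return "deletion_like"
    return "small_or_other"


def summarize(lines, min_len=50):
    """Count record categories over an iterable of VCF lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and empty lines
        fields = line.rstrip("\n").split("\t")
        ref, alts, info = fields[3], fields[4], fields[7]
        for alt in alts.split(","):  # handle multi-allelic records
            counts[classify(ref, alt, info, min_len)] += 1
    return counts


if __name__ == "__main__":
    import sys

    with open_vcf(sys.argv[1]) as handle:
        for category, n in summarize(handle).most_common():
            print(category, n)
```

Run as, for example, `python count_records.py prefix.dip.vcf.gz`, where the script name is hypothetical.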

hannespetur commented 3 years ago

svimmer has trouble determining the SV type of some records. I will have a look, but for now, use Python's "-O" option:

$ python -O svimmer ....
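
Python's -O option makes the interpreter skip `assert` statements, which is why the workaround gets past the `assert False` in sv.py. A small self-contained sketch of that behaviour follows; the snippet and the helper `run_snippet` are made up for illustration and are not svimmer code.

```python
import os
import subprocess
import sys

# A tiny stand-in for code that asserts on an unexpected record.
SNIPPET = "assert False, 'unknown SV type'\nprint('continued')"


def run_snippet(optimized):
    """Run SNIPPET in a fresh interpreter, with or without -O."""
    env = dict(os.environ)
    env.pop("PYTHONOPTIMIZE", None)  # make sure only the flag decides
    cmd = [sys.executable] + (["-O"] if optimized else []) + ["-c", SNIPPET]
    return subprocess.run(cmd, capture_output=True, text=True, env=env)


if __name__ == "__main__":
    normal = run_snippet(optimized=False)
    print("without -O: exit code", normal.returncode)  # assert fires
    fast = run_snippet(optimized=True)
    print("with -O:    exit code", fast.returncode)    # assert is skipped
```

Note that -O only silences the check: records that would have triggered the assertion are still processed, so it is worth inspecting the merged output for them.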

z1242099843 commented 3 years ago

Thank you! I have now added prefix.hap1.bam, which was produced by dipcall, to the list of BAM files for genotype_sv. However, prefix.hap1.bam is single-end, while the BAM files of the samples are paired-end. Does this affect the results of genotype_sv?