bluenote-1577 / flopp

flopp is a software package for single individual haplotype phasing of polyploid organisms from long read sequencing.

How to create phased references in FASTA format? #13

amvarani closed this issue 11 months ago

amvarani commented 11 months ago

Hi there, I appreciate your effort in creating Flopp. I have a very practical question I'd like to ask: How can I create phased references in FASTA format using your tool?

Best wishes,

bluenote-1577 commented 11 months ago

Hi @amvarani,

To create a phased fasta, you have two choices.

  1. You can replace bases in the reference FASTA using flopp's haplotype output together with the VCF information. For example, if one haplotype has all of its alleles labeled as 1, you replace the reference base with the alternate allele at every position specified in your VCF.

  2. Maybe the better option is to assemble each group of read partitions output by flopp. Flopp can output groups of reads in each haplotype, and you can assemble each of these separately to get a new set of references.
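A minimal sketch of option 1 in Python, under stated assumptions: it handles SNPs only, and it expects the VCF records and the per-haplotype allele labels to be parsed already into plain Python lists. The function name `apply_haplotype`, the tuple layout of `variants`, and the use of `-1` for an unphased site are hypothetical choices for this illustration, not flopp's actual output format.

```python
def apply_haplotype(reference, variants, alleles):
    """Return `reference` with one haplotype's SNP alleles substituted in.

    reference: sequence of a single contig as a string
    variants:  list of (pos, ref, alts); pos is 1-based as in VCF,
               alts is the list of alternate bases at that site
    alleles:   one integer per variant; 0 = reference, k >= 1 = k-th alternate,
               -1 = unphased (assumption: such sites keep the reference base)
    """
    if len(variants) != len(alleles):
        raise ValueError("need exactly one allele label per variant")
    seq = list(reference)
    for (pos, ref, alts), allele in zip(variants, alleles):
        idx = pos - 1  # VCF positions are 1-based, Python strings are 0-based
        if len(ref) != 1 or seq[idx].upper() != ref.upper():
            raise ValueError(f"REF mismatch or non-SNP record at position {pos}")
        if allele <= 0:
            continue  # reference allele or unphased site: leave the base alone
        alt = alts[allele - 1]
        if len(alt) != 1:
            raise ValueError(f"only SNPs are supported (position {pos})")
        seq[idx] = alt
    return "".join(seq)


# Toy example: one contig, three SNP sites, four haplotypes.
reference = "ACGTACGTAC"
variants = [(2, "C", ["T"]), (5, "A", ["G", "C"]), (9, "A", ["T"])]
haplotypes = [[0, 1, 0], [1, 2, 0], [1, 0, 1], [0, 0, -1]]

phased = [apply_haplotype(reference, variants, h) for h in haplotypes]
for i, seq in enumerate(phased):
    print(f">contig1_hap{i}\n{seq}")
```

Indels would shift downstream coordinates, so they need more care than this substitution-only approach; that is one reason the assembly route in option 2 may be preferable.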

Thanks,

Jim

amvarani commented 11 months ago

Thanks @bluenote-1577

It seems very simple: use the get_bam_partition.py script to pick out each BAM, and then pbindex and bam2fastq to retrieve the FASTQ files.

Best

Alessandro