Isoris closed this issue 7 months ago
Set things up so that the number of haplotypes you give to PGGB matches the number of haploid copies you expect to see in your summaries. You'll need to investigate each assembly to see whether it's haploid (collapsed) or diploid.
I also suggest renaming the FASTA sequences to follow the PanSN format (sample#haplotype#contig). If the assemblies are haploid collapsed, then you'd have something like fish1#1#accession for each sequence.
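As a rough illustration of the renaming suggested above, here is a minimal Python sketch that rewrites FASTA headers as sample#haplotype#contig and counts the distinct sample/haplotype pairs. The function names, the sample label `fish1`, and the accession `CM000001.1` are hypothetical placeholders, not part of PGGB, and the sketch assumes the first whitespace-delimited token of each header is the sequence accession.

```python
def pansn_name(sample, haplotype, contig, delim="#"):
    """Build a PanSN-style name: sample#haplotype#contig."""
    return f"{sample}{delim}{haplotype}{delim}{contig}"


def rename_fasta_lines(lines, sample, haplotype=1):
    """Yield FASTA lines with each header rewritten in PanSN style.

    Assumption: the first whitespace-delimited token after '>' is the
    accession; the rest of the header (the description) is dropped.
    """
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            contig = fields[0] if fields else ""
            yield ">" + pansn_name(sample, haplotype, contig) + "\n"
        else:
            yield line  # sequence lines pass through unchanged


def count_haplotypes(names, delim="#"):
    """Count distinct (sample, haplotype) pairs among PanSN-style names."""
    return len({tuple(name.split(delim)[:2]) for name in names})


# Hypothetical example: one collapsed (haploid) assembly labelled "fish1".
example = [">CM000001.1 placeholder catfish chromosome 1\n", "ACGT\n"]
renamed = list(rename_fasta_lines(example, "fish1"))
print("".join(renamed), end="")
```

With 18 collapsed assemblies named fish1#1#... through fish18#1#..., `count_haplotypes` would return 18, which is the number of haploid copies you would expect in that case.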
Thank you Eric.
Hello,
Thanks for making PGGB. I have a question: I am running an analysis and would like to align 18 species of catfish. They are diploid individuals; however, when downloading the GCA assemblies from GenBank with curl, we get only around 28+ chromosomes and anywhere from about 47 to 300+ scaffolds, depending on assembly quality. I would like to know whether these genomes are haploid representations of the diploid genome, i.e., if the assemblies are primary, whether the two haplotypes are collapsed. If so, should PGGB be run with haplotype = 1?