human-pangenomics / HPP_Year1_Assemblies

Assemblies from HPP Year 1 production

Finding human chromosomes in Pangenome assemblies #5

Open tomateba opened 1 year ago

tomateba commented 1 year ago

Hi, we recently downloaded the Pangenome assemblies in the AGC format, and we extracted individual assemblies in fasta format. We noticed that contig names do not match human chromosome names. Are we missing something? Could you please help us to associate fasta records in individual assemblies to human chromosomes?

Thank you!

ekg commented 1 year ago

You will need to align the contigs to scaffolded reference assemblies, such as CHM13, GRCh38, and HG002, to derive a chromosome assignment.

Some regions, like the PARs and the acrocentric PHRs, will be ambiguous.
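As a rough illustration of the alignment-based assignment described above, here is a minimal sketch that reads PAF records (the tab-separated format written by aligners such as minimap2) and assigns each contig to the reference sequence that collects most of its matching bases. The function name `assign_contigs`, the 0.8 threshold, and the contig names in the example are assumptions made for illustration, not part of any HPRC tooling.

```python
from collections import defaultdict

def assign_contigs(paf_lines, min_fraction=0.8):
    """Assign each contig to the reference sequence holding most of its matched bases.

    Returns {contig: (assignment, fraction)}; the assignment is "ambiguous"
    when the best reference sequence holds less than min_fraction of the matches.
    """
    matches = defaultdict(lambda: defaultdict(int))
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:  # PAF has 12 mandatory columns
            continue
        # PAF columns: 1 = query name, 6 = target name, 10 = matching bases
        contig, target, n_match = fields[0], fields[5], int(fields[9])
        matches[contig][target] += n_match

    assignment = {}
    for contig, per_target in matches.items():
        total = sum(per_target.values())
        if total == 0:
            continue
        best, best_bases = max(per_target.items(), key=lambda kv: kv[1])
        fraction = best_bases / total
        label = best if fraction >= min_fraction else "ambiguous"
        assignment[contig] = (label, round(fraction, 3))
    return assignment

# Hypothetical records: contigA maps almost entirely to chr1,
# contigB is split between chrX and chrY (as a PAR contig might be).
example = [
    "contigA\t1000\t0\t900\t+\tchr1\t5000\t100\t1000\t850\t900\t60",
    "contigA\t1000\t900\t1000\t+\tchr2\t5000\t0\t100\t50\t100\t10",
    "contigB\t1000\t0\t400\t+\tchrX\t5000\t0\t400\t400\t400\t5",
    "contigB\t1000\t400\t1000\t+\tchrY\t5000\t0\t600\t600\t600\t5",
]
result = assign_contigs(example)
print(result)
```

A PAF file for this could come from something like `minimap2 -x asm5 reference.fa assembly.fa > aln.paf`, where both file names are placeholders. Contigs reported as "ambiguous" are the ones to inspect by hand.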


diekhans commented 1 year ago

You can use the HPRC UCSC assembly browser, which has chain alignment files:

https://hprc-browser.ucsc.edu/
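If you go the chain-file route, a minimal sketch like the one below could summarize which chromosome each contig chains to. It only reads the chain header lines (`chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id`) and sums scores per contig and chromosome pair. The function name `assign_from_chains` and the example records are made up. Whether the contig is on the target or the query side of the HPRC chain files is not stated in this thread, so it is a parameter here.

```python
from collections import defaultdict

def assign_from_chains(chain_lines, contig_side="q"):
    """Pick, for each contig, the chromosome with the highest summed chain score.

    contig_side is "q" if contigs are the query sequences of the chains,
    or "t" if they are the target sequences (an assumption to verify per file).
    """
    totals = defaultdict(lambda: defaultdict(float))
    for line in chain_lines:
        if not line.startswith("chain "):
            continue  # skip alignment block lines and blank separators
        fields = line.split()
        score, t_name, q_name = float(fields[1]), fields[2], fields[7]
        if contig_side == "q":
            contig, chrom = q_name, t_name
        else:
            contig, chrom = t_name, q_name
        totals[contig][chrom] += score
    return {contig: max(per, key=per.get) for contig, per in totals.items()}

# Hypothetical chain records for a single contig.
example_chains = [
    "chain 5000 chr1 248387328 + 100 1100 contigA 1000 + 0 1000 1",
    "1000",
    "",
    "chain 300 chr2 242696752 + 500 600 contigA 1000 + 0 100 2",
    "100",
]
chain_result = assign_from_chains(example_chains)
print(chain_result)
```

Summed chain score is only a crude proxy for aligned sequence, so contigs with two close candidates deserve a closer look.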