Open boyangzhao opened 3 years ago
Hi, yes, the `hla_read_alignment` function should process the gen files. The "cds" term is a bad naming convention that dates from when I only used the nuc files. Thanks!
Thanks! In addition to `hla_read_alignment`, which processes the gen files, do the method `hla_compile_index` and the script `make_index_files.R` work on gen files? I presume these are more tailored toward nuc files? I'm also interested in generating a complete nuc FASTA by filling in the missing regions with the closest alleles at the genomic DNA level.
Correct, those are intended to be used with nuc files. In principle, you could tweak the code of hla_compile_index so it reads gen files with hla_read_alignment, but that was never tested, and I cannot foresee the possible problems. If you want to use gen files to compute distances across alleles, you probably will need to better model large insertions and deletions in the introns.
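For readers following the thread, here is a minimal Python sketch of the fill-in idea discussed above. It is not hlaseqlib code and does not reflect how `hla_compile_index` works internally. It assumes sequences are already aligned to equal length, that `*` marks unsequenced positions and `.` marks alignment gaps (the IMGT alignment convention), and the names `distance` and `fill_unknown` are hypothetical. To soften the effect of large intronic indels, a run of consecutive gap-versus-base positions is counted as a single difference, which is only one of several possible ways to model indels.

```python
UNKNOWN = "*"  # assumed marker for unsequenced positions
GAP = "."      # assumed marker for alignment gaps (indels)


def distance(a: str, b: str) -> int:
    """Count differences over positions that are known in both sequences.

    A run of consecutive gap-vs-base positions counts as one event, so a
    long intronic indel does not dominate the distance.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = 0
    in_gap_run = False
    for x, y in zip(a, b):
        if x == UNKNOWN or y == UNKNOWN or x == y:
            in_gap_run = False
            continue
        if x == GAP or y == GAP:
            if not in_gap_run:
                diffs += 1  # first position of a new indel event
            in_gap_run = True
        else:
            diffs += 1      # ordinary substitution
            in_gap_run = False
    return diffs


def fill_unknown(target: str, references: dict[str, str]) -> tuple[str, str]:
    """Fill unknown positions in `target` from the closest complete reference.

    Returns the donor allele name and the filled sequence.
    """
    donors = {name: seq for name, seq in references.items() if UNKNOWN not in seq}
    if not donors:
        raise ValueError("no complete reference sequence available")
    # Ties are broken by allele name so the result is deterministic.
    donor_name = min(donors, key=lambda name: (distance(target, donors[name]), name))
    donor = donors[donor_name]
    filled = "".join(d if t == UNKNOWN else t for t, d in zip(target, donor))
    return donor_name, filled


# Toy data, invented for illustration only.
refs = {"A*01": "ATGC....GGTA", "A*02": "ATGCAAAAGGTT"}
donor_name, filled = fill_unknown("****....GGTA", refs)
print(donor_name, filled)
```

The same two functions could be applied per exon or per intron rather than to the whole gene, which would let the donor allele differ between regions.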
Hi - I stumbled upon your tool HLApers and the associated library hlaseqlib. I find the method `hla_read_alignment` quite interesting since, as you know, the IMGT database does not contain the complete sequences. I was wondering if this method can be used for processing genomic sequences (the _gen files)? I see it mentions `cds` and don't know how hard-coded it is toward cds sequences only. I do see `hla_read_alignment` has the ability to process either nuc or gen files.