Closed bbimber closed 2 years ago
I love IMGT 🤦‍♂️... I'll look into it.
Thanks. Two other questions/thoughts: I wrote a quick parser to convert the IMGT flatfiles to repseqio libraries. The flatfiles are a richer source than the padded FASTAs. When doing this, I based the sequence on the refseq source rather than a local FASTA, which is a little more robust because you also get flanking genomic data. I don't know how many species this applies to, but assuming IMGT is moderately consistent, I think the parser could be relatively general purpose. I did need to convert a little from IMGT's anchor point definitions to MiXCR's; however, that wasn't too difficult.
Hello,
If I understand these rules correctly, the V genes do not return L-PART, since the 7.14 query used by at least human and macaque relies on the padded FASTA, which doesn't include L1/L2. This type of URL will return the coordinates for L-REGION:
http://www.imgt.org/genedb/GENElect?query=8.1+TRAV&species=Macaca+mulatta&IMGTlabel=L-PART1
Is there a way to pull from multiple URL sources to generate a library?
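To make the question concrete, here is a minimal Python sketch of how several GENElect URLs could be composed for one gene group, one per IMGT label. The base URL and the `query`, `species`, and `IMGTlabel` parameters are taken from the URL above. The helper name `genelect_url` and the `L-PART2` label are my assumptions for illustration. It only builds the URLs; it does not fetch anything and says nothing about how the library builder would consume them.

```python
from urllib.parse import urlencode

# Base endpoint copied from the URL quoted above.
BASE_URL = "http://www.imgt.org/genedb/GENElect"


def genelect_url(query, species, label=None):
    """Build one GENElect URL (hypothetical helper, not part of repseqio)."""
    params = {"query": query, "species": species}
    if label is not None:
        params["IMGTlabel"] = label
    # urlencode encodes spaces as '+', matching the URL quoted above.
    return BASE_URL + "?" + urlencode(params)


# One URL per label; "L-PART2" is an assumed second label for illustration.
urls = [
    genelect_url("8.1 TRAV", "Macaca mulatta", label)
    for label in ("L-PART1", "L-PART2")
]

for url in urls:
    print(url)
```

The first URL produced this way is the same as the one quoted above, so a list like this could stand in for the "multiple URL sources" I am asking about.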