Closed bbimber closed 2 years ago
Also, have you considered rules to parse the constant genes? The data are available:
http://www.imgt.org/genedb/GENElect?query=7.14+TRAC&species=Macaca+mulatta
http://www.imgt.org/genedb/GENElect?query=7.14+TRBC&species=Macaca+mulatta
http://www.imgt.org/genedb/GENElect?query=7.14+TRDC&species=Macaca+mulatta
The one gotcha appears to be that repseqIo fromPaddedFasta does not accept FASTA input with multiple entries per gene. Since we really only care about EX1, perhaps IMGT's API would support another filter?
http://www.imgt.org/genedb/GENElect?query=7.14+TRAC&species=Macaca+mulatta&IMGTlabel=EX1
"IMGTlabel=EX1" works on other IMGT actions, but not this one. I can't find documentation on their APIs, or another functional query page to inspect, to discover whether we can filter on feature here. Have you tried anything like this before?
For example (not completely working):
{
  "taxonId": 9544,
  "speciesNames": [
    "rhesus_monkey",
    "macaca_mulatta"
  ],
  "rules": [
    {
      "ruleType": "import",
      "output": "output/rhesus_monkey_C_TRA",
      "geneType": "C",
      "chain": "TRA",
      "anchorPoints": [
        {
          "point": "CBegin",
          "position": 0
        },
        {
          "point": "CExon1End",
          "position": -1
        }
      ],
      "sources": [
        "http://www.imgt.org/genedb/GENElect?query=7.14+TRAC&species=Macaca+mulatta&IMGTlabel=EX1"
      ]
    }
  ]
}
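If the IMGT query cannot filter on EX1 server-side, one possible workaround for the multiple-entries-per-gene problem is to filter the downloaded FASTA locally before import. The following is only a minimal sketch: the helper names `parse_fasta` and `keep_label` are hypothetical, the input is assumed to be FASTA text already extracted from the IMGT page, and it assumes that the IMGT label (EX1, EX2, ...) is the fifth pipe-separated field of each header, which should be checked against a real download.

```python
from typing import Iterator, List, Tuple

# Assumption: the IMGT label (e.g. EX1) is the 5th "|"-separated header field.
LABEL_FIELD = 4


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA text; headers lose the '>'."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines are joined as-is, so padding dots are preserved.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def keep_label(text: str, label: str = "EX1") -> str:
    """Return FASTA text containing only records whose label field equals `label`."""
    kept = []
    for header, seq in parse_fasta(text):
        fields = header.split("|")
        if len(fields) > LABEL_FIELD and fields[LABEL_FIELD].strip() == label:
            kept.append(f">{header}\n{seq}")
    return "\n".join(kept) + ("\n" if kept else "")
```

The filtered text would leave at most one EX1 record per allele. Whether the import rule's `sources` list accepts a local file in place of the URL is not confirmed here.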
@dbolotin @PoslavskySV: IMGT recently released TRG for rhesus macaque. Supporting download of these data from IMGT looks pretty simple. Does this seem like a reasonable addition? I copied the anchor points from TRBV. In human, the TRBV and TRGV anchor points were identical, so I assume that's just how IMGT formats them.