Open mzueva opened 6 months ago
Currently we fetch protein sequences from NCBI by sequence ID. For parasite targets we usually don't have NCBI data, so we need to get protein sequences from data registered locally.
Therefore, the approach is:
If a sequence's protein entry already has a field with amino acids (baseString), we don't have to call the server API; we already have enough information.
Example
"sequences": [
  {
    "protein": {
      "name": "MCP9258909.1",
      "length": 554,
      "baseString": "MANILKEIPRAVINIFQYTSIHVFTDGPPKHMQRQFLCETESDNVLDFCQIKRSPIKTMAIPKLELLAILIGVRAAQFVIKQLEFENAQVILWSDSRCALHWIQNHSRLLQRFIQNRVEEIRKAKFAYRYIPSECNPANIATKAISPSDLANLTLCDNEETIDEEREQVVVTAIQEATKTSIRFVDANRFSNWSRMVRTTGIQITPYEYEFAVELLLRQAQSEGLSVEEITKRNLYYVMGLWKFKGRLQFPSSGSCISYLTYLPRHNRITEIIIQTYHEKIHHGDIPHTISELRRLYWIPKERAEVKKKAKSFKLPPMPDYHDSRTVRSKIFARIGLDYLGPVTAKTEVGMAKKSFLTALRKFVARRDCPELILSDNASQFHLIYRTIKKQESQLSNFLTSKGIIWKYITQKAPWSGGIYERIVGITKGAFRKAVDEYLNSLRERTQIEHKSPRGAITRSPSLGLINEPHIPRGMWKLAKINKLNKSSDGNVRSVQIELPFGKLLNRQVNMLYPLEAEQEDQPEDSVTELMDAKDEEPIARRTQVQQRSYELQL"
    }
  }
]
If there are no proteins for the specified gene, but there are mRNA sequences without IDs
Example
{
  "geneId": "vbb35395.1",
  "reference": {
    "id": "34",
    "name": "GCA_900537255"
  },
  "sequences": [
    {
      "mrna": {
        "name": "vbb35395.1",
        "begin": 8,
        "end": 689,
        "strand": "NEGATIVE",
        "featureFileId": 34,
        "chromosomeId": 6223
      }
    },
    {
      "mrna": {
        "name": "vbb35395.1",
        "begin": 861,
        "end": 964,
        "strand": "NEGATIVE",
        "featureFileId": 34,
        "chromosomeId": 6223
      }
    }
  ]
}
In this case, we should use the POST /restapi/sequence/local endpoint with the following request body:
{
  "database": "PROTEIN",
  "referenceId": ?,
  "featureFileId": ?,
  "chromosomeId": ?,
  "begin": ?,
  "end": ?
}
where referenceId = reference.id, and featureFileId, chromosomeId, begin and end are copied from the fields of the same name in the mrna entry.
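The two cases above could be sketched roughly as follows. This is only an illustration in Python, not the actual implementation: the function names (`inline_proteins`, `local_sequence_requests`, `plan_protein_lookup`) are hypothetical, and building one request body per mrna entry is an assumption, since the issue does not say how multiple mrna entries for one gene should be combined. The sketch only builds the request bodies; it does not call the endpoint.

```python
from typing import Any


def inline_proteins(gene: dict[str, Any]) -> list[str]:
    """Return the amino-acid strings already stored in the gene record."""
    result = []
    for entry in gene.get("sequences", []):
        protein = entry.get("protein") or {}
        base_string = protein.get("baseString")
        if base_string:
            result.append(base_string)
    return result


def local_sequence_requests(gene: dict[str, Any]) -> list[dict[str, Any]]:
    """Build request bodies for POST /restapi/sequence/local.

    Assumption: one body per mrna entry; reference.id is passed through as-is.
    """
    reference_id = gene["reference"]["id"]
    bodies = []
    for entry in gene.get("sequences", []):
        mrna = entry.get("mrna")
        if mrna is None:
            continue
        bodies.append({
            "database": "PROTEIN",
            "referenceId": reference_id,
            "featureFileId": mrna["featureFileId"],
            "chromosomeId": mrna["chromosomeId"],
            "begin": mrna["begin"],
            "end": mrna["end"],
        })
    return bodies


def plan_protein_lookup(gene: dict[str, Any]) -> dict[str, Any]:
    """Prefer inline amino acids; otherwise fall back to the local endpoint."""
    proteins = inline_proteins(gene)
    if proteins:
        return {"source": "inline", "proteins": proteins}
    return {"source": "local", "requests": local_sequence_requests(gene)}


if __name__ == "__main__":
    gene = {
        "geneId": "vbb35395.1",
        "reference": {"id": "34", "name": "GCA_900537255"},
        "sequences": [
            {"mrna": {"name": "vbb35395.1", "begin": 8, "end": 689,
                      "strand": "NEGATIVE", "featureFileId": 34,
                      "chromosomeId": 6223}},
        ],
    }
    print(plan_protein_lookup(gene))
```

Keeping the decision in a single function (`plan_protein_lookup`) means the caller only has to check `source` to know whether a server call is needed at all.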