Closed martingraham closed 6 years ago
Ah, it looks like chains A and B are of length 568 and 565 in 5UJB, so I reckon it's not NGL's fault.
The trouble is that I'm asking one of RCSB's web services for the best match against an HSA sequence I have, and it returns 5UJB as the top match. 1AO6 actually has slightly longer chains, but it falls into 2nd place because the sequence it has (SEQRES) is slightly shorter, even though the chains in 5UJB cover less of the sequence. I guess I'll go and figure it out from here and complain to RCSB if I can't get round it :-)
I should read up on this
Which service at rcsb did you use? Feel free to send a message on https://www.rcsb.org/pages/contactus. Mapping of sequences to PDB entries is often not straightforward. There is the SIFTS project (http://www.ebi.ac.uk/pdbe/docs/sifts/overview.html) which provides up-to-date mappings between sequences of different resources.
It's the BlastPDB service
When I run it, it returns a large number of hits ordered by score, of which 5UJB is top and 1AO6, say, is towards the middle. Both these PDB entries contain the input sequence exactly, but 5UJB has some extra residues at the start which seem to give it a slightly higher score (the difference in scores isn't massive). The trouble is that the chains in 5UJB aren't as long as the ones in 1AO6, so it's actually a slightly worse PDB entry to use for my purposes, but there's no way to tell this from the returned data. Like you say, I'll fire this in as a question to RCSB.
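For what it's worth, one way to work around this on the client side might be to re-rank the hits by how many residues are actually modelled in the chain, and only fall back on the BLAST score to break ties. This is a rough sketch, not anything the BlastPDB service does: the `Hit` type, its field names, and all the IDs and numbers below are made up for illustration, and the observed length would have to be computed separately from each structure.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """Hypothetical record for one search hit (not a real RCSB type)."""
    pdb_id: str
    blast_score: float
    observed_length: int  # residues with coordinates, counted from the structure


def rerank(hits: List[Hit]) -> List[Hit]:
    # Prefer longer observed chains; use the BLAST score only as a tiebreaker.
    return sorted(hits, key=lambda h: (h.observed_length, h.blast_score), reverse=True)


# Invented example values: HIT1 scores higher but has the shorter modelled chain.
hits = [
    Hit("HIT1", blast_score=1200.0, observed_length=560),
    Hit("HIT2", blast_score=1190.0, observed_length=575),
]
reranked_ids = [h.pdb_id for h in rerank(hits)]
print(reranked_ids)
```

Whether observed length is the right criterion depends on the use case; coverage of the specific query region might be a better key.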
Sounds good to ask them, thanks.
Hi Alex, when I load PDB ID 5UJB into NGL, I get 2 chains, one of length 568 (A) and one of length 565 (B).
However, the FASTA file for the PDB entry swears blind that both chains have 604 residues in them.
Looking at the PDB file in a text editor, I'm suspicious that it starts numbering its residues at a negative index (-23). Is this legal for PDBs, and if it is, could it be causing a problem for NGL?
https://www.rcsb.org/structure/5UJB
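A quick way to check this kind of mismatch independently of NGL is to compare the residue count declared in the SEQRES records with the residues that actually have ATOM records. Negative residue numbers are allowed by the PDB format (the residue number field is just a signed integer), so they shouldn't be a problem in themselves. The sketch below assumes the legacy fixed-column PDB format and a single model; the function names and the tiny sample are made up for illustration and are not taken from 5UJB.

```python
def seqres_lengths(pdb_text):
    """Declared residues per chain, read from the numRes field of SEQRES."""
    lengths = {}
    for line in pdb_text.splitlines():
        if line.startswith("SEQRES"):
            lengths[line[11]] = int(line[13:17])  # chain: col 12, numRes: cols 14-17
    return lengths


def observed_residues(pdb_text):
    """Residues per chain that have ATOM records (first model only)."""
    chains = {}
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # ignore any further models
        if not line.startswith("ATOM  "):
            continue
        # resSeq: cols 23-26 (may be negative), insertion code: col 27
        key = (int(line[22:26]), line[26:27].strip())
        residues = chains.setdefault(line[21], [])
        if not residues or residues[-1] != key:
            residues.append(key)
    return chains


def atom_line(serial, name, res_name, chain, res_seq):
    # Helper that builds a column-correct (truncated) ATOM line for the sample.
    return "ATOM  %5d %-4s %3s %s%4d    " % (serial, name, res_name, chain, res_seq)


# Synthetic sample: 5 residues declared, only 3 modelled, numbering starts at -1.
sample = "\n".join([
    "SEQRES %3d %s %4d  %s" % (1, "A", 5, "MET LYS TRP VAL THR"),
    atom_line(1, "N", "LYS", "A", -1),
    atom_line(2, "CA", "LYS", "A", -1),
    atom_line(3, "N", "TRP", "A", 0),
    atom_line(4, "N", "VAL", "A", 1),
])

declared = seqres_lengths(sample)
observed = observed_residues(sample)
for chain, residues in observed.items():
    print(chain, "declared:", declared.get(chain), "observed:", len(residues),
          "first residue number:", residues[0][0])
```

If the declared count is larger than the observed count, the difference would be residues that are in the deposited sequence but have no coordinates, which is one plausible reason a FASTA file and a viewer could disagree.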