Closed sillitoe closed 7 years ago
PatternQuery isn't open source at the moment (wasn't really my decision at the time, but eventually I think I will put the entire WebChemistry thing on GitHub as well) and is actually written in C#. The query language implementation in LiteMol only uses the ideas from PQ.
The queries currently available in LiteMol are here.
What you are asking for can be accomplished using the query
sequence('1' /* entity id */, 'B' /* asymId */, { seqNumber: 15 }, { seqNumber: 200, insCode: 'A' })
Thinking about it, this might cause some trouble, because there is sometimes a conflict between _atom_site.label_asym_id and _atom_site.auth_asym_id in mmCIF, so the signature of sequence for the chain should also have the option to be an object instead of just a string. I will fix this.
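A minimal sketch of what "an object instead of just a string" could look like for the chain argument. The names ChainSelector, ChainRecord, normalizeChain and matchesChain are hypothetical and are not part of the LiteMol API; treating a bare string as the label asymId is also an assumption, based on the /* asymId */ comment in the query above.

```typescript
// Hypothetical types for illustration only; not the LiteMol API.
type ChainSelector = string | { asymId?: string; authAsymId?: string };

interface ChainRecord {
    asymId: string;      // corresponds to _atom_site.label_asym_id
    authAsymId: string;  // corresponds to _atom_site.auth_asym_id
}

// Assumption: a bare string keeps the current meaning and refers to the label asymId.
function normalizeChain(sel: ChainSelector): { asymId?: string; authAsymId?: string } {
    return typeof sel === 'string' ? { asymId: sel } : sel;
}

// A chain matches when every id given in the selector agrees with the record.
function matchesChain(chain: ChainRecord, sel: ChainSelector): boolean {
    const s = normalizeChain(sel);
    if (s.asymId !== undefined && s.asymId !== chain.asymId) return false;
    if (s.authAsymId !== undefined && s.authAsymId !== chain.authAsymId) return false;
    return true;
}
```

With a selector like this, a caller who only knows the author chain id could pass { authAsymId: 'B' } while existing string-based calls keep working.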
I am also planning to add code completion for the queries to LiteMol (another thing on the TODO list :) )
ResidueIdRange(chain, a, b) from PatternQuery takes a chain id and two numbers that define an interval of residue sequence numbers (the _atom_site.auth_seq_id). So it would be equivalent to sequence('1', chain, { seqNumber: a }, { seqNumber: b }) with no ability to specify an insertion code.
A limited set of queries is also available in the CoordinateServer (or the same thing running at PDBe, but not always the latest version) which is running LiteMol code using Node.js. The set of queries on CoordinateServer can be easily extended and you can somewhat easily run it on your own data (it will be open sourced soon as well).
Edit:
Improving the queries in the "original" PatternQuery is also an option and I might do the improved version of the sequence query there as well.
And sorry, I got a little carried away in the response and forgot to answer what you were actually asking about.
It should be easy to add support for the "index" based selection (at least using 0 based indices) as this is something that is included in the internal LiteMol representation. I will think about it a bit more and include it.
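To illustrate how an id-based range and a 0-based index range relate, here is a rough sketch over a plain array. It assumes the residues of a single chain are stored in file order; ResidueId, buildIndex, selectByIndexRange and selectByIdRange are made-up names for this example and do not reflect LiteMol's internal representation.

```typescript
// Hypothetical data model for illustration; not LiteMol's internal representation.
interface ResidueId { seqNumber: number; insCode?: string }

function residueKey(id: ResidueId): string {
    return id.seqNumber + ':' + (id.insCode || '');
}

// Id-to-index lookup for the residues of one chain, in file order.
function buildIndex(residues: ResidueId[]): { [key: string]: number } {
    const index: { [key: string]: number } = {};
    residues.forEach((r, i) => { index[residueKey(r)] = i; });
    return index;
}

// Select by 0-based indices, inclusive on both ends.
function selectByIndexRange<T>(residues: T[], start: number, end: number): T[] {
    return residues.slice(start, end + 1);
}

// Select by residue ids: translate both ids to indices, then reuse the index-based selection.
function selectByIdRange(residues: ResidueId[], start: ResidueId, end: ResidueId): ResidueId[] {
    const index = buildIndex(residues);
    const i = index[residueKey(start)];
    const j = index[residueKey(end)];
    if (i === undefined || j === undefined || i > j) return [];
    return selectByIndexRange(residues, i, j);
}
```

The point of the sketch is that the id-based selection is just the index-based one plus a lookup, so exposing either the lookup or both selection styles would cover the use case in the original question.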
Thanks.
The queries currently available in LiteMol are here.
That's a really useful link for documentation - are there plans to autogenerate API docs?
sequence('1' /* entity id */, 'B' /* asymId */, { seqNumber: 15 }, { seqNumber: 200, insCode: 'A' })
Excellent, thanks
Yes, there are plans for autogenerated docs.
I am hoping that writing JSdoc comments and running JSdoc on the generated JavaScript will work. Haven't found anything useful to generate docs automatically from TypeScript yet.
If this does not work, there will definitely be at least some documentation in the style of PatternQuery for the query language.
I did have a quick look at these, but tsdoc has had no activity for 3 years now and the language has changed since.
I tried the demo (default theme page) for typedoc and it was not working so looked no further. But somehow I missed the typedoc.org page which looks maintained so I will give it a shot and see how it works.
On a "side note": the sequence annotation on mouse hover is already available in the Viewer app (accessed through the elements in the red rectangles).
The source code for that is here, in case you would like to include it in CATH ;)
fwiw - looks like TypeDoc is maintained by the same people who wrote the TypeScript package for Atom (which seems to work well for me).
Thanks - I'm currently trying to see if I can get our superfamily superpositions to work (all non-redundant domains for a given superfamily). Figure if this plugin can display millions of atoms, it should be able to show ~100 structures on top of each other.
http://sillitoe.cathdb.info/superfamily?sfam_id=1.10.8.10
[NB: that's a test server - link probably won't work beyond today...]
Currently I'm loading previously superposed PDB files from my server - I figured it might be a better idea to select just the domain coordinates from PDBe server and apply the appropriate transformation...
Yes, 100s of structures should not be a problem in general.
It might be interesting to create a bundle from these structures and send them all in a single response, packed in BinaryCIF. Might be an interesting feature for the CoordinateServer actually. The query would look something like /1tqn,1cbs,1jj2/cartoons?encoding=bcif and would return a result with multiple data blocks, sending only the atoms required for cartoon representation.
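For concreteness, a small sketch of how a client might assemble such a bundle query. The multi-structure endpoint is only a proposal at this point, so buildBundleQuery is a hypothetical helper and the URL layout simply mirrors the example above.

```typescript
// Hypothetical helper; the multi-structure endpoint is only proposed in this thread.
function buildBundleQuery(ids: string[], query: string, encoding: string): string {
    if (ids.length === 0) throw new Error('At least one structure id is required.');
    // Normalize ids to trimmed lower case (an assumption of this sketch).
    const idList = ids.map(id => encodeURIComponent(id.trim().toLowerCase())).join(',');
    return '/' + idList + '/' + encodeURIComponent(query) + '?encoding=' + encodeURIComponent(encoding);
}

const bundleUrl = buildBundleQuery(['1tqn', '1cbs', '1jj2'], 'cartoons', 'bcif');
```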
Loading the originals and applying the transforms should not be very hard either, I would add a transform that would take the original model and copy just the transformed positions, reusing all the other data, making it very memory efficient as well (I am not too keen on mutating data in the LiteMol state).
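A rough sketch of the "copy just the transformed positions" idea. The Model interface and transformModel are invented for this example (LiteMol's real data structures differ), and the row-major 4x4 matrix layout is an assumption.

```typescript
// Hypothetical model type for illustration; LiteMol's real data structures differ.
interface Model {
    id: string;
    atomNames: string[];     // shared between the original and the transformed model
    positions: Float32Array; // x0, y0, z0, x1, y1, z1, ...
}

// m is a 4x4 matrix in row-major order (an assumption of this sketch).
function transformModel(model: Model, m: number[]): Model {
    if (m.length !== 16) throw new Error('Expected a 4x4 matrix with 16 elements.');
    const src = model.positions;
    const out = new Float32Array(src.length);
    for (let i = 0; i < src.length; i += 3) {
        const x = src[i], y = src[i + 1], z = src[i + 2];
        out[i]     = m[0] * x + m[1] * y + m[2]  * z + m[3];
        out[i + 1] = m[4] * x + m[5] * y + m[6]  * z + m[7];
        out[i + 2] = m[8] * x + m[9] * y + m[10] * z + m[11];
    }
    // New object with new positions; everything else is reused by reference.
    return { id: model.id, atomNames: model.atomNames, positions: out };
}
```

Only the coordinate array is allocated per transformed copy, and the original model is never mutated.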
Thoughts?
Also looking at your app, I think I will add the ID of the highlighted molecule to the highlight label.
And I've also just remembered that there is a limitation of 255 structures for which the highlight will work simultaneously. If this is an issue, I have ways of fixing this.
Sorry, meetings.
Yup, this was meant to be a simple proof of concept - using the local data we already have available. Figured it will be good to use this as an excuse to get stuck into LiteMol (I think we may already be semi-official collaborators on other projects :)
Moving over to CoordinateServer sounds like it would potentially make the tool much faster and more portable (which would be great).
Realistically, it will probably take me a while to get my head around customising the dashboard, let alone applying matrix transforms/rotations (I enjoy working on front-end stuff, but I don't get much time to do it).
Maybe I should start up a separate GitHub project - could act as a "how to implement your own application" tutorial. Might make it easier for you to point me in the right direction when I'm floundering...
What I propose would not require new UI elements. And applying a matrix transform to a molecule is something that should be a part of the core anyway.
Let me make an example app that downloads a bunch of structures from the coordinate server and applies a bunch of semi-random transforms to them (I actually have an implementation of the quaternion superposition algorithm in TypeScript, so I will use something like the first 10 C-alphas for each structure to superpose them; it would be quite cool to do the domain superpositions directly on the client as well ... should not be that hard if it's just 100s of structures).
From your API, you can then just serve some JSON that contains a list of PDB ids and the corresponding transformation.
As a next step, I could then add the ability to query multiple structures at the same time using the CoordServer. This would actually be a very nice use case for it and BinaryCIF and would look very good in a publication.
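One possible shape for that JSON, sketched with a small validating parser. The field names pdbId and transform, and the choice of 16 numbers per transform, are assumptions for illustration; no format was agreed on in this thread.

```typescript
// Hypothetical payload shape; the field names are assumptions of this sketch.
interface SuperpositionEntry {
    pdbId: string;
    transform: number[]; // 16 numbers describing a 4x4 matrix
}

function parseSuperpositions(json: string): SuperpositionEntry[] {
    const data = JSON.parse(json);
    if (!Array.isArray(data)) throw new Error('Expected a JSON array.');
    return data.map((e: any, i: number) => {
        const t = e && e.transform;
        const ok = !!e && typeof e.pdbId === 'string' && Array.isArray(t) && t.length === 16
            && t.every((v: any) => typeof v === 'number' && isFinite(v));
        if (!ok) throw new Error('Invalid entry at position ' + i);
        return { pdbId: e.pdbId as string, transform: t as number[] };
    });
}

const example = '[{"pdbId":"1cbs","transform":[1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1]}]';
const entries = parseSuperpositions(example);
```

Validating on the client keeps a malformed entry from silently producing a misplaced structure.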
Sounds fantastic, many thanks.
We should probably move this discussion to a new GH issue :)
On Fri, 11 Nov 2016, 20:20 David Sehnal, notifications@github.com wrote:
What I propose would not require new UI elements. And applying a matrix transform to a molecule is something that should be a part of the core anyway.
Let me make an example app that downloads a bunch of structures from the coordinate server and applies a bunch of semi-random transforms to them (I actually have an implementation of the quaternion superposition algorithm in TypeScript, so I will use something like the first 10 C-alphas for each structure to superpose them; it would be quite cool to do the domain superpositions directly on the client as well ... should not be that hard if it's just 100s of structures).
From your API, you can then just serve some JSON that contains a list of PDB ids and the corresponding transformation.
As a next step, I could then add the ability to query multiple structures at the same time using the CoordServer. This would actually be a very nice use case for it and BinaryCIF and would look very good in a publication.
I've added the Transforms example. And yes, we should probably move this conversation elsewhere :)
Nice! Great work.
Useful to see the ligands in there too.
Will need to do some work to get our backend providing these transforms.
I'll create some GH issues for documentation...
(this is an issue with PatternQuery rather than LiteMol, but I can't see a separate repo and I'm guessing they overlap...)
I would like to load a restricted set of residues based on structural domain boundaries (where a structural domain can consist of 1 or more regions of protein structure).
The PatternQuery documentation has a couple of functions that look useful:
Select individual residues by id:
Select range of residues by index:
[NB: should this be 'ResidueIndexRange'??]
Either way - in the latter case - I presume the "index" coordinate system is based on the sequential ordering of ATOM records (or mmCIF equivalent), starting from 1? If so, this seems to be more useful as an internal coordinate system rather than for an external user (unless you can access the id to index lookup).
It would be really useful to be able to select a range based on a start and stop residue id as well as index.
e.g. selecting all residues between residues '15' and '200, insert code: A' in chain B...
something like:
or