Closed fsimkovic closed 7 years ago
This is something I was thinking of adding in. I'm also going to write the code so that the terminate early option can be turned off. The lattice parameter search will effectively identify any structure that is already in the PDB which makes testing the contaminant and full run more difficult with known structures.
What about a partial model that might bypass this single step? Do you think this is possible and/or worth the trouble?
A partial model wouldn't work. However, modifying the lattice parameters in an MTZ file so that the correct solution isn't picked up could work, although this could also prevent further steps from working properly.
Allowing the code to bypass this step even if a solution has been found will be useful nonetheless for seeing how fast SIMBAD is on local machines.
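To make the behaviour being discussed concrete, here is a minimal sketch of a staged search with a switchable early-termination option. It is not SIMBAD's actual code: the `run_pipeline` function, the `Stage` type and the dummy stage callables are all hypothetical, and only the idea of an early-termination switch comes from this thread.

```python
from typing import Callable, List, Tuple

# Hypothetical stage type: a name plus a callable that returns True
# when that stage has found a solution.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage], early_term: bool = True) -> List[str]:
    """Run the stages in order and return the names of those executed."""
    executed = []
    for name, search in stages:
        executed.append(name)
        solved = search()
        if solved and early_term:
            # Stop as soon as a stage reports a solution.
            break
    return executed


# Dummy stages: the lattice search "solves" a structure already in the PDB.
stages = [
    ("lattice", lambda: True),
    ("contaminant", lambda: False),
    ("full", lambda: False),
]

print(run_pipeline(stages, early_term=True))   # ['lattice']
print(run_pipeline(stages, early_term=False))  # ['lattice', 'contaminant', 'full']
```

With early termination switched off, every stage runs even though the first one already found a solution, which is what timing a complete run on a local machine would need.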
Sure that makes sense.
@hlasimpk thinking about this, it might be a good idea to have examples for each of the three subparts, i.e.
simbad-lattice script
simbad-contaminant script
simbad-full script
simbad-create-lattice-db script
Then, explain in the docs that the simbad script only combines the individual parts but won't work reliably in the future ...
@fsimkovic making the examples for simbad-lattice and simbad-contaminant will be straightforward. simbad-full will be a little more complicated because, if the lattice search is up to date, it will solve the structure. I guess we could give the full run the -early_term False flag and explain that this flag isn't required for an actual run.
Do we need an example for the simbad-create-lattice-db script? It's very simple to run.
@hlasimpk I thought the simbad-full script would be specific to running all PDB structures against your data, i.e. the third step in the simbad script?!
I think we should include a short page on simbad-create-lattice-db, just to explain what it is, what it does, why it is important and how you can create a manual database if you don't want to update the one in CCP4. Also, linked to the last point, people might use a CCP4 version on a network-managed system, where they either do not have permission to write to the database directory or cannot update it without interfering with others' work. These people might need help on how to set up their own database ...
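As a rough illustration of the network-managed case, the sketch below picks a user-owned location for the database when the shared directory cannot be written to. It is only an assumption about how such a fallback could look: `choose_db_path`, the file name `lattice_db.npz` and the directory layout are hypothetical and not taken from SIMBAD or CCP4.

```python
import os
import tempfile
from pathlib import Path


def choose_db_path(shared_db: Path, user_dir: Path) -> Path:
    """Return shared_db if its directory is writable, else a copy location under user_dir."""
    if os.access(shared_db.parent, os.W_OK):
        return shared_db
    # Fall back to a per-user directory, creating it if necessary.
    user_dir.mkdir(parents=True, exist_ok=True)
    return user_dir / shared_db.name


# Demonstration in a temporary directory (all paths are hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    writable_db = root / "lattice_db.npz"                   # parent exists and is writable
    unavailable_db = root / "no_such_dir" / "lattice_db.npz"  # parent does not exist
    user_dir = root / "user_db"

    writable_choice = choose_db_path(writable_db, user_dir)
    fallback_choice = choose_db_path(unavailable_db, user_dir)
    fallback_is_user_copy = fallback_choice == user_dir / "lattice_db.npz"
    print(writable_choice == writable_db, fallback_is_user_copy)
```

A documentation page could then describe how to point the lattice search at the user-owned copy, whatever option the scripts end up exposing for that.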
Not all structures in the PDB, just those in the MoRDa database. I can create a simbad-full page; however, it's important that this isn't run by default locally. I want to have simbad-main run simbad-lattice and simbad-contaminant, and for simbad-full to be reserved for those who have access to clusters.
I understand, I thought simbad-full was the script solely for the last step. What about 6 scripts:
simbad [simbad-main] --> simbad-lattice + simbad-contaminant
simbad-full --> simbad-lattice + simbad-contaminant + simbad-morda
simbad-lattice
simbad-contaminant
simbad-morda --> full rotation search of MoRDa db
simbad-create-lattice-db
We could write subparsers in the simbad_main module which would cover the cases from above. This would basically work like git on the command line, where git clone, git add and git commit are all individual subparsers under the git command.
Thus, there would be a single simbad script, but subparsers simbad lattice, simbad contaminant, etc. would internally handle which routines to run. This might make it all a bit messier, but it would mean fewer command-line scripts.
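For reference, the subparser idea could look roughly like the sketch below, using the standard-library argparse module. The sub-command names mirror the six scripts proposed above, but the `STEPS` mapping, the `main` sub-command name, the positional `mtz` argument and the helper functions are assumptions made for illustration and are not SIMBAD's actual interface.

```python
import argparse

# Hypothetical mapping from sub-command to the search steps it would run.
STEPS = {
    "main": ["lattice", "contaminant"],
    "full": ["lattice", "contaminant", "morda"],
    "lattice": ["lattice"],
    "contaminant": ["contaminant"],
    "morda": ["morda"],
    "create-lattice-db": ["create-lattice-db"],
}


def build_parser() -> argparse.ArgumentParser:
    """Build a git-style parser with one subparser per sub-command."""
    parser = argparse.ArgumentParser(prog="simbad")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.required = True
    for name, steps in STEPS.items():
        sub = subparsers.add_parser(name, help="runs: " + " + ".join(steps))
        if name != "create-lattice-db":
            # Assumed positional argument: the input reflection file.
            sub.add_argument("mtz", help="input MTZ file")
    return parser


def steps_for(argv):
    """Parse argv and return the list of steps the chosen sub-command would run."""
    args = build_parser().parse_args(argv)
    return STEPS[args.command]


print(steps_for(["full", "data.mtz"]))   # ['lattice', 'contaminant', 'morda']
print(steps_for(["create-lattice-db"]))  # ['create-lattice-db']
```

The dispatch table keeps the composition of each run in one place, so adding or reordering steps would not require touching the parser itself.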
I think the 6 scripts idea makes sense. The subparsers idea we can have a think about.
I think the best idea is to create a simbad-examples repository. Deposit something like TOXD and potentially one or two others which can be used on the website as examples.