rigdenlab / SIMBAD

Sequence Independent Molecular replacement Based on Available Database
http://simbad.rtfd.io
BSD 3-Clause "New" or "Revised" License

Add Example data #3

Closed fsimkovic closed 7 years ago

fsimkovic commented 7 years ago

I think the best idea is to create a simbad-examples repository. Deposit something like TOXD and potentially one or two others which can be used on the website as examples.

hlasimpk commented 7 years ago

This is something I was thinking of adding. I'm also going to write the code so that the early-termination option can be turned off. The lattice parameter search will effectively identify any structure that is already in the PDB, which makes testing the contaminant and full runs with known structures more difficult.

fsimkovic commented 7 years ago

What about a partial model that might bypass this single step? Do you think this is possible and/or worth the trouble?

hlasimpk commented 7 years ago

A partial model wouldn't work. However, modifying the lattice parameters in an MTZ file so that the correct solution isn't picked up could work, although this could also prevent later steps from working properly.

Allowing the code to bypass this step even if a solution has been found will be useful nonetheless for seeing how fast SIMBAD is on local machines.

fsimkovic commented 7 years ago

Sure, that makes sense.

fsimkovic commented 7 years ago

@hlasimpk thinking about this, it might be a good idea to have examples for each of the three subparts, i.e.

Then, explain in the docs that the simbad script only combines the individual parts but won't work reliably in the future ...

hlasimpk commented 7 years ago

@fsimkovic making the examples for simbad-lattice and simbad-contaminant will be straightforward. simbad-full will be a little more complicated because, if the lattice search is up to date, it will solve the structure. I guess we could give the full run the -early_term False flag and explain that this flag isn't required for an actual run.

Do we need an example for the simbad-create-lattice-db script? It's very simple to run.
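
For illustration only, a flag of the form `-early_term False` could be parsed along these lines. This is a minimal sketch, not SIMBAD's actual argument handling: the `str2bool` helper, the default value of `True`, and the `build_parser` function are assumptions made for the example.

```python
import argparse


def str2bool(value):
    """Convert a command-line string such as 'False' into a bool (hypothetical helper)."""
    lowered = value.lower()
    if lowered in ("true", "t", "yes", "1"):
        return True
    if lowered in ("false", "f", "no", "0"):
        return False
    raise argparse.ArgumentTypeError("expected True or False, got %r" % value)


def build_parser():
    # The prog name comes from the thread; the default of True is an assumption.
    parser = argparse.ArgumentParser(prog="simbad-full")
    parser.add_argument(
        "-early_term",
        type=str2bool,
        default=True,
        help="stop as soon as a solution is found (set to False for the example run)",
    )
    return parser


# Example run with early termination switched off, as suggested above.
args = build_parser().parse_args(["-early_term", "False"])
print(args.early_term)
```

An explicit converter is used because `type=bool` would treat any non-empty string, including "False", as true.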

fsimkovic commented 7 years ago

@hlasimpk I thought the simbad-full script would be specific to running all PDB structures against your data, i.e. the third step in the simbad script?!

I think we should include a short page on simbad-create-lattice-db, just to explain what it is, what it does, why it is important, and how you can create your own database manually if you don't want to update the one in CCP4. Linked to the last point, people might use a CCP4 version on a network-managed system, where they either do not have permission to write to the database directory or cannot update it without interfering with others' work. These people might need help on how to set up their own database ...

hlasimpk commented 7 years ago

Not all structures in the PDB, just those in the MoRDa database. I can create a simbad-full page; however, it's important that this isn't run by default locally. I want to have simbad-main run simbad-lattice and simbad-contaminant, and for simbad-full to be reserved for those who own clusters.

fsimkovic commented 7 years ago

I understand, I thought simbad-full was the script solely for the last step. What about 6 scripts:


Separate idea

We could write subparsers in the simbad_main module which would cover the cases from above. This would basically work like git on the command line, where all of git clone, git add, git commit are individual subparsers under the git command.

Thus, there would be a single simbad script, but subparsers simbad lattice, simbad contaminant, etc. would internally handle which routines to run. This might make it all a bit messier, but it would mean fewer command-line scripts.
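
A rough sketch of how the subparser layout could look with `argparse` is given below. It is not taken from the SIMBAD code base: the handler functions `run_lattice` and `run_contaminant`, the positional `mtz` argument, and the file name `toxd.mtz` are hypothetical placeholders.

```python
import argparse


def run_lattice(args):
    # Placeholder: the real lattice search routine would be called here.
    return "lattice search on {}".format(args.mtz)


def run_contaminant(args):
    # Placeholder: the real contaminant search routine would be called here.
    return "contaminant search on {}".format(args.mtz)


def build_parser():
    parser = argparse.ArgumentParser(prog="simbad")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.required = True  # fail with a usage message if no subcommand is given

    lattice = subparsers.add_parser("lattice", help="run the lattice parameter search")
    lattice.add_argument("mtz", help="input MTZ file")
    lattice.set_defaults(func=run_lattice)

    contaminant = subparsers.add_parser("contaminant", help="run the contaminant search")
    contaminant.add_argument("mtz", help="input MTZ file")
    contaminant.set_defaults(func=run_contaminant)

    return parser


def main(argv=None):
    # Each subparser stores its own handler in args.func, so dispatch is one call.
    args = build_parser().parse_args(argv)
    return args.func(args)


# Equivalent to typing `simbad lattice toxd.mtz` on the command line.
print(main(["lattice", "toxd.mtz"]))
```

Binding each handler with `set_defaults(func=...)` keeps the dispatch in one place, so adding another subcommand would only need one more `add_parser` block.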

hlasimpk commented 7 years ago

I think the 6 scripts idea makes sense. The subparsers idea we can have a think about.