Closed sillitoe closed 7 years ago
Yes - that's what I'd do.
The only thing is I'm not 100% sure what the "for each"s in the bullet points mean. Once you've run cath-superpose-multi-temp-script, you should only need to run two cath-superpose commands...
First, run the command you're already running but output JSON via --sup-to-json-file, eg:
cath-superpose --ssap-scores-infile my_ssap_scores --pdb-infile $PDBDIR/1aldA00 --pdb-infile $PDBDIR/1b57A00 --pdb-infile $PDBDIR/1fq0A00 --pdb-infile $PDBDIR/1ok4A00 --sup-to-json-file my_sup.json
Second, run a similar command but read the superposition back in with --json-sup-infile and give the whole-PDB files to --pdb-infile, eg:
cath-superpose --json-sup-infile my_sup.json --pdb-infile $PDBDIR/1ald --pdb-infile $PDBDIR/1b57 --pdb-infile $PDBDIR/1fq0 --pdb-infile $PDBDIR/1ok4 --sup-to-pymol-file my_sup.pml
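Both commands take the same structures in the same order, so they can be built from one list of domain IDs. The following is a minimal sketch, not part of cath-tools: pdb_args and whole_pdb_ids are hypothetical helpers, and it assumes that $PDBDIR holds files named by domain ID (eg 1aldA00) and by PDB ID (eg 1ald), with the PDB ID being the domain ID minus its last three characters. It only prints the two commands so they can be checked before running.

```shell
# Sketch only: print (rather than run) the two cath-superpose commands.
# Assumptions: $PDBDIR holds files named by domain ID (eg 1aldA00) and by
# PDB ID (eg 1ald); pdb_args and whole_pdb_ids are hypothetical helpers.

pdb_args() {
  # Emit "--pdb-infile DIR/NAME " for each NAME, preserving the given order.
  dir=$1; shift
  for name in "$@"; do
    printf '%s %s/%s ' --pdb-infile "$dir" "$name"
  done
}

whole_pdb_ids() {
  # Strip the chain letter and domain number (eg 1aldA00 -> 1ald).
  for id in "$@"; do
    printf '%s ' "${id%???}"
  done
}

PDBDIR=${PDBDIR:-/path/to/pdbs}
domains="1aldA00 1b57A00 1fq0A00 1ok4A00"

# Word splitting on $domains is intentional: the IDs contain no spaces.
echo "cath-superpose --ssap-scores-infile my_ssap_scores $(pdb_args "$PDBDIR" $domains)--sup-to-json-file my_sup.json"
echo "cath-superpose --json-sup-infile my_sup.json $(pdb_args "$PDBDIR" $(whole_pdb_ids $domains))--sup-to-pymol-file my_sup.pml"
```

Keeping a single list means the domain files and the whole-PDB files cannot drift out of order between the two steps.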
Let me know how you get on. Shout if you get stuck.
@tonyelewis, @sillitoe - Thank you so much but I got this error while trying to parse the file "Error in parsing program options (from the command line): unrecognised option '--sup-to-json-file'"? What can I do about this? Thank you.
It sounds like you're running an old version of cath-superpose. Check that by running cath-superpose --version. You'll need v0.12.15 to read in JSON files.
To get the latest, download it from here and remember to make it executable (chmod +x cath-superpose). If you're running on CentOS 6/7 rather than Ubuntu, use /opt/local/apps/CentOS6-x86_64/bin/cath-superpose instead.
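The version check can be scripted. This is a rough sketch under one assumption: that the version line has the form that appears in this thread, eg "cath-superpose v0.12.15-0-gc6003f7 [2016-11-18]". The helper names extract_version and version_at_least are made up for illustration.

```shell
# Sketch: check that an installed cath-superpose is at least v0.12.15.
# Assumption: the version line looks like
#   cath-superpose v0.12.15-0-gc6003f7 [2016-11-18]

extract_version() {
  # Read --version output on stdin; print the bare x.y.z of the first match.
  sed -n 's/^cath-superpose v\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

version_at_least() {
  # Succeed if $1 >= $2, comparing the three dot-separated fields numerically.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)" = "$2" ]
}

# Only query the binary if it is actually on the PATH.
if command -v cath-superpose >/dev/null 2>&1; then
  installed=$(cath-superpose --version | extract_version)
  if version_at_least "$installed" 0.12.15; then
    echo "cath-superpose $installed is new enough"
  else
    echo "cath-superpose is too old or its version was not recognised" >&2
  fi
fi
```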
Any joy?
I guess I cannot currently write to a .json file, as I got an output stating "Whilst converting a superposition_context to JSON, its alignment will be ignored because that is not currently supported". I hope the joy will come soon if this can be sorted.
That should just be a warning message, which is indicated by [cath-superpose|warning], eg:
2016-11-23 13:22:53.144267 [cath-superpose|warning] Whilst converting a superposition_context to JSON, its alignment will be ignored because that is not currently supported
Please check: is the superposition JSON file there?
Yes I have it in my folder.
Great. Then you can try using that file in the second command (ie cath-superpose --json-sup-infile [...] above). Let me know how you get on.
I'm running into an error. I think it comes down to the --ssap-scores-infile option.
Using the latest binary:
$ ./cath-superpose --version
============
cath-superpose v0.12.15-0-gc6003f7 [2016-11-18]
============
Superpose protein structures using an existing alignment
Build
-----
Nov 18 2016 18:43:53
Clang version 3.6.2 (branches/release_36)
GNU libstdc++ version 20160726
Boost 1_57
SSAP scores for 1jd0B00 / 1kopA00 exist in the scores:
$ grep 1jd0B00 superpositions/ssap_scores.fb5c4352b69d0f36976e8114e2653da7 | grep 1kopA00
1jd0B00 1kopA00 259 223 87.41 214 82 30 1.47
Running superpose generates exception:
$ ./cath-superpose --ssap-scores-infile ./superpositions/ssap_scores.fb5c4352b69d0f36976e8114e2653da7 --pdb-infile 1jd0B00 --pdb-infile 1kopA00 --sup-to-pymol-file tmp.pml
Whilst running program ./cath-superpose (via a program_exception_wrapper with typeid: "N4cath40cath_superpose_program_exception_wrapperE"), caught a std::exception:
vector::reserve
Note this works with the pairwise alignment:
$ ./cath-superpose --pdb-infile 1jd0B00 --pdb-infile 1kopA00 --ssap-aln-infile superpositions/1jd0B001kopA00.list --sup-to-pymol-file tmp.pml
Standard RMSD is : 1.47086
Superposed using select_best_score_percent[70].ca_atoms and actual full RMSD is : 1.48231
Any ideas?
I think your problem is...
For now, the --ssap-scores-infile option is brittle: the list of PDBs that you specify with --pdb-infile must exactly correspond to the list of IDs in the scores file and must appear in the same order that the IDs first appear in this file.
Looked at another way: cath-superpose-multi-temp-script makes it easy for you by providing a command with the correct --ssap-scores-infile and --pdb-infile options; don't change them. If you want to superpose a subset, just run another cath-superpose-multi-temp-script to generate a new scores file and get a new command. If you run that in the same temporary directory as before, it'll re-use your existing SSAP results so should be really quick.
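When a command has been edited by hand, the ordering rule can be checked before running it. This is a minimal sketch with two assumptions: the first two whitespace-separated columns of each scores line are the IDs (as in the "1jd0B00 1kopA00 259 ..." line in this thread), and "first appear" means reading column 1 then column 2, line by line. The helper names are hypothetical.

```shell
# Sketch: list the IDs of a SSAP scores file in order of first appearance,
# and compare that with the order intended for --pdb-infile.
# Assumptions: columns 1 and 2 hold the IDs; column 1 is read before column 2.

ids_in_scores_order() {
  # $1 = scores file; print each ID once, in order of first appearance.
  awk '{ for (i = 1; i <= 2; i++) if (!seen[$i]++) print $i }' "$1"
}

same_id_order() {
  # $1 = scores file; remaining args = IDs in the order given to --pdb-infile.
  scores=$1; shift
  [ "$(ids_in_scores_order "$scores" | tr '\n' ' ')" = "$(printf '%s ' "$@")" ]
}

# Example use (commented out because it needs a real scores file):
#   same_id_order my_ssap_scores 1jd0B00 1kopA00 || echo "ID list or order differs" >&2
```

The check is on IDs, not file names, since cath-superpose does not map between the two; that is what allows whole-PDB files to be substituted later.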
In some ways the fact that cath-superpose isn't mapping between the scores file's IDs and the filenames is a bit rubbish but then that's what's giving us the flexibility for you to substitute in completely different PDB files to superpose whole PDBs.
The error message you got is very unhelpful - I'll have a look at improving that.
Thanks.
We also managed to get a segmentation fault - but having more difficulty reproducing that one.
Seems to work but it has a problem with the colouring scheme without an alignment?
$ ./cath-superpose --json-sup-infile aceta.json --pdb-infile $PDBDIR/1keq --pdb-infile $PDBDIR/1v9e --pdb-infile $PDBDIR/3iai --pdb-infile $PDBDIR/3ks3 --pdb-infile $PDBDIR/3ml5 --pdb-infile $PDBDIR/4k13 --sup-to-pymol-file aceta.pml
2016-11-24 18:01:06.667486 [cath-superpose|warning] Unable to apply a alignment-based coluring scheme to the superposition because it doesn't contain an alignment
Is it possible to turn off the coloring?
@sillitoe : OK - please do open an issue if you do manage to pin the segfault down. Ta.
@toluadeyelu : Great - so is that now doing what you want (superposing whole PDBs, including any ligands etc, based on domains?)? The thing about colouring is just a warning that you can ignore (which in your case is irrelevant because you haven't requested an alignment-based colouring, but I'd rather spend time on adding the alignment than making this warning smarter).
@toluadeyelu : BTW, I've found commands like the following useful for viewing ligands in PyMOL before:
bg_color white;
show_as sticks, hetatm;
colour black, hetatm;
Sorry, that was my fault - didn't notice this was only a warning.
Is it worth having a done line at the end of the output to make it really obvious that everything ran okay (and the warnings are just warnings)? Or would that screw up other output options?
Or something like:
Writing PyMOL output file 'tmp.pml' ... done
It seems the ligand is being stripped off after superposition, as it does not appear in the PyMOL output (no HETATM found).
$ grep HET $PDBDIR/3ml5|head
REMARK 3 HETEROGEN ATOMS : 14
REMARK 290 THE FOLLOWING TRANSFORMATIONS OPERATE ON THE ATOM/HETATM
HET ZN A 263 1
HET AZM A 264 13
HETNAM ZN ZINC ION
HETNAM AZM 5-ACETAMIDO-1,3,4-THIADIAZOLE-2-SULFONAMIDE
HETATM 2101 ZN ZN A 263 1.215 -0.903 18.853 1.00 6.67 ZN
HETATM 2102 C1 AZM A 264 -2.546 -2.796 19.609 1.00 9.73 C
HETATM 2103 C2 AZM A 264 -4.519 -2.424 20.815 1.00 12.71 C
HETATM 2104 C3 AZM A 264 -6.604 -1.110 20.909 1.00 13.86 C
$ ./cath-superpose --json-sup-infile aceta.json --pdb-infile $PDBDIR/1keq --pdb-infile $PDBDIR/1v9e --pdb-infile $PDBDIR/3iai --pdb-infile $PDBDIR/3ks3 --pdb-infile $PDBDIR/3ml5 --pdb-infile $PDBDIR/4k13 --sup-to-pymol-file aceta.pml
2016-11-24 18:27:17.470788 [cath-superpose|warning] Unable to apply a alignment-based coluring scheme to the superposition because it doesn't contain an alignment
$ ./cath-superpose --json-sup-infile aceta.json --pdb-infile $PDBDIR/1keq --pdb-infile $PDBDIR/1v9e --pdb-infile $PDBDIR/3iai --pdb-infile $PDBDIR/3ks3 --pdb-infile $PDBDIR/3ml5 --pdb-infile $PDBDIR/4k13 --sup-to-pymol-file aceta.pml
2016-11-24 18:27:53.058817 [cath-superpose|warning] Unable to apply a alignment-based coluring scheme to the superposition because it doesn't contain an alignment
$ grep HET aceta.pml
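A slightly stricter version of this check counts only true HETATM records, since a plain grep HET also matches REMARK and HETNAM lines. This is a sketch, assuming HETATM records start in column 1 as in standard PDB files; count_hetatm and report_hetatm are hypothetical helper names.

```shell
# Sketch: count HETATM records per file, to compare inputs with the output.
# Assumption: HETATM records start in column 1, as in standard PDB files.

count_hetatm() {
  # grep -c exits non-zero when the count is 0, so mask that exit status.
  grep -c '^HETATM' "$1" || true
}

report_hetatm() {
  # Print "file<TAB>count" for each file given.
  for f in "$@"; do
    printf '%s\t%s\n' "$f" "$(count_hetatm "$f")"
  done
}

# Example use (commented out because it needs real files):
#   report_hetatm "$PDBDIR/3ml5" aceta.pml
```

Whether coordinate lines in the .pml file start in column 1 is not confirmed here; if they do not, drop the ^ anchor when checking that file.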
Hmm. Not sure. I can see the value to making it clearer that warnings are only warnings.
My concern about adding something like a trailing done is that it adds noise, which makes the program that little bit more annoying to use, especially when running batches etc, and it also makes genuine warnings/errors that little bit less obvious.
(Of course, the tools currently have too much noise but I hope for that to slowly improve.)
Alternatives could be to make the warnings more clearly warnings somehow. Move the warning before the cath-superpose? Make it upper case? Use colour (eg yellow for warning; red for error)? I think colour is tricky to do portably (and, who knows, we might choose to build on Windows in the future).
Any thoughts?
I'm unsure. I'd like to mull that one for a while.
@toluadeyelu: "No hetatm found"
Great - that's a really clear issue - thanks very much for highlighting. I know I've looked at this before and generated a superposition including HETATM records but I'm not sure what state the code got left in.
Please can you open this as a separate GitHub issue (to distinguish the bug from this question issue) and include the list of CATH domains you're trying to superpose so I can look into it?
Thanks very much.
Not a big deal either way, but I would lean towards solving this by log levels (not sure how the code is currently handling logs).
With increasing verbosity:
trace
debug (e.g. reading from the file system)
info (e.g. writing to the file system)
warn
error
Then it's a case of which level of verbosity you want to use, e.g. default would be displaying info and above. If you don't want output then --quiet will raise the minimum notification bar to only show warn and --qq would only show error (i.e. completely quiet unless something has gone horribly wrong). Obviously vice versa with -v.
As you say, I think it's slightly unclear just because the string specifying warning is a bit hidden - I saw the timestamp and jumped straight to the end to figure out the problem.
I would hesitate to add colour (very nice, but a considerable time sink).
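The threshold idea in the log-level proposal above can be sketched in a few lines of shell. This illustrates the proposal only; it is not how cath-superpose handles logging, and level_rank, LOG_THRESHOLD and log_msg are hypothetical names. Under this scheme --quiet would correspond to a threshold of warn and --qq to a threshold of error.

```shell
# Sketch of the proposed threshold scheme (hypothetical names throughout).

level_rank() {
  # Map a level name to a number; higher means more severe.
  case $1 in
    trace) echo 0 ;;
    debug) echo 1 ;;
    info)  echo 2 ;;
    warn)  echo 3 ;;
    error) echo 4 ;;
    *)     echo 2 ;;  # treat unknown levels as info
  esac
}

LOG_THRESHOLD=${LOG_THRESHOLD:-info}  # default: show info and above

log_msg() {
  # $1 = level, rest = message; print to stderr only if level >= threshold.
  level=$1; shift
  if [ "$(level_rank "$level")" -ge "$(level_rank "$LOG_THRESHOLD")" ]; then
    printf '[%s] %s\n' "$level" "$*" >&2
  fi
}

# Example use:
#   LOG_THRESHOLD=warn; log_msg info "Writing PyMOL output file"   # suppressed
#   LOG_THRESHOLD=warn; log_msg error "Cannot open file"           # shown
```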
I've improved the error message for not specifying the full list of PDBs for the SSAP scores file in 2a10d5edd3ba0e0708effb61788ac35589948125.
@toluadeyelu I've fixed the HETATM-stripping problem in 6304bc86ff2029d26456e5b722f60f97b0b5e30d. It was just a stupid mistake of using the wrong PDB data in the code for handling the JSON input. I should add a testcase but haven't got time right now.
@sillitoe Yes - I already do logging at various levels (eg here; albeit not rigorously systematically) but (if I interpret your point correctly) I still think the Done is that little bit more annoying, even if users can then look up the usage to find there's an option to silence it.
Of course, it may prove useful later on, particularly if the tools' final destination is to be frequently used for long, multi-part jobs. But less so if it's typically quick and simple. I don't usually want my ls or grep or SSAP to tell me it's finished; I just want it to: do what I ask, alert me to problems ("Permission denied") and stop.
Anyway, for now, since the actual problem we're trying to fix is that the warning-iness of the warnings has been insufficiently clear, I've just added a simple bold to the severity part of the log message (c91df091be10fcf88704a440f04d1547adefc1fb).
Sorry: I should have explicitly said... It works! After the above fix, I was able to use the method we've discussed to superpose the whole PDBs mentioned above on their ...A00 domains and view their ligands in the superposition. Nice!
@tonyelewis - sounds great, many thanks for fixing so quickly.
@toluadeyelu - thanks for reporting and chasing up - your efforts have improved our group's software! Great work
@toluadeyelu - you will need to download the latest version of the binary, then try the command again.
@sillitoe @tonyelewis Thank you so much. I will give feedback once I get it done this morning.
@tonyelewis Thank you so much. I have used this and it works brilliantly well. Now I am happy. Thanks @sillitoe. In the conversion of the JSON to PyMOL, I only used the full PDB for the one with the ligand, while I used the domain structures for the others, as suggested by @sillitoe.
@toluadeyelu Great - I'm really glad to hear this is working for you now. If you think you'd benefit from not having to use this workaround, feel free to "Add your reaction" → "+1" to the initial comment on issues #1 and/or #3.
Is everyone happy for me to close this issue? Please shout if not.
Ta.
@toluadeyelu had a query that I said I would add here for future documentation.
He has generated a multiple structure superposition of CATH domains.
He would like to add ligands and binding sites back into the structure.
There may be a more elegant solution in the pipeline (e.g. #3), however my suggested approach in the meantime was something like the following:
output the superposition as JSON (--sup-to-json-file)
use cath-superpose to apply the same operations to the full PDB structure (rather than the domain atoms)
Does that sound reasonable?
(edit: removed the mentions of 'foreach' for clarity)