Closed by molsim 7 years ago
Made a quick fix to issues 2 and 3 by changing lexer.l rules via molsim@cb27bab9bd84fcb7bd7a899cde15f0edc1008d1b
Yes, that's the right place to change it. 1. and 3. should be fairly straightforward to fix, and your suggestion for 3 seems correct. For 2 one has to consider that we use minus signs for ranges too, i.e. resi 1-3. If we allow negative numbers, the syntax resi 1-3+5-7 probably won't work anymore, but I think allowing negative residue numbers does make sense, so I'll look into modifying the rules here.
Thanks for pointing this out!
Regarding 2, PDB files actually do use negative resi (e.g. PDB 1KX5). PyMOL and other programs usually handle this by escaping the minus sign, e.g. "chain i and resi \-10-\-5", which selects resi between -10 and -5.
The last commit to the dev branch adds this functionality. It passes all the tests I could come up with at least, if it works for you too I'll merge it to master eventually.
Thank you! I'd also suggest making selection ids (keys) even more flexible. Currently they are restricted to alphanumeric characters and will not work if they contain only digits. Using "-72" as a selection id is not parsable now. Here is an ad hoc fix I use now to allow for that: molsim@eda742451110ded7b0119132b55eaa10c75c3a7f
PS: I just realized that the GitHub syntax in one of my comments got confused by the escaped minus sign too. PyMOL selection syntax, to my knowledge, behaves as follows: "resi \-10" selects residue -10; "resi -10" will select all residues with resi <= -10.
[x] No reason we couldn't have dashes/minus signs in selection names, I will open up for that. It is probably simplest to make selection IDs a separate lexer class, as you have done, so they are not confused with minus signs in ranges.
[x] It makes sense to have open-ended ranges and then escape the minus sign for negative indices (I assume resi -10 means resi <= 10, not <= -10?). I would also allow resi 10- then. This requires extending both the parser grammar and the selection code itself, so it might be a few days before I have time to look at it, but I'll let you know when I have something in place (unless you want to try it yourself).
Thank you, Simon! Yes, I got confused with the syntax again. You are correct: "resi -10" should select resi <= 10.
There are now fewer restrictions on selection names: alphanumeric characters, '+', '-' and '_' are allowed in any order.
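To make the relaxed naming rule concrete, here is a minimal Python sketch that checks a candidate selection name against the character set described above. The regular expression and the function name `is_valid_selection_name` are my own illustration, not the actual rule in lexer.l, which may differ in detail.

```python
import re

# Hypothetical check mirroring the rule stated above: one or more of
# alphanumerics, '+', '-' and '_', in any order. Not the real lexer rule.
_NAME = re.compile(r"[A-Za-z0-9_+-]+")

def is_valid_selection_name(name: str) -> bool:
    """Return True if the whole name uses only the allowed characters."""
    return _NAME.fullmatch(name) is not None

if __name__ == "__main__":
    for candidate in ["T1", "1T", "-72", "my_sel+2", "bad name", ""]:
        print(repr(candidate), is_valid_selection_name(candidate))
```

Names such as "1T" and "-72", which were rejected when this issue was opened, pass this check.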
Pushed code that allows open ended ranges and where negative indices are escaped with backslash (see changes to the files changelog or doxy_main.md in the commit above for details). I added quite a few tests, so I am relatively confident this works as intended, but if you have the chance to do some sanity checks too, that would be great!
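The range semantics agreed on in this thread can be sketched in a few lines of Python. This is only a toy model of the syntax as discussed here (terms joined by '+', open-ended ranges, negative numbers escaped as \-N). FreeSASA's real parser is generated from its lexer and grammar files and is not reproduced here, and the names `parse_ranges` and `matches` are hypothetical.

```python
import re
from typing import List, Optional, Tuple

Range = Tuple[Optional[int], Optional[int]]  # (lower, upper); None = open end

_NUM = r"(?:\\-)?\d+"  # digits, optionally preceded by an escaped minus "\-"
_TERM = re.compile(rf"^(?P<lo>{_NUM})?(?P<dash>-)?(?P<hi>{_NUM})?$")

def _to_int(token: str) -> int:
    # "\-10" -> -10, "10" -> 10
    return int(token.replace("\\-", "-"))

def _parse_term(term: str) -> Range:
    m = _TERM.match(term)
    if m is None:
        raise ValueError(f"cannot parse range term {term!r}")
    lo, dash, hi = m.group("lo", "dash", "hi")
    if dash is None:
        # No range dash: must be a single (possibly negative) number.
        if lo is None or hi is not None:
            raise ValueError(f"cannot parse range term {term!r}")
        n = _to_int(lo)
        return (n, n)
    if lo is None and hi is None:
        raise ValueError(f"cannot parse range term {term!r}")
    return (_to_int(lo) if lo else None, _to_int(hi) if hi else None)

def parse_ranges(expr: str) -> List[Range]:
    """Parse e.g. r'\\-10-\\-5+1-3+10-' into a list of (lower, upper) bounds."""
    return [_parse_term(t.strip()) for t in expr.split("+")]

def matches(resi: int, ranges: List[Range]) -> bool:
    """True if resi falls inside any of the parsed ranges."""
    return any((lo is None or resi >= lo) and (hi is None or resi <= hi)
               for lo, hi in ranges)

if __name__ == "__main__":
    print(parse_ranges(r"\-10-\-5"))  # closed range of negative residues
    print(parse_ranges("-10"))        # open-ended: resi <= 10
    print(parse_ranges("10-"))        # open-ended: resi >= 10
```

In this model an unescaped leading minus always starts an open-ended range, which is why negative residue numbers need the backslash.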
Will close this for now, let me know if you discover any further selection problems.
Thank you, selections work fine for me now!
Hi, I am really sorry for bothering you with a stupid question, but...
Could you please give a simple example of how to 'feed' your script via the "--select" option? I think I am quite familiar with PyMOL, but I cannot figure out which COMMAND should be put there, e.g. when I want to select multiple atoms by index.
Thank you in advance. Best, VM.
Hi, There is no option at the moment to select atoms by index. A simple example to select residues by index would be
freesasa 1abc.pdb --select="selection_name, resi 1-4"
Full documentation of the subset of PyMOL commands available in FreeSASA can be found here http://freesasa.github.io/doxygen/Selection.html
Thanks, .. and sorry, I somehow missed that page in your manual.
.. ok, indexes are not supported, but one can use atom names instead. Some relabelling is needed though because our friend Gromacs likes to use atom names with numbers in front. :-)
It's probably possible to allow atom names to start with numbers in the selection. Do you have an example file I could use?
I will look into adding atom selection by index too, but that's a larger project.
Is there a way to select area using cofactors, for example FMN?
I tried using resn because FMN is in the same column as the amino acids with this code:
FMNselect = freesasa.selectArea(['FMNarea, resn FMN'], structure, result)
However, I keep getting this from the CMD line:
FreeSASA: warning: Found no matches to resn 'FMN', typo?
Hi, that’s probably because it’s HETATMs. You can include those as a flag when you init the structure.
Could you explain how to do this?
This is what I've tried so far:
structure = freesasa.Structure(filename)
addatom = freesasa.Structure.addAtom(residueName = "FMN")
But I'm getting this error from the command line:
TypeError: descriptor 'addAtom' of 'freesasa.Structure' object needs an argument
Here’s a line from the test suite that tests HETATM inclusion:
Structure("lib/tests/data/1ubq.pdb",None,{'hetatm' : True})
This worked! Thank you so much for your help!
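For anyone landing here with the same question, the pieces from this exchange can be put together roughly as follows. This is a sketch, assuming the freesasa Python module is installed. It uses only the calls that appear in this thread (`Structure` with the `{'hetatm': True}` option, `selectArea`) plus `freesasa.calc`. The helper names `selection_command` and `cofactor_area` are made up for illustration.

```python
def selection_command(name: str, expression: str) -> str:
    """Build a 'name, expression' selection string as used by selectArea."""
    return f"{name}, {expression}"

def cofactor_area(pdb_path: str, resn: str = "FMN") -> float:
    """Return the SASA of all atoms whose residue name is `resn`.

    HETATM records are skipped by default, so they are switched on
    explicitly when the structure is created.
    """
    import freesasa  # third-party; imported lazily so the helper above works without it

    structure = freesasa.Structure(pdb_path, None, {'hetatm': True})
    result = freesasa.calc(structure)
    name = f"{resn}area"
    areas = freesasa.selectArea(
        [selection_command(name, f"resn {resn}")], structure, result
    )
    return areas[name]

if __name__ == "__main__":
    print(selection_command("FMNarea", "resn FMN"))
```

Calling `cofactor_area("my_structure.pdb")` (the file name is a placeholder) should then give the area instead of the "Found no matches" warning, provided the file actually contains FMN.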
I'd like to be able to calculate the SASA of individual hydrogen atoms in protein DNA complexes (e.g. nucleosome PDB 1KX5).
Looks like there are several problems with parsing selection syntax in the function below:
freesasa.selectArea(['1T, (chain I) and (resi -72) and (resn DC) and (name H5\'\')'],structure, result)
1) Selection names cannot start with a number: "1T" vs "T1".
2) Negative resi are not supported: "-72" vs "72".
3) Hydrogen atom names with primes are not recognized: H5'' is treated as H5' or H5.