Closed richardjgowers closed 1 year ago
This would be very useful indeed! The following blog post might be relevant for further discussion: Why the RDKit isn't available on PyPi.
Yeah rdkit via conda is the only way to stay sane tbh. So this project might end up in a side package/non-default submodule which is optionally installed and does some monkey patching to _CONVERTERS.
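To make the side-package idea concrete, here is a minimal sketch of a name-keyed converter table that an optional package patches on import. Only the name `_CONVERTERS` comes from the comment above; its dict layout, `convert_to`, `register`, and the placeholder converter are all assumptions for illustration, not the actual MDAnalysis internals.

```python
from typing import Callable, Dict

# Hypothetical stand-in for the core package's converter table.
_CONVERTERS: Dict[str, Callable[[list], object]] = {}


def convert_to(atoms: list, fmt: str) -> object:
    """Look up a converter by (case-insensitive) name and apply it."""
    try:
        converter = _CONVERTERS[fmt.upper()]
    except KeyError:
        raise ValueError(
            f"No converter registered for {fmt!r}; "
            f"available: {sorted(_CONVERTERS)}"
        ) from None
    return converter(atoms)


# --- what an optional side package could do when it is imported ---
def _to_placeholder_mol(atoms: list) -> dict:
    # Placeholder only: a real converter would build an RDKit molecule here.
    return {"n_atoms": len(atoms), "elements": list(atoms)}


def register() -> None:
    """Patch the converter into the core table (the monkey-patching step)."""
    _CONVERTERS["RDKIT"] = _to_placeholder_mol
```

With this layout the core never imports rdkit itself; users who do not install the side package simply never see an "RDKIT" entry.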
At least on our end, rdkit integration would be very much appreciated. It would definitely make tools that depend on both packages, like lintools (which we still aim to revive properly), a lot easier to maintain.
I like the idea of expanding on converters – working towards API interoperability instead of files. It's the future.
We might have to review our policy on package dependencies in the core. Maybe it's OK for the majority of users to get a well-defined failure if they try to do something with a specialized reader/converter, especially if they are told what to do if they want to install the package. I am thinking along the lines of mocking missing optional packages in such a way that only users who want to use exactly that functionality get a failure.
Perhaps the policy can be changed to "packages that are used in a single converter or reader can be optional". Or we make converters a submodule with its own policy (but it would be convenient to also be able to include readers/parsers with exotic dependencies).
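One way the "mock the missing package" idea could look, as a rough sketch rather than an agreed design: import the optional dependency if it is there, and otherwise hand back a placeholder that raises an informative `ImportError` only when it is actually used. The names `optional_import` and `MissingOptionalDependency` are hypothetical.

```python
import importlib


class MissingOptionalDependency:
    """Stand-in for an optional package that is not installed (hypothetical helper)."""

    def __init__(self, name, install_hint):
        self._name = name
        self._install_hint = install_hint

    def __getattr__(self, attr):
        # Reached only for attributes not set in __init__, i.e. when calling
        # code actually tries to use the missing package.
        raise ImportError(
            f"{self._name} is required for this feature but is not installed. "
            f"{self._install_hint}"
        )


def optional_import(name, install_hint):
    """Return the real module if importable, otherwise a failing placeholder."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return MissingOptionalDependency(name, install_hint)
```

A specialized reader or converter module could call `optional_import` at the top of the file; importing the core then always succeeds, and only code paths that touch the missing package fail, with the install hint in the message.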
Do you know anyone from the RDKit community who might want to co-mentor for GSoC? That would be extremely valuable.
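RDKit usually takes part in GSoC under the OpenChemistry organization (https://www.openchemistry.org/). You can have a look at the RDKit Project Ideas (http://wiki.openchemistry.org/GSoC_Ideas_2019#RDKit_Project_Ideas) for past mentors. They have a Slack channel (http://openchemistry.slack.com) and @greglandrum is on it. Maybe worth a try?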
I use rdkit enough at work that I can find my way around it, but if Greg wants to give his blessing that’s fine too :)
This would be cool! I'd be happy to try and help with answering RDKit-related questions. It's hard to commit to co-mentoring, though: until we see which projects actually end up being funded, I won't know how busy I am.
Hello,
I am investigating protein-ligand interactions with MD simulations, and the idea of combining RDKit with MDAnalysis is extremely exciting.
Currently, I am using ODDT to do that (https://oddt.readthedocs.io/en/latest/rst/oddt.html#module-oddt.interactions). The problem with this approach is that ODDT requires PDB files (which are then transformed into RDKit/pybel objects), meaning that one has to convert each frame into a PDB file before processing it, which is slow and bad practice. The idea behind it is that you provide the receptor and the ligand and it returns a list of descriptors such as donor atom, acceptor atom, atom types, etc.
One could feed the trajectory into the new RDKit wrapper and use ODDT in the loop to get the data. This implies that the wrapper must be able to take in the protein atoms and transform them appropriately into an RDKit object. Is that something that is envisaged? It would be of great use to a lot of people. Such functionality would also allow us to hardcode more complex rules and to study and measure less common interactions that are not captured by ODDT, such as anion-pi interactions, but are also important in protein-ligand binding. It would also combine extremely well with native contacts analysis: https://www.mdanalysis.org/docs/documentation_pages/analysis/contacts.html
Many thanks,
Harold
Thanks for doing this work! This is really exciting from Open Force Field's perspective. We've also been struggling to perceive bond orders from elements+connectivity (and to know when we can or can't safely guess), so the careful writeups and test cases in this project have been particularly cool to see :-)
@haroldgrosjean it sounds to me like https://www.mdanalysis.org/2020/08/29/gsoc-report-cbouy/#demo will cover your use case, so please have a look. Note that not everything is working yet because not all of @cbouy's PRs have been merged, but it will come.
@cbouy might be able to say more. The MDA/RDKit project has really made big leaps since you posted your comment in July (sorry for the long silence).
EDIT: Also have a look at @cbouy's blog, especially https://cedric.bouysset.net/blog/2020/08/07/rdkit-interoperability
Hi Harold,
I think this would be a good question for the mailing list.
I’m not sure if SMARTS selections are already in develop. In the meantime, you could check the .elements attribute of the atomgroup, which should exist when you have an rdkit molecule.
Oliver
On 12.11.2020 at 09:30, Harold Grosjean notifications@github.com wrote:
Hello,
I have started to use the RDKit wrapper to code my contact analysis. I was wondering if there is any way to check whether a given atom matches a SMARTS pattern?
For example:

    for atom_1 in group_1:
        for atom_2 in group_2:
            distance = distances.distance_array(atom_1.position, atom_2.position)
            if distance <= contact_max_dist:
                if atom_1 is 'smarts [F,Cl,Br,I]':  # pseudocode
                    do something  # pseudocode

This would considerably speed up my code and I am sure it would be of use to other people.
Many thanks in advance.
Harold
I was going to answer but I'll wait for the mailing list thread so that more people can find the answer 😃 And yes, it's possible with some tweaks to your code example.
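For readers landing here, one possible shape for such a tweak is sketched below. It is not the answer posted on the mailing list: it follows the element-based check suggested earlier in the thread instead of real SMARTS matching, and it uses a hypothetical `Atom` record and plain Python rather than MDAnalysis or RDKit objects. It therefore only covers simple element-list patterns such as `[F,Cl,Br,I]`.

```python
import math
from typing import List, NamedTuple, Tuple


class Atom(NamedTuple):
    """Hypothetical minimal atom record (not an MDAnalysis object)."""
    name: str
    element: str
    position: Tuple[float, float, float]


# Same elements as the pattern [F,Cl,Br,I] in the question.
HALOGENS = {"F", "Cl", "Br", "I"}


def halogen_contacts(
    group_1: List[Atom], group_2: List[Atom], contact_max_dist: float
) -> List[Tuple[Atom, Atom]]:
    """Return (halogen, partner) pairs closer than contact_max_dist."""
    # Filter once, before the distance loop, so non-halogens are never measured.
    halogens = [a for a in group_1 if a.element in HALOGENS]
    return [
        (a1, a2)
        for a1 in halogens
        for a2 in group_2
        if math.dist(a1.position, a2.position) <= contact_max_dist
    ]
```

Moving the pattern test out of the inner loop is the main change from the original pseudocode: the match is a property of `atom_1` alone, so it does not need to be re-evaluated for every `atom_2`.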
I'm not authorized to reply to the discussion :( @orbeckst Should I post my answer here and you repost it on the thread, or can you authorize me to reply on the mailing list?
Normally, you need to subscribe to the mailing list and when you reply the first time, an admin will remove the hold on your subscription (to make sure you aren't a spammer). However, I added you directly to https://groups.google.com/g/mdnalysis-discussion with your b...@gmail address. Please try again.
Is the issue still open?
This is mostly an idea for a GSoC project. There's a bunch of cool stuff in rdkit which isn't even close to being in MDA (and vice versa), so rather than reinvent wheels, it would be cool to do:
and
Which would expand upon the converters idea that @lilyminium has got rolling with parmed.
The data structures are going to be very different, and rdkit is quite picky about what it lets you load, but it would be cool to get something going.