rifdock / rifdock

Rifdock Library for Conformational Search

RNA structure prediction and docking to protein #151

Open ahorvath opened 4 months ago

ahorvath commented 4 months ago

Hi @bcov77,

I'd like to ask for your insight on a complex docking task I'm trying to accomplish.

I have predicted an RNA structure (e.g. nstruct=50) from sequence (~80 bases), with and without specifying the RNAfold structure prediction as a seed. Now I'd like to benchmark these structures in terms of energy minimum, bond angles, torsion angles, etc., to get a surrogate measure of "stability". I suppose this should be done in a solvent such as water. Can Rosetta do such simulations alone, or in combination with GROMACS?

As a second step, I'd like to dock the RNA structure to a target protein and test binding while allowing conformational changes in both the RNA and the protein. What approach would you suggest?

Many thanks in advance.

Best, Attila

bcov77 commented 4 months ago

For both of your tasks, I'd recommend state-of-the-art deep learning programs which aren't yet published. AF2-latest can do this, but it's unclear if it'll ever be published. RoseTTAFold all-atom can sort of do these, but we're slow and haven't gotten them published yet.

As to evaluating your RNA structure, I'm not an expert here. There are some Rosetta labs that do this. You can absolutely calculate all the metrics that you want; whether or not those metrics correlate with the real world is a different matter (and is an area where, honestly, I only trust DL models these days). The old Rosetta way to do this would be to predict something like 10K structures with Rosetta and pick the lowest-energy one, but even that wouldn't be accurate in my opinion. If you want to send an email out into the blue on this topic, you might try Andy Watkins (RNA isn't a super big field, so he might be excited to get an email).
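For anyone who wants to try the "predict many structures and pick the lowest energy" approach anyway, here is a minimal sketch of the ranking step. It assumes a Rosetta-style score or silent file in which the first `SCORE:` line is a header, later `SCORE:` lines hold one structure each, and a `description` column holds the structure tag. The function names and the default score column name (`score`) are assumptions for illustration; some protocols write `total_score` instead, so check your own header.

```python
def parse_score_lines(lines):
    """Parse 'SCORE:' lines into a list of dicts keyed by column name.

    Assumes the first 'SCORE:' line is the header row.
    """
    header = None
    rows = []
    for line in lines:
        if not line.startswith("SCORE:"):
            continue  # skip SEQUENCE:, coordinates, remarks, etc.
        fields = line.split()[1:]
        if header is None:
            header = fields
            continue
        if fields == header or len(fields) != len(header):
            continue  # repeated header or malformed row
        rows.append(dict(zip(header, fields)))
    return rows


def lowest_energy(rows, score_column="score", n=5):
    """Return the n (score, tag) pairs with the lowest score."""
    scored = []
    for row in rows:
        try:
            scored.append((float(row[score_column]), row["description"]))
        except (KeyError, ValueError):
            continue  # missing column or non-numeric score
    return sorted(scored)[:n]


# Made-up example data, not real Rosetta output.
EXAMPLE = """SEQUENCE: ggg
SCORE:     score    fa_atr description
SCORE:    -12.5     -30.1 S_000001
SCORE:    -20.0     -35.2 S_000002
SCORE:     -3.2     -10.0 S_000003
"""

if __name__ == "__main__":
    for score, tag in lowest_energy(parse_score_lines(EXAMPLE.splitlines()), n=2):
        print(f"{tag}\t{score:.1f}")
```

For a real run, pass the lines of the file instead, e.g. `parse_score_lines(open("score.sc"))`, where `score.sc` is a placeholder file name. This only ranks by the reported score; it says nothing about whether the score correlates with reality.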

For your second task, I can basically guarantee that there is no method available to you that will do this accurately. I'm sure you can get old classic methods to produce docks, but whether or not they mean anything is up for debate (you'll have no way to rank them). I know that at the moment RoseTTAFold can't do this accurately either. Probably the only tool in the world that would work here is AF2-latest, but as I said, it may never go public.

If you want to generate "something" and know that it's inaccurate, you could just go with one of the classic docking methods. IDK if rifdock can do this (in theory it can, but I'm not sure if it specifically can), or maybe PatchDock will do it (also unsure). But like I said, you'll have no way to evaluate whether the docks are real or not.

bcov77 commented 4 months ago

There are probably other new DL tools out there that can do this, btw, if you search the literature. It just comes down again to whether they are trustworthy. That might be where to look.

ahorvath commented 4 months ago

Many thanks for your detailed answer, it's quite useful for me to understand the field a bit more.

I'm excited to have a look at what AF2 can do; hopefully the code is accessible. Another option could be ARES (http://167.99.175.117/static/ares.html), which seems to be the state of the art. I will contact Andy Watkins and ask for his opinion, too. I tested the webapp: it works for individual PDBs, but it gives an error on the Rosetta silent files.

Where do you think GROMACS' limitations lie? Would it be more accurate at calculating the lowest energy in water solvent?

As you suggested, I started generating the 10k structures with Rosetta. For some reason the PDBs aren't working with the AMBER03/AMBER94 force fields. Have you ever encountered this issue?

The GROMACS issue:

Fatal error: Residue 1 named G of a molecule in the input file was mapped to an entry in the topology database, but the atom O5' used in that entry is not found in the input file. Perhaps your atom and/or residue naming needs to be fixed.

ahorvath commented 4 months ago

The GROMACS issue might be that it expects an O5' atom in the first residue, but the Rosetta output doesn't seem to include one.
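One way to check this guess is to list the atom names that are actually present in the first residue of the PDB and compare them with the atom named in the error. The sketch below does that using standard fixed-column PDB parsing. The function names are made up for illustration, and the `required` list only contains the O5' atom from the error message above; which atoms a given force field entry really needs depends on the GROMACS topology database, which this sketch does not read.

```python
def first_residue_atoms(pdb_lines):
    """Return (residue_name, [atom names]) for the first residue in a PDB."""
    first_key = None
    resname = None
    atoms = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")) or len(line) < 27:
            continue
        # Chain ID (column 22) plus residue number and insertion code (23-27).
        key = (line[21], line[22:27])
        if first_key is None:
            first_key = key
            resname = line[17:20].strip()
        elif key != first_key:
            break  # reached the second residue
        atoms.append(line[12:16].strip())  # atom name, columns 13-16
    return resname, atoms


def missing_atoms(pdb_lines, required=("O5'",)):
    """Return the required atom names absent from the first residue."""
    _, atoms = first_residue_atoms(pdb_lines)
    present = set(atoms)
    return [name for name in required if name not in present]


def _atom_line(serial, name, resname, chain, resseq):
    """Build a fixed-column ATOM record with dummy coordinates (demo only)."""
    x = y = z = 0.0
    return (f"ATOM  {serial:>5d} {name:<4s} {resname:>3s} {chain}{resseq:>4d}"
            f"    {x:8.3f}{y:8.3f}{z:8.3f}")


# Made-up example: residue 1 lacks O5', residue 2 has it.
EXAMPLE_PDB = [
    _atom_line(1, "P", "G", "A", 1),
    _atom_line(2, "OP1", "G", "A", 1),
    _atom_line(3, "C5'", "G", "A", 1),
    _atom_line(4, "O5'", "C", "A", 2),
]

if __name__ == "__main__":
    resname, atoms = first_residue_atoms(EXAMPLE_PDB)
    print(f"first residue {resname}: {atoms}")
    print("missing:", missing_atoms(EXAMPLE_PDB))
```

If the atom really is absent, the fix has to happen upstream (adding the atom or adjusting the terminal residue handling). If it is present under a different name (older PDB files sometimes write `O5*` instead of `O5'`), renaming may be enough.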

bcov77 commented 4 months ago

I've never used GROMACS so you'll have to sort that one out yourself.

silentextract whatever.silent will get you your pdbs (but ofc it's annoying)

And as far as calculating the lowest energy state with GROMACS, I've never actually heard of anyone doing that. I think the Rosetta forcefield is better at ranking low energy structures. But coming from Rosetta-world, what I've seen is a little biased.