vvoelz / biceps

Bayesian inference of conformational populations
https://github.com/vvoelz/biceps

To do list (BICePs 2.0 5/7/18) #15

Closed yunhuige closed 6 years ago

yunhuige commented 6 years ago
  1. check new scripts work smoothly (Done!)
  2. check new scripts can reproduce consistent results for the old dataset (Done!)
  3. change yaml format to something else (removed)
  4. parse argument for MBAR -- states (Done!)
  5. combine the MBAR and plot scripts to reduce data-loading time and make BICePs faster (Done!)
  6. start preparing documents (tutorials) (removed)
  7. check how to make a website like MDTraj and MSMBuilder (yes we want to make it online) (Done!)
  8. make BICePs support Python 3 (removed)
  9. figure out which versions of numpy, yaml (will be replaced), mdtraj, pymbar, and matplotlib BICePs works well with (Done!)
  10. we need to generate scripts that can convert people's raw data to the format that BICePs can understand and use (removed)
  11. we may need to make some test scripts for people to double check if they are doing things correctly (optional) (removed)
  12. clean up comments in the source code (removed)
  13. modify Posterior sampling scripts to get rid of unnecessary data from stored dictionary (removed)
yunhuige commented 6 years ago

For point No. 9: here is the full list of packages we need to install through either Anaconda or pip to get BICePs to work:
Pymbar --> 3.0.3 (tested!)
MDTraj --> 1.9.1 (tested!)

The packages below are also necessary, but they are installed along with Anaconda:
yaml --> 0.1.7 (tested!)
matplotlib --> 2.1.2 (tested!)
Numpy --> 1.14.0 (tested!)
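A quick way to compare an environment against the tested versions listed in this comment could look like the sketch below. It is only an illustration, not part of BICePs: the helper names `installed_versions` and `compare_versions` are made up here, and it uses `importlib.metadata`, which needs Python 3.8+ (the thread itself targets Python 2.7). The `yaml --> 0.1.7` entry is left out because that version number looks like the conda `yaml` (libyaml) package rather than a pip-installable Python distribution; that reading is an assumption.

```python
from importlib import metadata

# Versions reported as tested in this thread.
TESTED_VERSIONS = {
    "pymbar": "3.0.3",
    "mdtraj": "1.9.1",
    "numpy": "1.14.0",
    "matplotlib": "2.1.2",
}


def installed_versions(names):
    """Return {name: version or None} for the given distribution names."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def compare_versions(tested, installed):
    """Label each package as 'ok', 'differs', or 'missing'."""
    report = {}
    for name, wanted in tested.items():
        have = installed.get(name)
        if have is None:
            report[name] = "missing"
        elif have == wanted:
            report[name] = "ok"
        else:
            report[name] = "differs"
    return report


if __name__ == "__main__":
    current = installed_versions(TESTED_VERSIONS)
    for name, status in compare_versions(TESTED_VERSIONS, current).items():
        print(name, TESTED_VERSIONS[name], current[name], status)
```

A "differs" result does not mean BICePs will fail, only that the installed version is not one of the versions tested above.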

robraddi commented 6 years ago

After running a few tests I've come to the conclusion that memory issues will only occur when "nsteps" is very large. I was able to get the memory tests to work properly. Memory tests suggest that there is very little memory usage up until sampling.
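For anyone who wants to repeat this kind of memory test, here is a minimal sketch of one way to measure peak memory as a function of the number of steps, using the standard-library `tracemalloc` module. It is not the test that was actually run here: `store_all_samples` is a hypothetical stand-in for a sampler that keeps one record per step, and `peak_memory_bytes` is a helper invented for this example.

```python
import tracemalloc


def store_all_samples(nsteps):
    """Hypothetical stand-in for a sampler that stores one record per step."""
    trajectory = []
    for step in range(nsteps):
        trajectory.append((step, float(step) * 0.5))
    return trajectory


def peak_memory_bytes(func, *args):
    """Run func(*args) and return the peak traced allocation in bytes."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


if __name__ == "__main__":
    for nsteps in (1_000, 10_000, 100_000):
        print(nsteps, peak_memory_bytes(store_all_samples, nsteps))
```

If the stored trajectory is the main consumer, the peak should grow roughly in proportion to the number of steps, which would match the observation that problems only show up when "nsteps" is very large.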

yunhuige commented 6 years ago

This has been a known issue for a long time, and we couldn't really fix it, since the more MCMC steps we run, the more likely we are to reach convergence. We need to figure out whether anything else contributes to this memory issue. We also need to try different ways to reduce the memory usage of each function. Worst case, we need to find a good balance between memory usage and convergence.
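One common way to trade memory against the number of steps is to keep only every `stride`-th sample and accumulate running statistics for the rest. The sketch below shows the idea on a toy one-dimensional Metropolis sampler. It is not the BICePs sampler: the target density, the function names, and the `stride` parameter are all assumptions made for illustration.

```python
import math
import random


def log_density(x):
    """Toy target: standard normal, up to an additive constant."""
    return -0.5 * x * x


def sample(nsteps, stride=100, step_size=1.0, seed=0):
    """Metropolis sampling that stores only every `stride`-th state."""
    rng = random.Random(seed)
    x = 0.0
    logp = log_density(x)
    kept = []        # thinned trajectory, about nsteps / stride entries
    total = 0.0      # running sum, so the mean still uses every step
    accepted = 0
    for step in range(nsteps):
        proposal = x + rng.uniform(-step_size, step_size)
        logp_new = log_density(proposal)
        # Metropolis criterion: accept with probability min(1, p_new / p_old).
        if logp_new >= logp or rng.random() < math.exp(logp_new - logp):
            x, logp = proposal, logp_new
            accepted += 1
        total += x
        if step % stride == 0:
            kept.append(x)
    return {
        "kept": kept,
        "mean": total / nsteps,
        "acceptance": accepted / nsteps,
    }


if __name__ == "__main__":
    result = sample(1_000_000, stride=1000)
    print(len(result["kept"]), result["mean"], result["acceptance"])
```

With this layout the stored trajectory grows with `nsteps / stride` instead of `nsteps`, so the run length can be increased for convergence without a matching increase in memory, at the cost of losing the discarded intermediate states.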

yunhuige commented 6 years ago

We have finished most of this list, and the remaining items are expected to be finished near the very end. So I'm going to move them out of this list and open separate issues for them.

yunhuige commented 6 years ago

Here is a summary of what we've done so far for these issues:

1/ The new scripts work smoothly, by which I mean that no errors interrupt the calculation. The test data set is cyclic beta-hairpin ligand 1 of MDM2, using the cs_H and cs_Ha experimental observables.

2/ Using the same data set as in 1/, after 1M steps of MCMC I got a BICePs score and populations very close to the ones I got with the old scripts. More tests may be needed in the future, but since we are going to reshape and organize the scripts again, I will pause my tests until we finish that part.

3/ will be finished later.

4/ Done! The MBAR scripts now take a new argument for the number of states, so we never have to hard-code it again.
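For readers unfamiliar with the pattern, passing the number of states on the command line instead of hard-coding it can be sketched with `argparse` as below. This is not the actual BICePs MBAR script: the flag name `--nstates` and the helper names are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser with a required number-of-states argument."""
    parser = argparse.ArgumentParser(
        description="Example: take the number of states as an argument."
    )
    parser.add_argument(
        "--nstates",
        type=int,
        required=True,
        help="number of conformational states (hypothetical flag name)",
    )
    return parser


def parse_nstates(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and validate the state count."""
    args = build_parser().parse_args(argv)
    if args.nstates < 1:
        raise ValueError("--nstates must be a positive integer")
    return args.nstates


if __name__ == "__main__":
    print("states:", parse_nstates())
```

Under these assumptions the script would be run as `python script.py --nstates 100`, and the same code then works for any data set without editing the source.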

5/ Done! See the test_MBAR folder for more details: https://github.com/vvoelz/nmr-biceps/tree/master/BICePs_2.0/test_MBAR

6/ save it for later

7/ Here is a link we can start with: https://readthedocs.org
We can use some of the templates there as references. We can do this at the end.

8/ Rob pointed this out and I think it is a very good idea! Again, we need to finish all of our work for Python 2.7 first and do 3.4 later.

9/ See my earlier comment. We may need to test more versions, though.

10/ I will work on that soon!

11/ This is optional for us, but it would be better to provide it for people.

12/ Either I or Rob will do that today.

13/ I will check which functions we can merge.