ModelSEED / ProbModelSEED

Add probabilistic annotation algorithm #23

Open mmundy42 opened 9 years ago

mmundy42 commented 9 years ago

Here's the design:

  1. Add a ProbAnnotationWorker class that implements the algorithm. The class has methods that correspond to the steps in the algorithm as documented in the probabilistic annotation paper. Temporary files are stored in a separate job directory.
  2. Add a ProbAnnotationParser class to access the static database files used by the algorithm. The static database files can either be downloaded from Shock or preloaded on the system. Note that creating the static database files is still done by the probabilistic_annotation service.
  3. Add a ms-probanno script that retrieves the genome from the workspace, runs the algorithm, and stores the rxnprobs in the workspace.
  4. Add a probanno parameter to the ModelReconstruction() method. When the probanno parameter is non-zero, store the genome in the model folder so it can be used as input to the probabilistic annotation, then run the ms-probanno script.
  5. Update the FBAModel object to include a rxnprobs_ref attribute. When the ms-probanno script is successful, the rxnprobs_ref is set to the rxnprobs object stored in the model folder.
  6. Update the GapfillModel() method to pay attention to the probanno parameter. When the probanno parameter is non-zero, the rxnprobs are passed along to the MFAToolkit by building an objective coefficient file. This is a direct port of the previous code.
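
To illustrate step 6, here is a minimal sketch of how reaction probabilities might be turned into gapfill objective coefficients. It is not the ported code: the function names, the `1 - probability` cost rule, the default cost of 1.0, and the tab-separated layout are all assumptions made for illustration, and the actual file format expected by the MFAToolkit may differ.

```python
from typing import Dict, Iterable


def objective_coefficients(rxnprobs: Dict[str, float],
                           candidate_reactions: Iterable[str],
                           probanno: int = 0) -> Dict[str, float]:
    """Return a {reaction_id: cost} map for gapfilling (hypothetical helper).

    Assumption: when probanno is non-zero, a reaction with probability p
    gets cost 1 - p, so likely reactions are cheaper to add. Reactions
    without a probability, or all reactions when probanno is zero, keep
    the default cost of 1.0.
    """
    coefficients = {}
    for rxn_id in candidate_reactions:
        if probanno and rxn_id in rxnprobs:
            p = rxnprobs[rxn_id]
            if not 0.0 <= p <= 1.0:
                raise ValueError(f"probability out of range for {rxn_id}: {p}")
            coefficients[rxn_id] = 1.0 - p
        else:
            coefficients[rxn_id] = 1.0
    return coefficients


def format_coefficient_file(coefficients: Dict[str, float]) -> str:
    """Render the map as tab-separated lines (hypothetical layout)."""
    return "".join(
        f"{rxn_id}\t{cost:.4f}\n"
        for rxn_id, cost in sorted(coefficients.items())
    )
```

With `probanno=0` every candidate reaction keeps the same cost, which matches the goal that default behavior stays unchanged.
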
mmundy42 commented 9 years ago

Created pull request #27, which contains the first pass at adding the probabilistic annotation algorithm. I also updated the design above. By default, this first pass should not change how the existing methods currently work. I still need to push the static database files to the p3 Shock server and test the automatic download of the files.