Closed by kheal 3 months ago
Who is this assigned to? @kheal
We haven't assigned this yet, so it should not be on the current sprint board.
For documentation purposes: we are changing these class and slot names because the current names misrepresent what is happening. Metabolites are not being quantified, only identified.
PS are we really planning on putting records for metabolite identification in MongoDB? We don't do anything like that for genomic results. The data volume could be huge.
@SamuelPurvine and I have been talking about saving proteomics results somewhere outside of MongoDB, or at least removing some level of detail from the records, like the qualified lists of all possible peptide identifications.
More directly, we already save the proteomics results outside of MongoDB, in the Peptide_Report and Protein_Report tsv files that are data objects produced by the workflow. We originally planned to put proteomics results into Mongo because the thinking was that some to-be-developed NMDC aggregation tools would pull those results from the DB on the fly, to allow the user to "do some cool stuff in the portal".
A/the plan going forward is to pare the Analysis_activity results we load into MongoDB as JSON down to just the BestProteins identified in a workflow run/instance. That allows a mild refactoring of the aggregation table (removing the best_protein boolean, which really ought to be named is_best_protein since it answers a question) and an overhaul of the aggregation code so that it simply groups the functional annotations associated with the BestProteins for a given workflow instance and counts the number of BestProteins per functional annotation. This can also help drop the PeptideQuantification and ProteinQuantification classes and associated slots (who DOESN'T like dropping classes??).
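For anyone picturing the simplified aggregation, here is a minimal sketch of the counting step. It is not the actual aggregation code: the record layout and the field name functional_annotations are assumptions made up for illustration, and the input is assumed to already contain only the BestProteins for one workflow instance.

```python
from collections import Counter


def count_best_proteins_per_annotation(best_proteins):
    """Count how many BestProteins carry each functional annotation.

    `best_proteins` is a list of dicts for a single workflow instance.
    The "functional_annotations" key is a hypothetical field name.
    """
    counts = Counter()
    for protein in best_proteins:
        # set() so a protein listing the same annotation twice counts once
        for annotation in set(protein.get("functional_annotations", [])):
            counts[annotation] += 1
    return dict(counts)


# Made-up example records, not real NMDC data
example = [
    {"id": "protein_1", "functional_annotations": ["KEGG.ORTHOLOGY:K00001", "COG:COG1062"]},
    {"id": "protein_2", "functional_annotations": ["KEGG.ORTHOLOGY:K00001"]},
]
result = count_best_proteins_per_annotation(example)
```

With only BestProteins in the input there is nothing left to filter on, which is why the best_protein boolean becomes unnecessary in this version.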
has_metabolite_quantification slot on MetabolomicsAnalysis should be renamed to has_metabolite_identification
MetaboliteQuantification class should be renamed to MetaboliteIdentification
metabolite_quantified slot on MetaboliteQuantification class should be renamed to metabolite_identified
This will require a migration
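
As a rough idea of what the migration would need to do to each stored record, here is a sketch that applies the three renames listed above to a plain dict. It is not the actual migrator: the record layout, the "type" key, and the "nmdc:" prefix on type values are assumptions for illustration. Only the old and new slot and class names come from this issue.

```python
# Old -> new names, taken from the renames listed in this issue
SLOT_RENAMES = {
    "has_metabolite_quantification": "has_metabolite_identification",
    "metabolite_quantified": "metabolite_identified",
}
CLASS_RENAMES = {"MetaboliteQuantification": "MetaboliteIdentification"}

# Assumption: records name their class in a "type" field, e.g. "nmdc:SomeClass"
TYPE_KEY = "type"


def migrate_record(value):
    """Return a copy of `value` with the old slot and class names replaced."""
    if isinstance(value, dict):
        migrated = {}
        for key, item in value.items():
            if key == TYPE_KEY and isinstance(item, str):
                # Split off an optional prefix such as "nmdc:" before matching
                prefix, sep, name = item.rpartition(":")
                item = prefix + sep + CLASS_RENAMES.get(name, name)
            migrated[SLOT_RENAMES.get(key, key)] = migrate_record(item)
        return migrated
    if isinstance(value, list):
        return [migrate_record(item) for item in value]
    return value


# Made-up record for illustration only
old_record = {
    "id": "example:analysis-1",
    "type": "nmdc:MetabolomicsAnalysis",
    "has_metabolite_quantification": [
        {"type": "nmdc:MetaboliteQuantification", "metabolite_quantified": "example:metabolite-1"}
    ],
}
new_record = migrate_record(old_record)
```

The function builds new dicts and lists instead of editing in place, so the original record stays available for comparison or rollback while the migration is being checked.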