TripalCultivate / TripalCultivate-Phenotypes

Provides generic support for large scale phenotypic data and traits with importers, content pages and visualizations.
GNU General Public License v3.0

G4.58 - Phenotypes Share Values Validator #58

Open reynoldtan opened 7 months ago

reynoldtan commented 7 months ago

Branch

g4.58-Phenoshare-Values-Validator

Groups

Group 4 - API | Services | Plugins

Describe

This issue focuses on validating the values in a data file submitted to the Tripal Cultivate Phenotypes Share Importer.

There is a Plugin Type for validators which was created as part of #37. Each type of validation we want to do is an instance of this plugin. A scope is assigned to each validator instance to indicate which part of the file it validates and where it falls in the validation order.
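To make the scope idea concrete, here is a rough sketch of how a scope could decide the order validators run in. It is written in Python purely for illustration and is not the plugin API from #37. The `Validator` class, the `SCOPE_ORDER` list and the "FILE" and "HEADERS" scope names are all hypothetical; only "SHARE IMPORT VALUES" comes from this issue.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical scope order. Only "SHARE IMPORT VALUES" appears in this issue;
# the other two names are placeholders for earlier validation stages.
SCOPE_ORDER = ["FILE", "HEADERS", "SHARE IMPORT VALUES"]


@dataclass
class Validator:
    """Stand-in for one validator plugin instance (not the real plugin class)."""
    id: str
    scope: str
    check: Callable[[dict], bool]  # returns True when the input passes


def order_by_scope(validators: List[Validator]) -> List[Validator]:
    """Sort validators by the position of their scope in SCOPE_ORDER."""
    return sorted(validators, key=lambda v: SCOPE_ORDER.index(v.scope))


validators = [
    Validator("values", "SHARE IMPORT VALUES", lambda ctx: True),
    Validator("file", "FILE", lambda ctx: True),
    Validator("headers", "HEADERS", lambda ctx: True),
]

# File-level checks run first, value-level checks last.
ordered_ids = [v.id for v in order_by_scope(validators)]
```

With an arrangement like this, adding a new kind of validation only means registering another instance with the right scope; the ordering logic stays untouched.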

This issue is to design validator instances focused on validating the new phenotypes file format. Reynold's first attempt at this validated all values/columns in the file in a single validator. However, we would like to move to a different model.

Design

Create an instance of the validator plugin: PhenoShareImportValues

 * @TripalCultivatePhenotypesValidator(
 *   id = "trpcultivate_phenotypes_validator_share_values",
 *   validator_name = @Translation("Phenotypes Share Importer Values Validator"),
 *   validator_scope = "SHARE IMPORT VALUES",
 * )

The plugin will validate the following:

  1. Trait Name - The trait name must exist in the cv configured for traits in the selected genus. This column is required.
  2. Method Name - The method name must exist in the cv configured for methods in the selected genus and must be one of the methods paired with the trait name. This column is required.
  3. Unit - The unit must exist in the cv configured for units in the selected genus. This column is required.
  4. Germplasm Accession - Must exist in the chado.stock table. This column is required.
  5. Germplasm Name - Must exist in the chado.stock table. This column is required.
  6. Year - A four-digit value that must not exceed the current year. This column is required.
  7. Replicate - An integer value. This column is required.
  8. Value - Use the data type of the unit to determine the data type for this column. A quantitative unit requires a numeric value, whereas a qualitative unit requires descriptive text. This column is required.
  9. Data Collector - A string: the name of a person, institute or organization. This column is required.
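The checks in the list that do not need a database lookup (Year, Replicate and Value) could look roughly like the sketch below. This is Python for illustration only, not the plugin code. The function names and the "quantitative"/"qualitative" labels are hypothetical, and treating Replicate as a non-negative whole number is an assumption, since the list only says "an integer value".

```python
import math
import re
from datetime import date
from typing import Optional


def is_valid_year(value: str, current_year: Optional[int] = None) -> bool:
    """Exactly four digits and not later than the current year."""
    if current_year is None:
        current_year = date.today().year
    return bool(re.fullmatch(r"[0-9]{4}", value)) and int(value) <= current_year


def is_valid_replicate(value: str) -> bool:
    """Assumption: a replicate is a non-negative whole number."""
    return bool(re.fullmatch(r"[0-9]+", value))


def is_valid_value(value: str, unit_type: str) -> bool:
    """Check the value against the unit's data type.

    unit_type is a hypothetical label: "quantitative" or "qualitative".
    """
    if value.strip() == "":
        return False  # the column is required
    if unit_type == "quantitative":
        try:
            number = float(value)
        except ValueError:
            return False
        return math.isfinite(number)  # reject "nan" and "inf"
    return True  # qualitative: any non-empty descriptive text
```

The cv and chado.stock checks are left out on purpose because they depend on how the module queries Chado, which is not described in this issue.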

laceysanderson commented 4 months ago

Here are the files from Reynold's first attempt at this:

These have been removed from the linked PR to allow it to be merged. We are going in another direction here, with a row-level validator rather than a helper and each validator looping through the entire file. The columns are also changing, which changes this validation in particular.
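As a rough illustration of the row-level direction, the file can be read once and every row-level check applied to each row as it is read. The sketch below is Python for illustration only. The tab delimiter, the column names and the two checks are assumptions rather than the actual file format or plugin code.

```python
import csv
import io


def validate_rows(lines, row_validators):
    """Read the file once and run every row-level validator on each row.

    Returns a list of (line_number, validator_name) tuples for failed checks.
    """
    failures = []
    reader = csv.DictReader(lines, delimiter="\t")  # assumption: tab-delimited
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for name, check in row_validators.items():
            if not check(row):
                failures.append((line_no, name))
    return failures


# Hypothetical row-level checks keyed by a descriptive name.
row_validators = {
    "year_is_four_digits": lambda row: len(row["Year"]) == 4 and row["Year"].isdigit(),
    "replicate_is_integer": lambda row: row["Replicate"].isdigit(),
}

sample = io.StringIO("Year\tReplicate\n2020\t1\n20x0\t2\n2021\tA\n")
failures = validate_rows(sample, row_validators)
```

In this arrangement no validator loops over the file itself; each one only sees a single row, and the single loop belongs to the caller.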

That said, there is important code in these files that will be useful, so I am attaching them here for later use.

laceysanderson commented 1 month ago

Note: The validator plugin API was set up in this PR: https://github.com/TripalCultivate/TripalCultivate-Phenotypes/issues/37 and the following PRs are examples of simple validator plugin instances: https://github.com/TripalCultivate/TripalCultivate-Phenotypes/issues/48, https://github.com/TripalCultivate/TripalCultivate-Phenotypes/issues/47, https://github.com/TripalCultivate/TripalCultivate-Phenotypes/issues/41.