kaiiam / mifc

A minimum information standard checklist formalizing the description of food composition data and related metadata.
MIT License

Add enumeration for `component_derivation_type` #10

Open kaiiam opened 3 weeks ago

kaiiam commented 3 weeks ago

The attribute measured_compound_derivation_type (can be renamed if necessary) is likely to be an important part of the MIFC standard, denoting whether a value is analytical, calculated, inferred/estimated, sourced from literature, labelled data, etc. As such, I think it should be a required field.

See the food_nutrient_derivation.csv file in the FDC SR Legacy CSV export for a longer list of possibilities (79). Some of those could probably be broken up, with additional supporting fields to capture other metadata. For example, we may want another source field (e.g. measured_compound_derivation_source) to say where the data came from when it was taken from the literature or an external source, perhaps a DOI or the name of an institution. Other metadata fields could record things like whether specific retention factors were used for concentration adjustments.

For the moment the enumeration could look something like the following:

analytical
calculated (could be broken down into calculated from other foods, from a linear regression, etc.)
literature sourced (this could also just be analytical or another type which is sourced from elsewhere)
inferred/estimated (from an ingredient list/recipe, the food's physical composition, another or similar food, etc.)
label claim
assumed zero
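
To make the draft list above concrete, here is a minimal Python sketch of it as an enumeration. The class name `ComponentDerivationType`, the helper `parse_derivation_type`, and the exact string values are assumptions for illustration only; the sub-types mentioned in parentheses are left out, and none of this is a settled part of the standard.

```python
from enum import Enum


class ComponentDerivationType(Enum):
    """Hypothetical value set mirroring the draft list in this issue."""

    ANALYTICAL = "analytical"
    CALCULATED = "calculated"
    LITERATURE_SOURCED = "literature sourced"
    INFERRED_ESTIMATED = "inferred/estimated"
    LABEL_CLAIM = "label claim"
    ASSUMED_ZERO = "assumed zero"


def parse_derivation_type(raw: str) -> ComponentDerivationType:
    """Map a raw string to a member, ignoring case and surrounding whitespace."""
    try:
        return ComponentDerivationType(raw.strip().lower())
    except ValueError:
        allowed = ", ".join(member.value for member in ComponentDerivationType)
        raise ValueError(
            f"unknown component_derivation_type {raw!r}; expected one of: {allowed}"
        ) from None
```

For example, `parse_derivation_type(" Label Claim ")` would return `ComponentDerivationType.LABEL_CLAIM`, while an unlisted value raises a `ValueError` that names the permitted values.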

Could also break it down to have a field for whether the measured_compound value is original or externally sourced. Not sure how that would fit with calculated/inferred values. Much more to be done here. Could use additional expertise.

kaiiam commented 3 weeks ago

Note: measured_compound_derivation_type has been renamed to component_derivation_type.

kaiiam commented 1 day ago

In order to accommodate food label data, component_derivation_type should probably be a required field, used to differentiate analytical data from label claims and from other types such as calculated values.
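
A rough sketch of what treating the field as required could look like when checking a record. The function names `check_component_record` and `is_label_data`, the dict-shaped record, and the permitted values (copied from the draft list earlier in this issue) are all assumptions, not an agreed design.

```python
from typing import Any, Mapping

# Assumed value set, taken from the draft enumeration in this issue.
ALLOWED_DERIVATION_TYPES = frozenset({
    "analytical",
    "calculated",
    "literature sourced",
    "inferred/estimated",
    "label claim",
    "assumed zero",
})


def check_component_record(record: Mapping[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    value = record.get("component_derivation_type")
    if value is None or value == "":
        return ["component_derivation_type is required"]
    if value not in ALLOWED_DERIVATION_TYPES:
        return [f"component_derivation_type {value!r} is not a permitted value"]
    return []


def is_label_data(record: Mapping[str, Any]) -> bool:
    """True when the value comes from a food label rather than analysis."""
    return record.get("component_derivation_type") == "label claim"
```

With this, a record missing the field is rejected outright, and downstream code can separate label claims from analytical values with a single check.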