spase-group / spase-base-model

Specification for the SPASE base information model.
Apache License 2.0

Add Big Data terms to ProcessingLevel #2

Status: Closed (toddking closed 2 years ago)

toddking commented 3 years ago

Add terms to ProcessingLevel to support Big Data processes. Initial terms are:

AnalysisReady: Calibrated data that has been mapped onto a uniform space-time grid, with gaps, flagged values and out-of-range values replaced with appropriate values. (Similar to NASA Level 3)

MachineLearningTraining: A set of data which is used to train a model, for example to set or update the weights in the model.

MachineLearningValidation: A set of data which is used to determine the fit of a model, often used to reduce overfitting.

MachineLearningTest: A set of data that is only used to test the final output of the model in order to confirm its accuracy. This set of data is used to detect whether the model unintentionally peeked at the validation data, resulting in overfitting.
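To make the distinction between the three MachineLearning terms concrete, here is a minimal Python sketch that partitions a collection of records into disjoint subsets keyed by the proposed term names. The function name `split_records`, the split fractions, and the seed are illustrative assumptions and are not part of the proposal or of the SPASE model.

```python
import random

# Proposed ProcessingLevel terms from this issue, used here only as plain labels.
TRAINING = "MachineLearningTraining"
VALIDATION = "MachineLearningValidation"
TEST = "MachineLearningTest"


def split_records(records, train_frac=0.7, val_frac=0.15, seed=0):
    """Partition records into three disjoint subsets keyed by the proposed terms.

    The fractions and the seed are hypothetical defaults chosen for illustration.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    return {
        # Used to fit the model, e.g. to set or update its weights.
        TRAINING: shuffled[:n_train],
        # Used to check the fit while tuning, to reduce overfitting.
        VALIDATION: shuffled[n_train:n_train + n_val],
        # Held out until the model is final, so it cannot leak into tuning.
        TEST: shuffled[n_train + n_val:],
    }


splits = split_records(range(100))
```

Because the three subsets never share a record, a score on the `MachineLearningTest` subset stays independent of any choices made while looking at the validation subset.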

toddking commented 3 years ago

It might be more elegant to have "MachineLearning" be an enumeration with the members "Training", "Validation" and "Test". For example "MachineLearning.Test".
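As a rough sketch of what the dotted form could look like to a consumer of the vocabulary, the Python below builds the dotted names from a parent term and its members and checks values against them. The names `ML_SUBTERMS`, `dotted_terms`, `ALLOWED`, and `is_allowed` are hypothetical, and the allowed set contains only terms named in this issue, not the full ProcessingLevel list.

```python
# Hypothetical members of a "MachineLearning" enumeration, as suggested above.
ML_SUBTERMS = ("Training", "Validation", "Test")


def dotted_terms(parent, subterms):
    """Build dotted names such as 'MachineLearning.Test' from a parent term."""
    return [f"{parent}.{sub}" for sub in subterms]


# Only the terms discussed in this issue; an assumption for illustration.
ALLOWED = {"AnalysisReady", *dotted_terms("MachineLearning", ML_SUBTERMS)}


def is_allowed(value):
    """Return True if value is one of the terms in the illustrative set."""
    return value in ALLOWED


print(sorted(ALLOWED))
```

With this layout, adding a further machine-learning member means extending `ML_SUBTERMS` rather than introducing another top-level term.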

toddking commented 2 years ago

Completed in 2.4.0