Open · ChrisKeefe opened this issue 2 years ago
Target version information is critical when using pre-trained classifiers with scripts that rely on feature-classifier classify-sklearn. If the sklearn version has changed between the original analysis and the current version of feature-classifier, this old friend will drop in to raise trouble:
Plugin error from feature-classifier:
The scikit-learn version (0.23.1) used to generate this artifact does not match the current version of scikit-learn installed (0.24.1). Please retrain your classifier for your current deployment to prevent data-corruption errors.
Users need to be able to track down a usable pretrained classifier for their analysis, and a target-version report will help with that.
Scripts are currently produced targeting a specific QIIME 2 distribution. This distribution information should be reported whenever scripts are generated, and should be written directly to the script itself, or to a positively identifiable supplement.
Note that the same Results could be replayed multiple times, targeting different distributions, so the Result UUIDs are not useful for identification. Maybe this identity could be managed by assigning a run UUID to all outputs from a given run of provenance_lib?
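A rough sketch of what that could look like, assuming a per-run UUID is minted once and stamped into a comment header at the top of every generated script. The function name `build_script_header`, the header layout, and the distribution label are all hypothetical here and are not part of provenance_lib's current API:

```python
import uuid
from datetime import datetime, timezone


def build_script_header(target_distribution, run_uuid=None):
    """Return (run_uuid, header_text) for a generated replay script.

    Hypothetical helper: the name and header layout are illustrative only.
    """
    # One UUID per replay run; pass the same value for every output of that
    # run so all scripts and supplements can be tied back together.
    run_uuid = run_uuid or str(uuid.uuid4())
    generated = datetime.now(timezone.utc).isoformat(timespec="seconds")
    lines = [
        "# Replay script generated from QIIME 2 provenance",
        f"# Target distribution: {target_distribution}",
        f"# Replay run UUID: {run_uuid}",
        f"# Generated (UTC): {generated}",
    ]
    return run_uuid, "\n".join(lines) + "\n"


# The distribution label below is a placeholder, not a statement about which
# release ships which scikit-learn version.
run_id, header = build_script_header("qiime2-2021.4")
```

Because the run UUID is independent of the Result UUIDs, replaying the same Results against two distributions would yield two distinct, identifiable headers.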