biocompute-objects / BCO_Documentation

Repository for documentation to support the IEEE 2791-2020 standard. Please see our home page for communications/publications:
http://biocomputeobject.org/
BSD 3-Clause "New" or "Revised" License

ECO - Evidence and Conclusion Ontology #33

Open HadleyKing opened 6 years ago

HadleyKing commented 6 years ago

The error_domain should use ECO to describe the results. This would mean updating the text as well as creating a good example. @rajamazumder and @openbox-bio are currently working on an example that may be a good test case.

Other suggestions/comments welcome. Currently this is only described as follows:

The empirical error subdomain contains the limits of detectability, false positives, false negatives, statistical confidence of outcomes, etc. This can be measured by running the algorithm on multiple data samples of the usability domain or on carefully designed in-silico spiked data. For example, a set of spiked, well-characterized samples can be run through the algorithm to determine the false positives, false negatives, and limits of detection.

The algorithmic subdomain describes errors that originate from fuzziness of the algorithms, driven by stochastic processes, in dynamically parallelized multi-threaded executions, or in machine learning methodologies where the state of the machine can affect the outcome. This can be measured in repeatability experiments of multiple runs or by using rigorous mathematical modeling of the accumulated errors. For example, bootstrapping is frequently used with stochastic simulation-based algorithms to accumulate sets of outcomes and estimate statistically significant variability for the results.
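To make the two subdomains concrete for a test case, here is a minimal Python sketch of how the numbers described above might be produced: false positive/negative rates from one spiked-sample run, and a bootstrap interval over repeated runs. Everything named here is an assumption for illustration only: the helper functions, the variant labels, the per-run scores, and the dictionary keys `empirical_error` and `algorithmic_error` (chosen to mirror the subdomain names in the text). No ECO term is attached, since this issue does not yet say which terms to use.

```python
import random


def confusion_counts(spiked, detected, universe):
    """Count TP/FP/FN/TN for one run on a spiked, well-characterized sample."""
    return {
        "tp": len(spiked & detected),             # spiked and reported
        "fp": len(detected - spiked),             # reported but never spiked
        "fn": len(spiked - detected),             # spiked but missed
        "tn": len(universe - spiked - detected),  # neither spiked nor reported
    }


def empirical_error(counts):
    """Turn confusion counts into false positive and false negative rates."""
    negatives = counts["fp"] + counts["tn"]
    positives = counts["fn"] + counts["tp"]
    return {
        "false_positive_rate": counts["fp"] / negatives if negatives else 0.0,
        "false_negative_rate": counts["fn"] / positives if positives else 0.0,
    }


def bootstrap_interval(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of repeated-run outcomes."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(values)
    means = sorted(
        sum(rng.choices(values, k=n)) / n for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Hypothetical spiked-sample run: 4 variants spiked in, 4 reported back.
universe = {"v1", "v2", "v3", "v4", "x1", "x2", "x3", "x4", "x5", "x6"}
spiked = {"v1", "v2", "v3", "v4"}
detected = {"v1", "v2", "v3", "x1"}

# Hypothetical outcome of the same stochastic pipeline over 8 repeated runs.
run_scores = [0.91, 0.93, 0.90, 0.94, 0.92, 0.89, 0.93, 0.91]

error_domain = {
    "empirical_error": empirical_error(
        confusion_counts(spiked, detected, universe)
    ),
    "algorithmic_error": {
        "bootstrap_95_interval": bootstrap_interval(run_scores),
    },
}
print(error_domain)
```

An example along these lines would give the values that an ECO annotation could then qualify, for instance by recording what kind of evidence each number rests on.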

Maybe we could incorporate ECO elsewhere too?