qiime2 / provenance-lib

QIIME 2 Provenance Replay Tools
BSD 3-Clause "New" or "Revised" License

Support data integrity bug identification #86

Open ChrisKeefe opened 2 years ago

ChrisKeefe commented 2 years ago

The existence of a queryable DiGraph (ProvDAG.dag) gives us the power to identify bugs (software, system, or user-originated) at the analysis level. Providing centralized infrastructure where developers can register known-buggy situations could help improve the reliability of QIIME 2 Results, and reduce both user and developer time spent diagnosing problems.
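
A minimal sketch of what registering known-buggy situations could look like. Everything here is hypothetical and not part of provenance-lib: `register_bug`, `scan`, `BugReport`, the attribute names, and the plugin name and version are invented for illustration, and a plain dict of node attributes stands in for the node data on `ProvDAG.dag`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical registry: bug id -> check function.
# A check takes (node_id, attrs) and returns a message, or None if the node is fine.
_CHECKS: Dict[str, Callable[[str, dict], Optional[str]]] = {}


@dataclass
class BugReport:
    bug_id: str
    node_id: str
    message: str


def register_bug(bug_id: str):
    """Decorator developers could use to register a known-buggy situation."""
    def decorator(func):
        _CHECKS[bug_id] = func
        return func
    return decorator


@register_bug("example-0001")
def buggy_plugin_version(node_id: str, attrs: dict) -> Optional[str]:
    # Made-up plugin name and version, purely for illustration.
    if (attrs.get("plugin") == "example-plugin"
            and attrs.get("plugin_version") == "2021.1.0"):
        return "example-plugin 2021.1.0 is registered as producing unreliable output"
    return None


def scan(nodes: Dict[str, dict]) -> List[BugReport]:
    """Run every registered check against every node's attributes."""
    reports = []
    for node_id, attrs in nodes.items():
        for bug_id, check in _CHECKS.items():
            message = check(node_id, attrs)
            if message is not None:
                reports.append(BugReport(bug_id, node_id, message))
    return reports


# Stand-in for node data; real attribute names on ProvDAG.dag may differ.
nodes = {
    "uuid-1": {"plugin": "example-plugin", "plugin_version": "2021.1.0"},
    "uuid-2": {"plugin": "example-plugin", "plugin_version": "2021.2.0"},
}
reports = scan(nodes)
for report in reports:
    print(report.bug_id, report.node_id, report.message)
```

Keeping each check as a small function of one node's attributes means the same registry could be run either during parsing or over the finished graph.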

ChrisKeefe commented 2 years ago

It might be useful to think about bugs in terms of what scale of data we need to identify them. A bug might be identifiable

The two middle points seem the most likely, and are probably the most valuable to prioritize in development (e.g. should bug-checking occur while creating ProvNodes, or higher up, at the parser level?). The final point is probably best addressed by querying the full ProvDAG once parsing is finished, which might surface failures later than is convenient for users with large Result collections.
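
To illustrate why some bugs would need the full graph, here is a rough sketch of a check that only makes sense once parsing is finished: it looks for one action's output flowing, possibly through intermediate steps, into another action. The graph, the action names, and both helper functions are assumptions made for this example; a dict of edges stands in for `ProvDAG.dag`. On a real networkx DiGraph, `networkx.descendants` would replace the hand-written traversal.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

# Hypothetical provenance graph: node attributes plus parent -> children edges.
nodes = {
    "a1": {"action": "import"},
    "a2": {"action": "step-x"},
    "a3": {"action": "filter"},
    "a4": {"action": "step-y"},
    "b1": {"action": "step-y"},  # unrelated analysis, no step-x upstream
}
edges = {"a1": ["a2"], "a2": ["a3"], "a3": ["a4"]}


def descendants(edges: Dict[str, List[str]], source: str) -> Set[str]:
    """Breadth-first search for every node reachable from source."""
    seen: Set[str] = set()
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for child in edges.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def find_flows(nodes, edges, upstream: str, downstream: str) -> List[Tuple[str, str]]:
    """Return (upstream_node, downstream_node) pairs connected by any path."""
    pairs = []
    for node_id, attrs in nodes.items():
        if attrs.get("action") != upstream:
            continue
        for desc in sorted(descendants(edges, node_id)):
            if nodes[desc].get("action") == downstream:
                pairs.append((node_id, desc))
    return pairs


pairs = find_flows(nodes, edges, "step-x", "step-y")
print(pairs)  # [('a2', 'a4')]
```

A node-level check run while creating each node could not catch this case, since neither node is suspicious on its own. The cost is the one noted above: nothing is reported until the whole collection has been parsed.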