Open MaximMoinat opened 2 years ago
In this tool we assume that all data coming from CatalogueExport is correct. If it is not, shouldn't it be corrected where the data is generated, in CatalogueExport?
We could indeed do a clean-up in the CatalogueExport R script. However, at that point we do not know which visualisations will be made or which outliers would cause problems. So ideally, each (new) visualisation defines the ranges in which it expects its data. These checks would then have to be implemented on the NetworkDashboard side.
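To make the idea of per-visualisation expectations concrete, here is a rough sketch in Python. All names (`ExpectedRange`, `VISUALISATION_RANGES`, the visualisation keys) are hypothetical and not taken from the NetworkDashboard code base; the point is only that each visualisation could declare its own bounds in one place.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ExpectedRange:
    """Bounds a visualisation expects; None means 'unbounded' on that side."""
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def contains(self, value: float) -> bool:
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        return True


# Hypothetical registry: one entry per visualisation (keys are made up).
VISUALISATION_RANGES: Dict[str, ExpectedRange] = {
    "cumulative_observation_time": ExpectedRange(minimum=0),
    "percentage_of_persons": ExpectedRange(minimum=0, maximum=100),
}


def out_of_range(visualisation: str, values: List[float]) -> List[float]:
    """Return the values that fall outside what the visualisation expects."""
    expected = VISUALISATION_RANGES[visualisation]
    return [v for v in values if not expected.contains(v)]
```

A new visualisation would then only need to add one entry to the registry, and the upload step could call `out_of_range` for each visualisation it feeds.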
Your solution as proposed in issue #232 would also work here.
We could then move these data checks to CatalogueExport and raise warnings when the data is generated. Still, it would be really nice if the NetworkDashboard also gave a warning or error when unexpected data is uploaded.
We are seeing some outliers in the DatabaseCatalogue due to data quality issues, for example a negative Cumulative Observation Time (Observation Period). This makes the data visualisations hard to interpret.
One way to improve this would be to add checks on the file imported for the database dashboard. If, for example, negative times are found, the user should get a message back. Some checks would only trigger a warning, while others would cause the upload to be rejected.
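A minimal sketch of what such an upload check could look like, assuming the uploaded file is a CSV. The column names (`analysis`, `value`), the `Check` and `UploadReport` classes, and the choice of which check is a warning and which is an error are all assumptions made for illustration, not the actual CatalogueExport format or NetworkDashboard code.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Callable, List, Optional


def as_float(row: dict, key: str) -> Optional[float]:
    """Parse a column as float; return None if it is missing or not numeric."""
    try:
        return float(row[key])
    except (KeyError, TypeError, ValueError):
        return None


@dataclass
class Check:
    name: str
    is_problem: Callable[[dict], bool]  # True when the row fails the check
    severity: str                       # "warning" or "error"


@dataclass
class UploadReport:
    warnings: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)

    @property
    def accepted(self) -> bool:
        # Warnings are reported back, but only errors reject the upload.
        return not self.errors


# Hypothetical checks on a hypothetical "value" column.
DEFAULT_CHECKS = [
    Check("negative value",
          lambda r: (as_float(r, "value") or 0) < 0, "error"),
    Check("non-numeric value",
          lambda r: as_float(r, "value") is None, "warning"),
]


def validate_upload(csv_text: str, checks: List[Check]) -> UploadReport:
    report = UploadReport()
    reader = csv.DictReader(io.StringIO(csv_text))
    # The header is line 1, so the first data row is line 2.
    for line_no, row in enumerate(reader, start=2):
        for check in checks:
            if check.is_problem(row):
                message = f"line {line_no}: {check.name}"
                if check.severity == "error":
                    report.errors.append(message)
                else:
                    report.warnings.append(message)
    return report
```

The upload view could then show `report.warnings` to the user in any case, and refuse to store the file when `report.accepted` is false.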
Note: data quality checks are also being done by the DQD. We should not try to redesign that, but only focus on data issues that would give problems with the Dashboard visualisations.