frictionlessdata / data-quality-dashboard

Data Quality Dashboards display statistics on a collection of published data.

Scoring algorithm #2

Closed pwalsh closed 8 years ago

pwalsh commented 9 years ago

As briefly discussed in #1, we need a scoring algorithm based on the results of data validation.

Can't be done without further discussion.

pwalsh commented 9 years ago

@rgrp @tryggvib

It would be good to address this properly (spec here, or user story) in this sprint (or towards the end, going into the next one).

In the results aggregator (https://github.com/okfn/spd-admin/blob/master/spd_admin/tasks.py#L97), my "algorithm" is simply: start at a score of 10, subtract 1 per error, down to a minimum of 0.
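For reference, a minimal sketch of that rule as described above. This is not the code from tasks.py; the function name `simple_score` and the `max_score` parameter are assumptions made for illustration.

```python
def simple_score(error_count, max_score=10):
    """Start at max_score and subtract one point per error, never going below 0.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return max(0, max_score - error_count)


# Example: three errors cost three points; many errors bottom out at zero.
few_errors = simple_score(3)
many_errors = simple_score(25)
```

The weakness is visible right away: any file with ten or more errors scores 0, no matter what kind of errors they are.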

Obviously we'll want something a bit smarter: something that perhaps treats structural errors differently from schema errors, or that handles cases where, for example, a single mistake is repeated over multiple rows.
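One possible shape for this, purely as a starting point for discussion. Everything here is an assumption: the error record layout (dicts with `type`, `code` and `column` keys), the weights, and the `repeat_penalty` value are all hypothetical and not taken from the validator's actual output.

```python
from collections import Counter

# Hypothetical weights: structural errors cost more than schema errors.
WEIGHTS = {"structure": 2.0, "schema": 1.0}


def weighted_score(errors, max_score=10.0, weights=WEIGHTS, repeat_penalty=0.1):
    """Score a list of error records, damping repeats of the same mistake.

    Each error is assumed to be a dict with "type", "code" and an optional
    "column" key. Errors sharing all three are treated as one repeated mistake.
    """
    groups = Counter((e["type"], e["code"], e.get("column")) for e in errors)
    penalty = 0.0
    for (error_type, _code, _column), count in groups.items():
        weight = weights.get(error_type, 1.0)
        # The first occurrence costs the full weight; each repeat costs
        # only a fraction of it.
        penalty += weight * (1 + repeat_penalty * (count - 1))
    return max(0.0, max_score - penalty)


# Example: the same schema error repeated on 50 rows.
repeated = [{"type": "schema", "code": "bad-date", "column": "date"}] * 50
repeated_score = weighted_score(repeated)
```

With these made-up numbers, 50 repeats of one schema mistake cost about 5.9 points rather than dropping the score straight to 0, while a single structural error costs 2.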

pwalsh commented 8 years ago

Moved to https://github.com/okfn/data-quality-cli/issues/5