vega / vega-datasets

Common repository for example datasets used by Vega-related projects

Correct errors in monarchs.json or document them in SOURCES.md? #595

Closed by dsmedia 1 month ago

dsmedia commented 2 months ago

There are a couple of factual errors in the (as yet unsourced) timeline in monarchs.json posted by @arvind in 2015. Since README.md notes that

"Datasets may contain intentional inconsistencies or errors to provide opportunities for data cleaning exercises and to illustrate common data quality issues."

should we leave the dataset as-is and reference the errors in SOURCES.md? Or should we correct the errors? Personally, I favor hosting accurate data as the default. In this case, it would be simple for an instructor to reintroduce errors in the dataset if needed for teaching purposes. Then again, correcting the errors may "break" existing examples that rely on the existence of errors.

[
-{"name":"Elizabeth","start":1565,"end":1603,"index":0},
+{"name":"Elizabeth","start":1558,"end":1603,"index":0},
{"name":"James I","start":1603,"end":1625,"index":1},
{"name":"Charles I","start":1625,"end":1649,"index":2},
{"name":"Cromwell","start":1649,"end":1660,"commonwealth":true,"index":3},
{"name":"Charles II","start":1660,"end":1685,"index":4},
{"name":"James II","start":1685,"end":1689,"index":5},
{"name":"W&M","start":1689,"end":1702,"index":6},
{"name":"Anne","start":1702,"end":1714,"index":7},
{"name":"George I","start":1714,"end":1727,"index":8},
{"name":"George II","start":1727,"end":1760,"index":9},
{"name":"George III","start":1760,"end":1820,"index":10},
-{"name":"George IV","start":1820,"end":1820,"index":11}
+{"name":"George IV","start":1820,"end":1830,"index":11}
]

domoritz commented 2 months ago

I think correcting this one is fine. We should check the example this is used for and how it compares to the original.
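
As a rough illustration of what such a check might look like, the sketch below runs a few structural sanity checks over the timeline: each reign should end after it starts, each reign should begin in the year the previous one ended, and `index` should match list position. The records are inlined from the diff above so the snippet is self-contained. The function name `find_timeline_issues` is hypothetical and not part of vega-datasets, and the contiguity rule is an assumption about how the timeline is meant to read.

```python
import json

# Corrected records, copied from the proposed diff (inlined for self-containment).
MONARCHS_JSON = """[
{"name":"Elizabeth","start":1558,"end":1603,"index":0},
{"name":"James I","start":1603,"end":1625,"index":1},
{"name":"Charles I","start":1625,"end":1649,"index":2},
{"name":"Cromwell","start":1649,"end":1660,"commonwealth":true,"index":3},
{"name":"Charles II","start":1660,"end":1685,"index":4},
{"name":"James II","start":1685,"end":1689,"index":5},
{"name":"W&M","start":1689,"end":1702,"index":6},
{"name":"Anne","start":1702,"end":1714,"index":7},
{"name":"George I","start":1714,"end":1727,"index":8},
{"name":"George II","start":1727,"end":1760,"index":9},
{"name":"George III","start":1760,"end":1820,"index":10},
{"name":"George IV","start":1820,"end":1830,"index":11}
]"""


def find_timeline_issues(records):
    """Hypothetical helper: list structural problems in a reign timeline."""
    issues = []
    for i, rec in enumerate(records):
        # A reign that ends in or before its start year is suspicious.
        if rec["end"] <= rec["start"]:
            issues.append(
                f'{rec["name"]}: end {rec["end"]} is not after start {rec["start"]}'
            )
        # The index field is expected to match the position in the list.
        if rec["index"] != i:
            issues.append(f'{rec["name"]}: index {rec["index"]} != position {i}')
        # Assumed rule: each reign starts in the year the previous one ended.
        if i > 0 and rec["start"] != records[i - 1]["end"]:
            issues.append(
                f'{rec["name"]}: start {rec["start"]} does not match '
                f'previous end {records[i - 1]["end"]}'
            )
    return issues


corrected = json.loads(MONARCHS_JSON)

# Rebuild the pre-fix data by reintroducing the two errors from the diff.
original = [dict(rec) for rec in corrected]
original[0]["start"] = 1565   # Elizabeth, as in the old file
original[-1]["end"] = 1820    # George IV, as in the old file

corrected_issues = find_timeline_issues(corrected)
original_issues = find_timeline_issues(original)

print("corrected:", corrected_issues)
print("original:", original_issues)
```

On the old data these checks flag only the zero-length George IV reign. The Elizabeth start year is internally consistent either way, so catching that kind of error still requires comparing against an external source. The same copy-and-modify step also shows how an instructor could reintroduce the errors for a data cleaning exercise.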