atviriduomenys / manifest

Open data structure descriptions
GNU Affero General Public License v3.0

Automatically test if star rating is correct #24

Open aidiss opened 6 years ago

aidiss commented 6 years ago

PDFs should be ranked as 1 star, xls and xlsx as 2, and csv as 3. Can we check whether the RDF and SPARQL standards are used, to give 4 stars? Can links to other data be identified automatically, to give 5 stars?
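
The format-based part of this proposal could be sketched roughly as below. This is only an illustration, not code from the manifest repository: the names `DEFAULT_STARS` and `guess_stars` are hypothetical, and the sketch assumes the rating is guessed from the file extension alone. It covers levels 1 to 3 only, since detecting RDF/SPARQL usage or links to other data would need to inspect the content.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical defaults taken from the proposal in this comment;
# not part of the actual manifest code.
DEFAULT_STARS = {
    ".pdf": 1,   # on the web, but not machine-readable structured data
    ".xls": 2,   # structured, but proprietary format
    ".xlsx": 2,
    ".csv": 3,   # structured and non-proprietary
}


def guess_stars(filename: str) -> Optional[int]:
    """Return a default star rating for a file name, or None if unknown."""
    # Compare extensions case-insensitively, so "REPORT.PDF" is handled too.
    suffix = PurePosixPath(filename).suffix.lower()
    return DEFAULT_STARS.get(suffix)
```

A file name with an unlisted extension yields `None`, which leaves the decision to whoever submits the dataset.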

5 star rating, for quick reference:

  1. Available on the web (whatever format) but with an open licence, to be Open Data
  2. Available as machine-readable structured data (e.g. excel instead of image scan of a table)
  3. As (2) plus non-proprietary format (e.g. CSV instead of excel)
  4. All the above, plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff.
  5. All the above, plus: Link your data to other people’s data to provide context

sirex commented 6 years ago

You mean the script should not allow you to submit a CSV with 1 star, even if you are pretty sure that this CSV file should have 1 star?

Is it possible for a CSV to have 1 star? I think so: for example, a CSV file could have only one field containing unstructured data, which would make the whole file a 1-star dataset.

aidiss commented 6 years ago

It seems it would be too difficult to make strict rules. What about a soft warning? I guess the vast majority of submitted CSV files will have a 3-star rating.

sirex commented 6 years ago

GitHub and Travis CI do not support soft warnings. A better idea is probably to not specify stars at all: if the property is empty, it will be autodetected. But the script should also check whether the number of stars can be autodetected, and if not, complain that the stars property is required.
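
Roughly, the check could look like the sketch below. It is an assumption-laden illustration rather than the actual manifest validator: `resolve_stars`, `StarsRequiredError`, and `AUTODETECTED_STARS` are hypothetical names, and autodetection is assumed to work from the file extension only. An explicitly declared value always wins, so a CSV that really deserves 1 star can still be submitted as such.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical extension-to-stars table, used only when stars is left empty.
AUTODETECTED_STARS = {".pdf": 1, ".xls": 2, ".xlsx": 2, ".csv": 3}


class StarsRequiredError(ValueError):
    """Raised when stars is empty and cannot be autodetected."""


def resolve_stars(filename: str, declared: Optional[int] = None) -> int:
    """Return the declared rating, or autodetect it when none is given."""
    if declared is not None:
        # An explicit value overrides autodetection, but must be a valid level.
        if not 1 <= declared <= 5:
            raise ValueError(f"stars must be between 1 and 5, got {declared}")
        return declared

    suffix = PurePosixPath(filename).suffix.lower()
    detected = AUTODETECTED_STARS.get(suffix)
    if detected is None:
        # Fail the check (and therefore the CI build) instead of warning.
        raise StarsRequiredError(
            f"cannot autodetect stars for {filename!r}; "
            "the stars property is required"
        )
    return detected
```

Because the failure is an exception, a CI job running the check would fail hard, which fits the constraint that soft warnings are not available.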