codeforgermany / click_that_hood

A game where users must identify a city's neighborhoods as fast as possible
http://click-that-hood.com
MIT License
449 stars 638 forks

Requirements for cleanup script #283

Open k-nut opened 8 years ago

k-nut commented 8 years ago

As announced in #282, I am currently writing a validation/cleanup script to run whenever new data is submitted.

The goal is to make sure that the GeoJSON files do not contain unnecessary properties. My question now is which properties should be allowed. Most files seem to contain the following: [ 'name', 'created_at', 'updated_at', 'cartodb_id' ]. Should I check for those? Also, are created_at and updated_at mandatory?
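For illustration, the kind of check being asked about could look roughly like the Python sketch below. It is not the script from the branch mentioned in this thread. The allow-list is simply the four properties quoted above, and the function name `extra_properties` and the sample data are made up for this example.

```python
# Hypothetical allow-list, taken from the properties quoted in this comment.
ALLOWED = {"name", "created_at", "updated_at", "cartodb_id"}


def extra_properties(feature_collection, allowed=ALLOWED):
    """Return the sorted property names that are not in the allow-list."""
    found = set()
    for feature in feature_collection.get("features", []):
        # "properties" may be null in valid GeoJSON, so fall back to {}.
        found.update((feature.get("properties") or {}).keys())
    return sorted(found - set(allowed))


# Made-up sample data, not taken from the repository.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "Mitte", "cartodb_id": 1, "description": ""},
            "geometry": None,
        }
    ],
}

print(extra_properties(sample))
```

Because the function only reports properties that fall outside the allow-list, created_at and updated_at are treated as permitted but not required, which matches the open question of whether they are mandatory.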

mwichary commented 8 years ago

I don’t think so, but I think CartoDB outputs them by default.

mwichary commented 8 years ago

Note that there are also some multilingual files. Those have additional valid columns (the names of those columns are listed in the JSON file).

k-nut commented 8 years ago

I created a first version that does the validation. It reads all the GeoJSON files in the public/data directory and checks their metadata to see which language columns each file should contain. It then reads the GeoJSON itself and checks whether it contains any additional fields.
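The flow described above could be sketched in Python as follows. This is only an approximation under stated assumptions, not the code from the branch: it assumes each `<city>.geojson` has a sibling `<city>.metadata.json`, and that the metadata holds a hypothetical `languages` key mapping language codes to column names. The real metadata layout may differ.

```python
import json
from pathlib import Path

# Properties that appear in most files, as listed earlier in this thread.
BASE_ALLOWED = {"name", "created_at", "updated_at", "cartodb_id"}


def allowed_for(metadata):
    """Build the allow-list for one file from its metadata."""
    # ASSUMPTION: "languages" maps language code -> column name.
    # This key is hypothetical; adapt it to the real metadata format.
    columns = set((metadata.get("languages") or {}).values())
    return BASE_ALLOWED | columns


def validate_directory(data_dir):
    """Map each GeoJSON file name to its unexpected property names."""
    warnings = {}
    for geojson_path in sorted(Path(data_dir).glob("*.geojson")):
        # ASSUMPTION: metadata lives next to the data as <stem>.metadata.json.
        metadata_path = geojson_path.with_name(geojson_path.stem + ".metadata.json")
        metadata = {}
        if metadata_path.exists():
            metadata = json.loads(metadata_path.read_text(encoding="utf-8"))
        allowed = allowed_for(metadata)

        data = json.loads(geojson_path.read_text(encoding="utf-8"))
        extra = set()
        for feature in data.get("features", []):
            extra.update(set(feature.get("properties") or {}) - allowed)
        if extra:
            warnings[geojson_path.name] = sorted(extra)
    return warnings
```

Calling `validate_directory("public/data")` would return an empty dict when every file only contains permitted properties.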

I added this as a build step on Travis and imagine it running whenever someone sends a pull request. Ideally the script does not produce any warnings, but if a pull request introduces new unnecessary fields, we would be made aware of them.

An example of the output can be found here: https://travis-ci.org/k-nut/click_that_hood#L170

The code is in my branch.
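As a rough illustration of how such a build step might surface its findings, the Python sketch below prints one warning per file and returns an exit code. The `report` function and its `strict` flag are hypothetical. Whether unexpected fields should fail the build or only be logged is left open in this thread, so the sketch defaults to logging only.

```python
import sys


def report(warnings, strict=False, out=sys.stdout):
    """Print one line per offending file and return a process exit code.

    `warnings` maps a file name to a list of unexpected property names.
    With strict=False the build stays green and the warnings are only logged.
    """
    for filename, fields in sorted(warnings.items()):
        print(
            f"WARNING: {filename} has unexpected properties: {', '.join(fields)}",
            file=out,
        )
    return 1 if (strict and warnings) else 0
```

A CI entry point could then end with `sys.exit(report(found_warnings, strict=True))` if the maintainers decide that extra fields should block a pull request.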

What do you think of this?

k-nut commented 8 years ago

@mwichary any feedback on this? I still think that it could be useful if you would like to keep the file size small and not include any unnecessary fields. But I am also open to other ideas or you saying that you do not see a need for this.