GoldenCheetah / OpenData

A project to collect, collate and share an open data set with contributions from users of the GoldenCheetah application

Errors in data #14

Open AartGoossens opened 5 years ago

AartGoossens commented 5 years ago

I came across three kinds of invalid data:

  1. Filenames that do not match the yyyy_mm_dd_HH_MM_SS.csv format.
  2. Metadata files that are not valid JSON.
  3. Activity files for which there is no metadata.
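As a rough illustration, the three checks listed above could be sketched in Python along these lines. This is not the code used by goldencheetah-opendata; the helper names (`has_valid_filename`, `is_valid_json`, `activities_without_metadata`) are hypothetical, and how activity files are matched to metadata entries is an assumption, represented here simply as two collections of filenames.

```python
import json
import re
from datetime import datetime

# Expected activity filename layout: yyyy_mm_dd_HH_MM_SS.csv
FILENAME_RE = re.compile(r"\d{4}_\d{2}_\d{2}_\d{2}_\d{2}_\d{2}\.csv")


def has_valid_filename(name):
    """Return True if name matches yyyy_mm_dd_HH_MM_SS.csv and is a real date/time."""
    if not FILENAME_RE.fullmatch(name):
        return False
    try:
        # Rejects impossible values such as month 13 or day 32.
        datetime.strptime(name[:-len(".csv")], "%Y_%m_%d_%H_%M_%S")
    except ValueError:
        return False
    return True


def is_valid_json(text):
    """Return True if text parses as JSON."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


def activities_without_metadata(activity_names, described_names):
    """Return the activity filenames that have no metadata entry (hypothetical matching)."""
    return sorted(set(activity_names) - set(described_names))
```

A caller could run each check over the files of one athlete and collect the failures into a report instead of raising on the first problem.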

I attached a gzipped tarball (sorry, GitHub did not accept plain CSV files) to this issue with every occurrence of each of these errors.

Although I can (and probably will) add proper error handling to the Python library so that it does not stumble over these errors, I think it might be worth taking the time to fix them in the data itself, so that people who work with the data do not have to handle them.

invalid_data.tar.gz

AartGoossens commented 5 years ago

@liversedge I just published version 0.3.0 of goldencheetah-opendata, which handles these errors. Do you think it is worth looking into handling, preventing, or fixing these errors in GC before uploading? If so, I could take a look at it.

liversedge commented 5 years ago

Oh, definitely. If you want to look at it, do feel free, although it can probably wait until I go in there. I need to fix up loads of things, like collecting height, performance test indicators, and so on. I was also thinking about enhancing the dialog box to insist on correct gender, age, and weight data.