datasets / publicbodies

A database of public bodies such as government departments, ministries etc.
http://publicbodies.org
MIT License

Greece: data has duplicate rows #84

Closed: augusto-herrmann closed this issue 6 years ago

augusto-herrmann commented 6 years ago

Goodtables detected some duplicate rows in Greece's data:

$ goodtables data/gr.csv 
DATASET
=======
{'error-count': 11,
 'preset': 'nested',
 'table-count': 1,
 'time': 0.047,
 'valid': False}

TABLE [1]
=========
{'encoding': 'utf-8',
 'error-count': 11,
 'format': 'csv',
 'headers': ['id',
             'name',
             'abbreviation',
             'other_names',
             'description',
             'classification',
             'parent_id',
             'founding_date',
             'dissolution_date',
             'image',
             'url',
             'jurisdiction_code',
             'email',
             'address',
             'contact',
             'tags',
             'source_url'],
 'row-count': 962,
 'scheme': 'file',
 'source': 'data/gr.csv',
 'time': 0.046,
 'valid': False}
---------
[560,-] [duplicate-row] Row 560 is duplicated to row(s) 553
[571,-] [duplicate-row] Row 571 is duplicated to row(s) 562
[586,-] [duplicate-row] Row 586 is duplicated to row(s) 567
[596,-] [duplicate-row] Row 596 is duplicated to row(s) 562, 571
[599,-] [duplicate-row] Row 599 is duplicated to row(s) 575
[607,-] [duplicate-row] Row 607 is duplicated to row(s) 565
[625,-] [duplicate-row] Row 625 is duplicated to row(s) 575, 599
[632,-] [duplicate-row] Row 632 is duplicated to row(s) 565, 607
[634,-] [duplicate-row] Row 634 is duplicated to row(s) 562, 571, 596
[638,-] [duplicate-row] Row 638 is duplicated to row(s) 575, 599, 625
[715,-] [duplicate-row] Row 715 is duplicated to row(s) 714

@okfngr, or anyone else who can verify this, is it ok to delete the duplicated rows in the data?
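If deleting them turns out to be fine, a minimal sketch of the cleanup could look like the following. It uses only Python's standard `csv` module and keeps the first occurrence of each fully identical row. The function names `drop_duplicate_rows` and `dedupe_csv` are hypothetical, and the row numbering (header counted as row 1) is an assumption about how the report above counts rows. Unlike the report, it lists only the first matching row for each duplicate, not every earlier one.

```python
import csv


def drop_duplicate_rows(rows):
    """Return (unique_rows, duplicates), keeping the first occurrence of each row.

    `duplicates` is a list of (row_number, first_seen_row_number) pairs.
    Rows are numbered from 1, so a header row, if present, is row 1
    (assumed to match the numbering in the report above).
    """
    seen = {}        # row contents -> row number of first occurrence
    unique = []
    duplicates = []
    for number, row in enumerate(rows, start=1):
        key = tuple(row)  # lists are not hashable, tuples are
        if key in seen:
            duplicates.append((number, seen[key]))
        else:
            seen[key] = number
            unique.append(row)
    return unique, duplicates


def dedupe_csv(src_path, dst_path):
    """Copy src_path to dst_path without exact duplicate rows (hypothetical helper)."""
    with open(src_path, newline="", encoding="utf-8") as src:
        rows = list(csv.reader(src))
    unique, duplicates = drop_duplicate_rows(rows)
    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        # lineterminator="\n" is an assumption about the repository's line endings;
        # the csv module writes "\r\n" by default.
        csv.writer(dst, lineterminator="\n").writerows(unique)
    return duplicates
```

Calling `dedupe_csv("data/gr.csv", "data/gr.dedup.csv")` (the output path is just an example) returns the list of dropped rows, which can be compared against the report before replacing the original file. Note that rewriting through `csv.writer` may also normalize quoting, so the resulting diff is worth reviewing.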

augusto-herrmann commented 6 years ago

@todrobbins, I think it's safe to accept the PR, right?

todrobbins commented 6 years ago

Thanks for all the recent work, @augusto-herrmann and @rufuspollock! I need to check my GitHub notifications more often...

todrobbins commented 6 years ago

@augusto-herrmann, are we good to close this issue due to #89?

augusto-herrmann commented 6 years ago

Sure! I thought I had already done so, but it seems I forgot.