augusto-herrmann closed 6 years ago
Removing duplicated rows detected by goodtables.
Now the file validates.
$ goodtables data/gr.csv
DATASET
=======
{'error-count': 0, 'preset': 'nested', 'table-count': 1, 'time': 0.08, 'valid': True}

TABLE [1]
=========
{'encoding': 'utf-8',
 'error-count': 0,
 'format': 'csv',
 'headers': ['id', 'name', 'abbreviation', 'other_names', 'description',
             'classification', 'parent_id', 'founding_date', 'dissolution_date',
             'image', 'url', 'jurisdiction_code', 'email', 'address', 'contact',
             'tags', 'source_url'],
 'row-count': 951,
 'scheme': 'file',
 'source': 'data/gr.csv',
 'time': 0.057,
 'valid': True}
I don't see any reason to keep duplicated rows in the dataset. If they are different entities in any way, there should be a column to make the difference explicit instead of keeping two records that are exactly the same.
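For anyone who wants to double-check for exact duplicates without goodtables, here is a minimal sketch using only the Python standard library. The function name `find_duplicate_rows` and the two-column sample data are hypothetical and only for illustration; this is not how goodtables implements its check. It assumes the first row is a header and treats two rows as duplicates only when every field matches exactly.

```python
import csv
import io
from collections import defaultdict


def find_duplicate_rows(csv_file):
    """Map each repeated row to the row numbers where it appears.

    Hypothetical helper: the header counts as row 1, so the first
    data row is row 2.
    """
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row
    seen = defaultdict(list)
    for row_number, row in enumerate(reader, start=2):
        seen[tuple(row)].append(row_number)
    # Keep only rows that occur more than once.
    return {row: numbers for row, numbers in seen.items() if len(numbers) > 1}


# Made-up sample data, not taken from data/gr.csv.
sample = io.StringIO("id,name\n1,Alpha\n2,Beta\n1,Alpha\n")
duplicates = find_duplicate_rows(sample)
print(duplicates)  # {('1', 'Alpha'): [2, 4]}
```

To run it against the real file, something like `open('data/gr.csv', newline='', encoding='utf-8')` could be passed in place of `sample`.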
OK, I'm merging.