everypolitician / compare_with_wikidata

Library for diffing Wikidata and CSVs
MIT License

Empty records in comparison #98

Open tmtmtmtm opened 7 years ago

tmtmtmtm commented 7 years ago

I created https://www.wikidata.org/wiki/User:Oravrattas/prompts/18th_Bundestag to compare the IDs of the members of the 18th Bundestag in Wikidata with those in EveryPolitician. For now I simply restricted the comparison to IDs, as we know that most of the other fields aren't well populated in Wikidata yet, so I wanted to at least get the list of members consistent before expanding to those.

However, two of the people found in the CSV, but not in SPARQL, are being shown with no ID/text/link:

[Screenshot, 2017-09-25 12:42:27]

chrismytton commented 7 years ago

@tmtmtmtm Looks like there are two people in the CSV you've used that don't have anything in the wikidata column. You can verify this with the following command:

curl -s 'https://raw.githubusercontent.com/everypolitician/everypolitician-data/master/data/Germany/Bundestag/term-18.csv' \
  | q -H -d, -O 'select * from - where wikidata = ""'
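
For anyone without `q` installed, the same check can be sketched in Python with only the standard library. This is a minimal illustration, not part of compare_with_wikidata: the function name `rows_missing` and the sample rows are made up, and only the `wikidata` column name comes from the command above (the `id` and `name` columns are assumed).

```python
import csv
import io


def rows_missing(csv_text, column="wikidata"):
    """Return the rows whose `column` is empty or whitespace-only."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if not (row.get(column) or "").strip()]


# Made-up sample data standing in for term-18.csv; the real file has
# more columns, and `id` / `name` are assumed here for illustration.
SAMPLE = """id,name,wikidata
a1,Person One,Q100
a2,Person Two,
a3,Person Three,Q300
"""

for row in rows_missing(SAMPLE):
    print(row["id"], row["name"])
```

To run it against the real data, the text downloaded by the `curl` command could be passed to `rows_missing` in place of `SAMPLE`.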

I'm not sure what we can do in these circumstances, since the output is technically correct. Any suggestions?
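
To illustrate why the output is technically correct, here is a rough Python sketch of an ID-only comparison. It is not the library's actual implementation; `only_in_csv` and the IDs are hypothetical. An empty `wikidata` value is just another string that never appears on the SPARQL side, so it ends up in the "CSV only" list with nothing to display.

```python
def only_in_csv(csv_ids, sparql_ids):
    """Return the CSV IDs that have no match on the SPARQL side, in order."""
    known = set(sparql_ids)
    return [item for item in csv_ids if item not in known]


# Hypothetical IDs: two CSV rows have an empty wikidata column.
csv_ids = ["Q100", "", "Q300", ""]
sparql_ids = ["Q100", "Q300"]

unmatched = only_in_csv(csv_ids, sparql_ids)
print(unmatched)

# The blank entries are the rows that render with no ID, text, or link.
blank = [item for item in unmatched if not item.strip()]
print(len(blank))
```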

tmtmtmtm commented 7 years ago

Hmm. Yes, that's an awkward one. I think this is going to be a fairly common case, though, so it would be good to do something when this happens. I can't really think of what, though!