matkoniecz closed this issue 3 years ago
I planned to write a script that will primarily read data, but from what I see, in all these cases reading data is either not done at all or is done using SPARQL.
https://github.com/maxlath/wikidata-scripting/tree/master/youtube_links - reads no data - it will probably either fail if a claim (possibly a conflicting one) already exists, or the existing claim will be overridden.
https://github.com/maxlath/wikidata-scripting/tree/master/import_arbitrary_data_to_wikibase - it seems that deduplication is supposed to happen before this script is run
https://github.com/maxlath/wikidata-scripting/tree/master/import_writers_pseudonymes_from_dbpedia - reading happens via SPARQL
yes, we currently lack some "smart patching" features, all the data control is indeed expected to be done ahead of time
This is not a problem for me - my question is rather whether there is a good way to get the value of a specific claim, and so on.
one possibility to get data in a script would be to use wb data, and then manipulate the result with something like jq:
# get all Q1 data
wb data Q1
# get all Q1 data in a simplified format
wb data Q1 --simplify
# get all Q1 P1424 claims data
wb data Q1#P1424
# get all Q1 P1424 claims data in a simplified format
wb data Q1#P1424 --simplify
# get the data for the claim identified by Q1-9741c622-fc42-4646-96ec-c594933d74c0
wb data Q1-9741c622-fc42-4646-96ec-c594933d74c0
# get the data for the claim identified by Q1-9741c622-fc42-4646-96ec-c594933d74c0 in a customized simplified format
wb data Q1-9741c622-fc42-4646-96ec-c594933d74c0 --simplify --keep ids,references,qualifiers,hashes,nontruthy
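for instance, to pull a single claim value out with jq (just a sketch: it assumes the full entity JSON follows the standard Wikibase statement layout, and that the simplified format reduces claims to plain value arrays):
# full JSON: dig into the statement structure to get the first P1424 value id
wb data Q1 | jq '.claims.P1424[0].mainsnak.datavalue.value.id'
# simplified JSON: claims reduced to arrays of values
wb data Q1 --simplify | jq '.claims.P1424'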
And capture stdout for processing? I guess my main worry is that I still need to manually parse text to handle errors.
you could either check the output
Q1_P31_claims_data=$(wd data Q1#P31)
if [[ "$Q1_P31_claims_data" != "" ]] ; then
echo "Q1 has a P31 claim"
else
echo "Q1 doesn't have a P31 claim"
fi
or the exit code, which will be 1 if no value can be found
wd data Q1#P31 > /dev/null && {
echo "Q1 has a P31 claim"
} || {
echo "Q1 doesn't have a P31 claim"
}
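and to come back to the original concern about adding a possibly conflicting claim, the exit code can be used as a guard (just a sketch: it assumes the exit code behaves as described above, and P31/Q5 are placeholder property/value):
# only add the claim when the entity doesn't already have a P31 claim
if ! wd data Q1#P31 > /dev/null 2>&1 ; then
  wd add-claim Q1 P31 Q5
fi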
OK, thanks for answering! I think that I will close it.
Thanks again!
I plan to make a small-scale crawling script (between 5 000 and 25 000 pages); is it a usable tool for that, or would you recommend an alternative?