maxlath / wikibase-cli

read and edit a Wikibase instance from the command line
MIT License

How to create a duplicate of an item? #149

Open tuukka opened 3 years ago

tuukka commented 3 years ago

I'm trying to copy items from one wikibase to another. (I have copied properties manually first.)

To begin with something simpler, I tried to create a duplicate of an item within one wikibase:

wb create-entity -v -s "Test: Copy [[Item:Q1234]]" <(wb data Q1234 --simplify | jq '{type, labels, aliases, descriptions, claims, sitelinks}')

This fails with:

invalid monolingualtext value { property: 'P6', value: "example" }

Removing --simplify does not work either:

wb create-entity -v -s "Test: Copy [[Item:Q1234]]" <(wb data Q1234 | jq '{type, labels, aliases, descriptions, claims, sitelinks}')

This fails with:

missing snak value { property: 'P13', value: undefined }

Is this even feasible? My case is quite simple, but apparently Wikidata subsetting is a whole topic of its own: https://www.wikidata.org/wiki/Wikidata:WikiProject_Schemas/Subsetting

maxlath commented 3 years ago

did you have a look at wb generate-template?

wd generate-template Q4115189 --create-mode --json | wb create-entity -v -s "Test: Copy [[Item:Q1234]]"

tuukka commented 3 years ago

Yes, thank you! For new users, this could be mentioned here: https://github.com/maxlath/wikibase-cli/blob/master/docs/write_operations.md#wb-create-entity

I was able to finish the copying process, but I hit a few bumps even after finding generate-template:

  1. When copying within one Wikibase, creation may fail because of the label+description uniqueness constraint. As a workaround, I prepend "Copy of " to the labels:
     wb generate-template --create-mode --format json Q1234 | jq '. as $entry | .labels | map_values("Copy of " + .) as $labels | $entry | .labels = $labels' | wb create-entity -v -s "Test: Copy [[Item:Q1234]]"
  2. When copying to another Wikibase, an item being copied cannot refer to other items that haven't been copied themselves yet. To solve this, I first copy the entities without their claims, and then copy the claims as another step.
  3. If an item does not exist, generate-template with --create-mode returns "{}", which create-entity cannot handle.
  4. If an item does not exist, generate-template without --create-mode returns just the id, which edit-entity cannot handle.
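
The label-prefixing filter from item 1 can probably be shortened with jq's update-assignment operator (`|=`). This is a minimal sketch, assuming label values are plain strings (as the original filter already assumes) and that a `labels` key is present. The function name `prefix_labels` and the sample JSON are made up for illustration.

```shell
#!/usr/bin/env bash
# Prefix every label in a template read from stdin with "Copy of ".
# Assumption: .labels exists and maps language codes to plain strings.
prefix_labels() {
  jq -c '.labels |= map_values("Copy of " + .)'
}

# Hypothetical sample standing in for the output of:
#   wb generate-template --create-mode --format json Q1234
sample='{"type":"item","labels":{"en":"Helsinki","sv":"Helsingfors"},"descriptions":{"en":"capital of Finland"}}'
printf '%s\n' "$sample" | prefix_labels
```

In the pipeline from item 1, a filter like this would sit between `wb generate-template ...` and `wb create-entity ...`, in place of the longer jq expression.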

Here is what worked in the end. The first step copies the items without their claims:

for i in $(seq 1 100); do
  # Fetch the item as a creation template and drop its claims for now
  JSON="$(wd generate-template --create-mode --format json "Q$i" | jq 'del(.claims)')"
  if [ "$(echo "$JSON" | jq length)" != "0" ]; then
    wb create-entity -v -s "Copy from old wikibase" "$JSON" || break
  else
    # Empty template: the source item does not exist, so create a placeholder item instead
    wb create-entity -v -s "Copy from old wikibase" '{"labels": {"en": "(unused id)"}}' || break
  fi
done

The second step copies the claims onto the items:

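for i in $(seq 1 100); do
  # Without --create-mode the template keeps the item id, which edit-entity needs
  JSON="$(wd generate-template --format json "Q$i")"
  # A template holding only the id means the source item does not exist: skip it
  if [ "$(echo "$JSON" | jq length)" != "1" ]; then
    wb edit-entity -v -s "Copy from old wikibase" "$JSON" || break
  fi
done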
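
Both loops detect a missing item by counting top-level keys with `jq length`, using a different threshold in each pass. A small helper could make that check explicit for both template shapes. This is only a sketch: `template_status` is a hypothetical name, and it assumes a missing item yields either `{}` or an object containing nothing but `id`, as reported in this thread.

```shell
#!/usr/bin/env bash
# Classify a generate-template JSON document passed as the first argument.
# Prints "missing" when nothing is left after ignoring the id, else "present".
template_status() {
  printf '%s' "$1" |
    jq -r 'if (del(.id) | length) == 0 then "missing" else "present" end'
}

# The two "missing" shapes from the thread, then a template with content
template_status '{}'
template_status '{"id":"Q42"}'
template_status '{"id":"Q42","labels":{"en":"example"}}'
```

With a helper like this, each loop could test `[ "$(template_status "$JSON")" = "present" ]` instead of comparing the key count to 0 or 1.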

I copied the properties manually first, but similar steps could work for them, and then we'd have a simple way to subset Wikidata! :-)