diegodlh / zotero-cita

Cita: a Wikidata addon for Zotero with citations metadata support
GNU General Public License v3.0
235 stars, 12 forks

Titles longer than 250 characters will always fail QID reconciliation #98

Open diegodlh opened 3 years ago

diegodlh commented 3 years ago

Some titles may be too long for Wikidata item labels (#97).

In these cases, querying the Wikidata reconciliation service with the full title fails (no QID is returned), because the Wikidata item label is a shortened version of the title.

@lightgivener asked if P1476 (title) could be searched as well. I tried adding {pid: "P1476", v: item.title} to the queryProps array, but it is only used to match against candidates already retrieved by label, so it does not bring in any new candidates.
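For reference, a minimal sketch of what such a query could look like in the batch format of the Reconciliation Service API. The names `ReconQuery`, `PropertyConstraint` and `buildReconBatch` are hypothetical and not taken from Cita's code; only the {pid, v} shape comes from the attempt described above.

```typescript
// Hypothetical types: a minimal subset of a reconciliation query.
interface PropertyConstraint {
  pid: string; // Wikidata property ID, e.g. "P1476" (title)
  v: string; // value to match against that property
}

interface ReconQuery {
  query: string; // text matched against item labels
  properties: PropertyConstraint[];
}

// Build a batch holding a single query under the key "q0".
// Assumption: the same title is sent both as the label query and as P1476.
function buildReconBatch(title: string): Record<string, ReconQuery> {
  return {
    q0: {
      query: title,
      properties: [{ pid: "P1476", v: title }],
    },
  };
}

const batch = buildReconBatch("Readers are not free-riders");

// Batches are sent JSON-encoded in a "queries" form parameter.
const body = "queries=" + encodeURIComponent(JSON.stringify(batch));
```

Because the property constraint only applies to candidates found by label, a batch like this still returns nothing when the label search itself comes back empty.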

An alternative related to #97 would be to use the Short Title field as query, if available.

Possibly related to #84 as well.

diegodlh commented 3 years ago

> @lightgivener asked if P1476 (title) could be searched as well

Actually, P1476 should be searched already, because the MediaWiki API's action=query&list=search (which the reconciliation service already uses) should search page content. However, P1476 seems to be ignored. I posted about this here.
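To check this behaviour by hand, a full-text search request against Wikidata could be built roughly as follows. This is only a sketch: `buildSearchUrl` is a hypothetical helper, and the exact parameters the reconciliation service sends are not shown in this thread, so `srlimit` and `format` are assumptions.

```typescript
// Build a MediaWiki full-text search URL (action=query&list=search).
// Hypothetical helper; srlimit and format=json are assumed defaults.
function buildSearchUrl(text: string, limit: number = 5): string {
  const params: Array<[string, string]> = [
    ["action", "query"],
    ["list", "search"],
    ["srsearch", text],
    ["srlimit", String(limit)],
    ["format", "json"],
  ];
  // Percent-encode each value so long titles with spaces stay valid.
  const qs = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `https://www.wikidata.org/w/api.php?${qs}`;
}

const searchUrl = buildSearchUrl("Readers are not free-riders");
```

Opening the resulting URL with a title that exceeds the label limit would show whether the item is found through its P1476 value.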

diegodlh commented 3 years ago

Posted an issue to the openrefine-wikibase repo: https://github.com/wetneb/openrefine-wikibase/issues/116

diegodlh commented 3 years ago

As a workaround, to minimize the chances of getting an unexpectedly empty results array from the reconciliation service, consider refusing to reconcile items whose title is longer than 250 characters and that have no alternative short title.
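A minimal sketch of that guard, combined with the Short Title fallback suggested earlier. `ItemLike`, `MAX_LABEL_LENGTH` and `reconciliationQuery` are hypothetical names, not Cita's actual API; the 250-character limit is the one from this issue's title.

```typescript
// Limit taken from this issue's title; the constant name is hypothetical.
const MAX_LABEL_LENGTH = 250;

// Hypothetical minimal view of a Zotero item.
interface ItemLike {
  title: string;
  shortTitle?: string;
}

// Return the text to send to the reconciliation service,
// or null if the item should not be reconciled at all.
function reconciliationQuery(item: ItemLike): string | null {
  if (item.title.length <= MAX_LABEL_LENGTH) {
    return item.title;
  }
  // Title too long: fall back to a non-empty short title that fits.
  const short = item.shortTitle?.trim();
  if (short && short.length <= MAX_LABEL_LENGTH) {
    return short;
  }
  return null;
}
```

Returning null lets the caller skip the item and tell the user why, instead of treating an empty results array as "no match on Wikidata".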

diegodlh commented 3 years ago

DOI 10.1145/1718918.1718942 (QID Q66711639) is a related example. Whether the item is added through the Zotero Connector or by DOI, Zotero saves the title as "Readers are not free-riders: reading as a form of participation on wikipedia". But the Crossref API gives the title as "Readers are not free-riders" and treats the rest as the "subtitle". The Wikidata item's label and title are both "Readers are not free-riders".
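One possible way to handle cases like this, which is not something proposed in this thread, would be to try the part of the title before the first colon as a second query. The sketch below assumes that subtitles are separated by ": "; `titleCandidates` is a hypothetical helper.

```typescript
// Return query candidates for a title: the full title first, then the
// main title without its subtitle, if one can be split off.
// Assumption: a subtitle is whatever follows the first ": ".
function titleCandidates(title: string): string[] {
  const full = title.trim();
  const candidates = [full];
  const sep = full.indexOf(": ");
  if (sep > 0) {
    const main = full.slice(0, sep).trim();
    if (main && main !== full) {
      candidates.push(main);
    }
  }
  return candidates;
}

const candidates = titleCandidates(
  "Readers are not free-riders: reading as a form of participation on wikipedia"
);
// candidates[1] is "Readers are not free-riders", the Wikidata label.
```

This is a heuristic only: titles that contain a colon for other reasons would produce a misleading second candidate, so any match on it would still need confirmation.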