internetarchive / openlibrary

One webpage for every book ever published!
https://openlibrary.org
GNU Affero General Public License v3.0

Integrate Wikidata #710

Open tfmorris opened 6 years ago

tfmorris commented 6 years ago

There's a ton of stuff that we could be leveraging Wikidata for in addition to just linking author records to Wikidata.

Etc, etc. Basically use Wikidata to bling it up without having to spend a lot of time/effort.

nichtich commented 6 years ago

The first step would be to link the corresponding Wikidata IDs to works, editions, authors, and subjects. Links can be added in Wikidata with two Wikidata properties:

To avoid synchronization headaches, it may make more sense to keep Wikidata as the master for these links and harvest them regularly (plus live lookups via SPARQL or the MediaWiki API). Nevertheless, OL should provide an editing interface for these links, one that edits Wikidata directly via OAuth.
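A rough sketch of the harvesting half of this idea: the snippet below builds a SPARQL query for items carrying the Open Library ID property (P648) and turns a standard SPARQL JSON result into a map from Open Library ID to Wikidata QID. The function names, the default `LIMIT`, and the sample response are illustrative assumptions, and the IDs in the sample are placeholders rather than real records. The actual HTTP request to a SPARQL endpoint is left out so the example runs offline.

```python
def build_ol_link_query(limit=1000):
    """Return a SPARQL query listing items that have an Open Library ID (P648)."""
    return (
        "SELECT ?item ?olid WHERE { ?item wdt:P648 ?olid . } "
        f"LIMIT {int(limit)}"
    )


def parse_ol_links(sparql_json):
    """Map each Open Library ID to its Wikidata QID from SPARQL JSON results."""
    links = {}
    for row in sparql_json["results"]["bindings"]:
        # Entity URIs end in the QID, e.g. ".../entity/Q123".
        qid = row["item"]["value"].rsplit("/", 1)[-1]
        links[row["olid"]["value"]] = qid
    return links


# Placeholder response in the SPARQL 1.1 JSON results layout (not real data).
sample_response = {
    "results": {
        "bindings": [
            {
                "item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q999999999"},
                "olid": {"type": "literal", "value": "OL0000000A"},
            }
        ]
    }
}

ol_links = parse_ol_links(sample_response)
print(ol_links)
```

Keeping the parsing separate from the fetch means the same function could serve both a scheduled harvest and a live lookup, if both paths end up being used.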

LeadSongDog commented 6 years ago

The second step could be to flesh out the identifier lists and classification lists in the OL edition records using harvests from Wikidata. This opens the door to finding other (non-IA) online-access copies.
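One way this merge might look, as a minimal sketch: harvested values are appended to an edition record and existing values are never overwritten. The Wikidata properties used are P212 (ISBN-13), P957 (ISBN-10), P243 (OCLC control number), and P1144 (LCCN). The Open Library field names, the shape of the `harvested` input, and all sample values are assumptions made for illustration.

```python
# Wikidata property -> Open Library edition field (field names are assumed).
PROPERTY_TO_FIELD = {
    "P212": "isbn_13",
    "P957": "isbn_10",
    "P243": "oclc_numbers",
    "P1144": "lccn",
}


def merge_identifiers(edition, harvested):
    """Return a copy of `edition` with harvested identifiers appended."""
    # Copy list values so the caller's record is not mutated.
    merged = {
        key: list(value) if isinstance(value, list) else value
        for key, value in edition.items()
    }
    for prop, values in harvested.items():
        field = PROPERTY_TO_FIELD.get(prop)
        if field is None:
            continue  # no mapping for this property, skip it
        existing = merged.setdefault(field, [])
        for value in values:
            # Strip hyphens from ISBNs so duplicates compare equal.
            normalized = value.replace("-", "") if field.startswith("isbn") else value
            if normalized not in existing:
                existing.append(normalized)
    return merged


# Placeholder records, not real bibliographic data.
edition = {"title": "Example", "isbn_13": ["9780000000002"]}
harvested = {"P212": ["978-0-00-000000-2"], "P243": ["12345"], "P999": ["x"]}

merged_edition = merge_identifiers(edition, harvested)
print(merged_edition)
```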

xayhewalo commented 5 years ago

I think the steps are a good start but should be delineated more granularly. @hornc Your insight would be valuable in this thread.

tfmorris commented 5 years ago

@nichtich I'm surprised Open Library subject got approved as a Wikidata property. I recommend we discourage its use, since OL Subjects are a mess and are going to change when we get around to either normalizing them or internationalizing them (or both). It has fewer than 600 uses now versus ~207,000 for the Open Library ID property.

@guyjeangilles I won't object if you want to break this into 5+ tickets, but that task could also be left until someone's ready to work on it, in the spirit of Agile's just-in-time planning.

One thing I left off the original list was harvesting author birth & death dates, profession, AKAs, etc., to help with disambiguation, plus photos for authors who don't have them.
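As a sketch of what that harvest could pull out, the function below reads a Wikidata entity in the JSON layout returned by the `wbgetentities` API and extracts date of birth (P569), date of death (P570), occupation (P106), aliases, and image (P18). The function names and output keys are hypothetical, the sample entity is a placeholder, and date precision is ignored for brevity.

```python
def first_value(entity, prop):
    """Return the raw value of the first statement for `prop`, or None."""
    for statement in entity.get("claims", {}).get(prop, []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") == "value":
            return snak["datavalue"]["value"]
    return None


def author_summary(entity, lang="en"):
    """Collect disambiguation fields from a Wikidata entity dict."""
    birth = first_value(entity, "P569")
    death = first_value(entity, "P570")
    occupations = [
        s["mainsnak"]["datavalue"]["value"]["id"]
        for s in entity.get("claims", {}).get("P106", [])
        if s.get("mainsnak", {}).get("snaktype") == "value"
    ]
    return {
        # Times look like "+1900-01-02T00:00:00Z"; keep only YYYY-MM-DD.
        "birth_date": birth["time"][1:11] if birth else None,
        "death_date": death["time"][1:11] if death else None,
        "occupation_qids": occupations,
        "aliases": [a["value"] for a in entity.get("aliases", {}).get(lang, [])],
        "image_file": first_value(entity, "P18"),  # Commons file name
    }


def _statement(value):
    """Build a minimal statement wrapper for the placeholder entity below."""
    return {"mainsnak": {"snaktype": "value", "datavalue": {"value": value}}}


# Placeholder entity, not a real author.
sample_entity = {
    "aliases": {"en": [{"language": "en", "value": "J. Doe"}]},
    "claims": {
        "P569": [_statement({"time": "+1900-01-02T00:00:00Z", "precision": 11})],
        "P106": [_statement({"entity-type": "item", "id": "Q36180"})],
        "P18": [_statement("Example author.jpg")],
    },
}

summary = author_summary(sample_entity)
print(summary)
```

Returning QIDs for occupations rather than labels keeps the sketch language-neutral; resolving them to display strings would be a separate lookup.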

xayhewalo commented 5 years ago

@tfmorris I'm not opposed to adopting Agile planning in the future, but considering our habit of leaving issues unattended for years, I'm assigning @hornc for the time being, per the Slack discussion.

RayBB commented 3 years ago

Adding Wikidata IDs to works is blocked by #1797.

RayBB commented 1 year ago

The first steps are awaiting review in #8236. I also have this handy wiki page with some ideas, and I've added yours there.

tfmorris commented 2 weeks ago

> The first steps are awaiting review in https://github.com/internetarchive/openlibrary/pull/8236

So apparently that got closed and replaced by #9130 without linking it to this issue so that people could comment.

> I also have this handy wiki page with some idea and I've added yours there.

Why use a wiki page instead of a series of sub-issues linked to this master issue (i.e., an epic) so that they can be commented on? There's also apparently another secret version hiding here: https://docs.google.com/document/d/1-xAija9Pfhtwc-wCAgERHBx6GvFv40Hb-nfHd9f6SPc/edit