Living-with-machines / lwmdb

A django-based library for managing the Living with Machines newspapers metadata database schema
https://living-with-machines.github.io/lwmdb/
MIT License

Manually link the geographic data #23

Open mcollardanuy opened 2 years ago

mcollardanuy commented 2 years ago

Organise the manual linking of the following data:

kallewesterling commented 2 years ago
Documenting from Slack:

@mcollardanuy to team:

We’re getting closer to having the newspaper metadata databases ready, which as you know is key for so many of the planned outcomes of the project (basically everything that requires newspapers). You can follow the progress and discussions on Slack (newspapers_wp6) or on GitHub (https://github.com/Living-with-machines/lib_metadata_db). Also, for reference, I’m attaching a screenshot of the latest version of the database diagram. As you can see, we are adding information from Mitchell’s newspaper press directories and geographical information into the newspaper metadata database as well, which will allow us to filter or query newspaper articles by, for example, political leaning or newspapers published in certain count(r)ies. But we need some help with the linking. It’s not a lot, so we thought it was better to do it manually. If you have some time, could you go to this spreadsheet? https://docs.google.com/spreadsheets/d/1ZIzkhf_9bGqQTAvcnQXlHkctj8kPJZQYnsvyLkbh9bQ/edit#gid=1840345781 There are two tabs, but the most urgent is the first one, “Locations in metadata”, where we have collected the places of publication that we have extracted so far from our newspaper metadata (thanks Nilo). However, these places of publication are just strings, and are not really standardised. So we decided we would link them to Wikidata IDs. For example, “Aberdare, Mid Glamorgan, Wales” would be linked to “Q319369” (you can find the Wikidata ID using the Wikidata search box at https://www.wikidata.org/). It would be great if you had some time to help with linking these places to Wikidata IDs. Thank you!

Four hours later, finished.
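For reference, the lookup step described above (searching Wikidata for a place of publication and reading off its ID) could also be done programmatically. The following is only a minimal sketch and not how the linking was actually carried out, which was manual: it assumes the public Wikidata `wbsearchentities` API action, and the helper names (`primary_name`, `build_search_url`, `extract_candidates`, `search_wikidata`) and the User-Agent string are hypothetical. Any candidates it returns would still need a human check.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def primary_name(place):
    """Return the text before the first comma, e.g. the town in 'Town, County, Country'."""
    return place.split(",")[0].strip()


def build_search_url(term, limit=5):
    """Build a wbsearchentities request URL for an English-language item search."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": "en",
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


def extract_candidates(response):
    """Turn the decoded JSON response into (id, label, description) tuples."""
    return [
        (hit["id"], hit.get("label", ""), hit.get("description", ""))
        for hit in response.get("search", [])
    ]


def search_wikidata(place, limit=5):
    """Query Wikidata for candidate items matching the first part of a place string."""
    url = build_search_url(primary_name(place), limit)
    # Hypothetical User-Agent; Wikimedia asks API clients to identify themselves.
    request = Request(url, headers={"User-Agent": "lwmdb-linking-sketch/0.1"})
    with urlopen(request, timeout=10) as reply:
        return extract_candidates(json.load(reply))


if __name__ == "__main__":
    # Prints candidate items for manual review; requires network access.
    for candidate in search_wikidata("Aberdare, Mid Glamorgan, Wales"):
        print(candidate)
```

Searching on the first comma-separated component only is a design assumption: full strings such as “Aberdare, Mid Glamorgan, Wales” are unlikely to match a Wikidata label directly, and the remaining components are better used by a reviewer to choose between the returned candidates.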

@mcollardanuy to team:

Thank you all, I did not expect we’d have it completed so soon! Could anyone help with one more thing? Kalle has just given me a list of NLPs corresponding to newspapers for which we don’t always have place information in the metadata (see the new tab in the spreadsheet, “Unknown location”: https://docs.google.com/spreadsheets/d/1ZIzkhf_9bGqQTAvcnQXlHkctj8kPJZQYnsvyLkbh9bQ/edit#gid=1704548143). I have manually added the “Publication title” and “Publication place” columns, from the overview table in https://github.com/alan-turing-institute/Living-with-Machines/issues/2767, but could someone please double-check that it is all correct?

@npedrazzini checked, and resolved after 1.5h.

kallewesterling commented 2 years ago

@mcollardanuy Are there any bits of this that need handover? Or should we just consider this pretty much resolved?

mcollardanuy commented 2 years ago

Hi! The current admin county is still missing, but I don't know if the team still needs this in the metadata: this linking was paused because there were some discussions on what exactly to link. If the current admin county is not needed, then I think this issue could be closed, but @kmcdono2 will know more about it.

kallewesterling commented 2 years ago

I'll follow up with @kmcdono2! Thanks @mcollardanuy!