htrc / torchlite-app


Map widget specifications/details #49

Open jswatsch opened 1 year ago

jswatsch commented 1 year ago

This ticket is to more precisely define how the map widget will look and the data it will display.

Type of map: cluster map

We expect that for a large workset, many authors will have the same place of birth. For example, we might have 50 authors in a workset, all born in London. It's unlikely we have exact coordinates for where those authors were born, and instead we will be working with a generic geo coordinate for London.

For this reason, instead of mapping 50 individual points on the map (all stacked on top of one another), it makes more sense to cluster them together. The size of the cluster would represent author density, i.e. the bigger the cluster, the more authors in that workset were born in that same place.
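As a rough illustration of the clustering idea, here is a minimal sketch that groups authors sharing the same generic coordinate and sizes each cluster by author count. The `AuthorPlace` and `Cluster` shapes, the function name, and the `radiusPerAuthor` parameter are assumptions made for this example, not part of the Torchlite schema.

```typescript
// Hypothetical shape of one author record after the place-of-birth lookup.
interface AuthorPlace {
  viafId: string;
  name: string;
  lat: number;
  lon: number;
}

// One bubble on the map: a location plus how many authors share it.
interface Cluster {
  lat: number;
  lon: number;
  count: number;
  radius: number;
}

// Group authors by coordinate and derive a bubble radius from the count.
function clusterByPlace(authors: AuthorPlace[], radiusPerAuthor = 4): Cluster[] {
  const byPlace = new Map<string, Cluster>();
  for (const a of authors) {
    // Round so tiny floating-point differences in the same generic coordinate still match.
    const key = `${a.lat.toFixed(4)},${a.lon.toFixed(4)}`;
    const existing = byPlace.get(key);
    if (existing) {
      existing.count += 1;
    } else {
      byPlace.set(key, { lat: a.lat, lon: a.lon, count: 1, radius: 0 });
    }
  }
  const clusters = Array.from(byPlace.values());
  for (const c of clusters) {
    // Square root scaling keeps the circle's area proportional to the author count.
    c.radius = radiusPerAuthor * Math.sqrt(c.count);
  }
  return clusters;
}
```

With 50 authors at the London coordinate and one in Paris, this would yield two clusters, with the London bubble clearly larger. A library clustering plugin could replace this hand-rolled grouping if zoom-dependent clustering is wanted.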

Here are examples of that type of map:

Image

Image

Additionally, countries themselves can be color coded to represent author density as well. For example, if we have 100 authors from England vs. 10 authors from France, England would be a darker shade than France.
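The country shading could be sketched along these lines: count authors per country code, then map each count to a lightness value so that higher counts come out darker. The function names, the HSL hue, and the 90% to 30% lightness range are arbitrary choices for illustration; a D3 color scale would be the more usual tool in practice.

```typescript
// Count authors per country code; null means no country was found for that author.
function countByCountry(isoCodes: Array<string | null>): Map<string, number> {
  const counts = new Map<string, number>();
  for (const code of isoCodes) {
    if (code === null) continue; // authors with no known country are skipped
    counts.set(code, (counts.get(code) ?? 0) + 1);
  }
  return counts;
}

// Map a count to an HSL lightness: 90% (lightest) down to 30% (darkest).
function lightnessFor(count: number, maxCount: number): number {
  if (maxCount <= 0) return 90;
  const ratio = Math.min(count / maxCount, 1);
  return Math.round(90 - 60 * ratio);
}

// Build a CSS color string for a country's fill.
function shadeFor(count: number, maxCount: number): string {
  return `hsl(210, 70%, ${lightnessFor(count, maxCount)}%)`;
}
```

In the England/France example, a country with 100 of a maximum 100 authors gets the darkest fill, while one with 10 gets a much lighter one.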

Data specifications:

contributor id (VIAF link): This is the ID in the schema that can be used to connect to outside data sources like Wikidata to pull in information like place of birth.

contributor name: Name of the authors of a particular volume.

place of birth: This data will come from Wikidata, using the VIAF ID as the link. Based on @dkudeki's work with this data, it is often in the form of coordinates for a specific city or province. @dkudeki can be a resource on what type of data is available from Wikidata.

What happens if there is no author place of birth information available?

seniordev-ca commented 1 year ago
dkudeki commented 1 year ago

The example code can be found here: https://observablehq.com/d/e69a3c5185393caa. Specifically you want to look at the getCountryCounts function to see how to call Wikidata. There you can find the SPARQL query I used to get the country counts.

If you need a query that also gets you the longitude and latitude, try out this.

The function I used in D3 is looking for ISO codes, so that's what I'm querying for, but Wikidata has lots of country identifiers, so there's a good chance they'll have whatever you need. So don't feel the need to be tied to the ISO codes if there's something else you'd prefer to use.

I found that sending 50 VIAF IDs in a single query was something of a sweet spot: it keeps the total number of queries down while keeping each query small enough to respond quickly. I also set up a slight random stagger in when I send out the queries, to avoid hitting Wikidata all at once.
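A minimal sketch of the batching and stagger described above, for reference. This is not the code from the Observable notebook: `chunk`, `staggerDelays`, and `runStaggered` are hypothetical helpers, `runQuery` stands in for whatever function actually sends one batch to Wikidata, and the delay values are placeholders.

```typescript
// Split a list of VIAF IDs into batches of a fixed size.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("size must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// One start delay per batch: evenly spaced by baseMs, plus random jitter.
function staggerDelays(
  count: number,
  baseMs: number,
  jitterMs: number,
  rand: () => number = Math.random
): number[] {
  return Array.from({ length: count }, (_, i) => i * baseMs + Math.floor(rand() * jitterMs));
}

// Send every batch after its own delay and collect the results in batch order.
async function runStaggered<R>(
  viafIds: string[],
  runQuery: (batch: string[]) => Promise<R>, // assumption: caller supplies the Wikidata call
  batchSize = 50,
  baseMs = 100,
  jitterMs = 200
): Promise<R[]> {
  const batches = chunk(viafIds, batchSize);
  const delays = staggerDelays(batches.length, baseMs, jitterMs);
  return Promise.all(
    batches.map(async (batch, i) => {
      await new Promise<void>((resolve) => setTimeout(resolve, delays[i]));
      return runQuery(batch);
    })
  );
}
```

The batch size is a parameter so it can be retuned if a more complex query turns out to need smaller batches.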

dkudeki commented 1 year ago

I also have a few example worksets in the observable notebook in the worksetids array that might be useful as dummy data. They are listed in order of increasing size, from I think 5 to ~5,000 volumes.

jswatsch commented 1 year ago

@seniordev-ca we discussed your questions. Please let us know if you have additional questions!

Yes, you should include a time slider. The slider should narrow or widen the set of authors shown based on their birth dates. See the dummy data response for info about the date field for the data.

You can use the metadata dummy data that Cliff provided for the filtering/publication data, combined with the Wikidata data for authors' birth dates. @dkudeki is going to try to generate this data for you ASAP.

Yes, the map should zoom in and out, and Leaflet does make sense to use if possible. Here is an example of a D3/Leaflet viz: https://bost.ocks.org/mike/leaflet/
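For the time slider, the filtering step might look roughly like the sketch below. It assumes each author record carries a numeric `birthYear` (or `null` when unknown); the actual date field format is whatever the dummy data response defines, so `AuthorRecord`, `filterByBirthYear`, and the `includeUnknown` flag are illustrative only.

```typescript
// Hypothetical author record; birthYear is null when Wikidata has no date of birth.
interface AuthorRecord {
  viafId: string;
  name: string;
  birthYear: number | null;
}

// Keep only authors whose birth year falls inside the slider's [minYear, maxYear] range.
function filterByBirthYear(
  authors: AuthorRecord[],
  minYear: number,
  maxYear: number,
  includeUnknown = false // assumption: authors without a birth date are hidden by default
): AuthorRecord[] {
  return authors.filter((a) => {
    if (a.birthYear === null) return includeUnknown;
    return a.birthYear >= minYear && a.birthYear <= maxYear;
  });
}
```

The filtered list can then be fed into whatever builds the map markers, so moving the slider only changes the input to the rendering step.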

dkudeki commented 1 year ago

Just a note, I've updated the query to Wikidata to look like this. I've added date of birth information and made all three pieces of information we want optional, so, for example, we can still get coordinate data even if there is no ISO code to retrieve (as in the case where the place of birth is listed not as a city, but as a historical country). Also, I'm asking for whatever Wikidata considers the "best" value for place or date of birth to try to get around multiple-value cases, but not everything has a "best" value, so cases with multiple values for a single entry need to be accounted for. This also complicates the query somewhat, so it's possible 50 VIAF IDs per query is no longer the sweet spot for responsive queries. I'll need to run some testing on this updated query.
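One possible way to account for multiple values per entry on the client side is to collapse the result rows per VIAF ID, keeping the first value seen for each field. The "first value wins" policy, the `BirthRow` shape, and the `collapseRows` name are all assumptions for this sketch; the actual query's result columns may differ.

```typescript
// Hypothetical flattened result row; every field except viafId is optional,
// mirroring the optional pieces of information in the query.
interface BirthRow {
  viafId: string;
  isoCode?: string;
  lat?: number;
  lon?: number;
  birthDate?: string;
}

// Merge all rows for the same VIAF ID, keeping the first value seen per field.
function collapseRows(rows: BirthRow[]): Map<string, BirthRow> {
  const byId = new Map<string, BirthRow>();
  for (const row of rows) {
    const merged: BirthRow = byId.get(row.viafId) ?? { viafId: row.viafId };
    merged.isoCode = merged.isoCode ?? row.isoCode;
    // Take lat and lon together so a coordinate pair is never mixed across rows.
    if (merged.lat === undefined && row.lat !== undefined && row.lon !== undefined) {
      merged.lat = row.lat;
      merged.lon = row.lon;
    }
    merged.birthDate = merged.birthDate ?? row.birthDate;
    byId.set(row.viafId, merged);
  }
  return byId;
}
```

A different policy (for example, preferring the earliest date, or flagging conflicts for review) would slot into the same place.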

seniordev-ca commented 1 year ago

@jswatsch, where do I get the data to show on the map widget? I confirmed with @cwulfman that we have API endpoints for the map widget to get the data. Will all data (contributor id (VIAF link), contributor name, place of birth) come from this Torchlite API endpoint? Or do I need to fetch the Wikidata data using the VIAF link from the front end?

image
seniordev-ca commented 1 year ago

@dkudeki, how can I show the map widget at the city/province level rather than the country level?

image
dkudeki commented 1 year ago

The dummy data I sent includes coordinate data for each VIAF ID returned. I have not implemented adding city data, but tutorials like this suggest to me that the projection object used in the Choropleth function can also be used to translate the coordinate points onto the correct place on the map.
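To illustrate the idea of reusing a projection for points: a D3 projection is called with a `[longitude, latitude]` pair and returns pixel coordinates. The sketch below uses a hand-written equirectangular projection as a stand-in so it runs without D3; the `Projection` type, `equirectangular`, and `placeMarkers` are names invented for this example, and the real widget would pass the projection already used by the Choropleth function.

```typescript
// Same call shape as a D3 projection: [lon, lat] in, [x, y] pixels out.
type Projection = (lonLat: [number, number]) => [number, number];

// Stand-in projection (not D3): maps lon -180..180 to 0..width and lat 90..-90 to 0..height.
function equirectangular(width: number, height: number): Projection {
  return ([lon, lat]) => [((lon + 180) / 360) * width, ((90 - lat) / 180) * height];
}

// Hypothetical point record built from the coordinate data returned per VIAF ID.
interface GeoPoint {
  viafId: string;
  lat: number;
  lon: number;
}

// Convert each point to pixel coordinates using whichever projection the map uses.
function placeMarkers(points: GeoPoint[], project: Projection): Array<{ viafId: string; x: number; y: number }> {
  return points.map((p) => {
    // Note the argument order: longitude first, then latitude.
    const [x, y] = project([p.lon, p.lat]);
    return { viafId: p.viafId, x, y };
  });
}
```

Because the markers and the country shapes go through the same projection, the points should line up with the choropleth underneath.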

rdubnic2 commented 1 year ago

@seniordev-ca Just a note that Janet is out on family leave until August, so feel free to direct these questions to Deren or Boris directly, or to me if you're unsure who to contact. I am filling in as Interim OES director while Janet is away. Thank you!

seniordev-ca commented 1 year ago

@dkudeki, actually I don't think we can get the geoData for all cities.

So here is my idea: show the Choropleth map at the country level as it is now, and show markers/bubbles at the city level.

What I need, then, is a way to get city names or coordinates for contributors from the VIAF link.

image

What do you think? @borice @dkudeki