HeardLibrary / vandycite


Figure out how to capture photograph data for 3D objects and what to do with it #19

Closed baskaufs closed 2 years ago

baskaufs commented 2 years ago

We probably need to hold off on uploading 3D artworks until several problems are addressed.

One problem is that the web scraping missed the second table, which holds the information about the photograph. Here is an example. This matters because we need that information to credit the photographer when the photo is licensed CC BY, even if the artwork itself is in the public domain.

One complication is how to model this in Wikidata. Do we create two items (one for the 3D artwork and one for the photograph of the artwork)? Or do we only create the item for the 3D artwork and just handle the photo in Commons? Here is an example:

image

If we create two items, which one do we link the ACT ID property to? The property constraints say that it can only be assigned to one item.

baskaufs commented 2 years ago

The Commons-scraping script misses metadata for 3D items

See an example at https://commons.wikimedia.org/wiki/File:Apostles_Christ_ivory_Louvre_OA3850.jpg

and https://commons.wikimedia.org/wiki/File:Statue_bourgeois_calais_rodin.jpg

I think the problem is that there are two tables: one for the artwork and one for the photo. I think the script only searches the first one.
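A minimal sketch of the fix suggested above: walk every table on the page instead of stopping at the first. This is not the actual scraping script. `AllTablesParser`, `scrape_tables`, and the sample markup are hypothetical, and the real Commons HTML is more complex (nested tables, classes, links), so treat this only as an illustration of the idea using the Python standard library.

```python
from html.parser import HTMLParser


class AllTablesParser(HTMLParser):
    """Collect label/value rows from every <table>, not just the first.

    Hypothetical helper; it assumes flat (non-nested) two-column tables.
    """

    def __init__(self):
        super().__init__()
        self.tables = []    # one {label: value} dict per table found
        self._cells = None  # cell texts of the row currently being read
        self._text = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append({})
        elif tag == "tr" and self.tables:
            self._cells = []
        elif tag in ("td", "th") and self._cells is not None:
            self._text = []

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._text is not None:
            # Normalize whitespace inside the cell before storing it.
            self._cells.append(" ".join("".join(self._text).split()))
            self._text = None
        elif tag == "tr" and self._cells is not None:
            if len(self._cells) >= 2:
                self.tables[-1][self._cells[0]] = self._cells[1]
            self._cells = None


def scrape_tables(html):
    """Return a list with one dict per table in the HTML string."""
    parser = AllTablesParser()
    parser.feed(html)
    return parser.tables


# Invented sample markup standing in for a file page with two tables:
# the first describes the artwork, the second describes the photograph.
SAMPLE_HTML = """
<table><tr><td>Artist</td><td>Example Artist</td></tr>
<tr><td>Title</td><td>Example Sculpture</td></tr></table>
<table><tr><td>Photographer</td><td>Example Photographer</td></tr>
<tr><td>License</td><td>CC BY 4.0</td></tr></table>
"""

tables = scrape_tables(SAMPLE_HTML)
artwork_table, photo_table = tables
print(photo_table["Photographer"], "/", photo_table["License"])
```

Keeping the tables as separate dicts, rather than merging them, avoids confusing fields that could appear in both (for example a date or a creator).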

The Commons-scraping script also misses some of the links from 3D works to items, and I'm not sure why. This needs to be figured out to prevent duplication.

baskaufs commented 2 years ago

Related to https://github.com/HeardLibrary/vandycite/issues/45

baskaufs commented 2 years ago

This is the key information to answer this question: https://commons.wikimedia.org/wiki/Commons:Structured_data/Modeling/Visual_artworks

See also the comments starting here: https://github.com/HeardLibrary/vandycite/issues/42#issuecomment-1188312420

baskaufs commented 2 years ago

Here is a good example to use as a template: https://commons.wikimedia.org/wiki/File:1479_Stein_der_f%C3%BCnften_Sonne,_sog._Aztekenkalender,_Ollin_Tonatiuh_anagoria.JPG

Note that in the file source (https://commons.wikimedia.org/w/index.php?title=File:1479_Stein_der_f%C3%BCnften_Sonne,_sog._Aztekenkalender,_Ollin_Tonatiuh_anagoria.JPG&action=edit), the only information in the file description section is the {{Art Photo}} template. The rest of the data come from the structured data.

In this case, there isn't a P6243 (digital representation of) statement, so apparently the P180 (depicts) and P921 (main subject) statements plus the {{Art Photo}} template are enough for the metadata to get picked up from Wikidata and the structured data. The structured data populates the Photograph table and Wikidata populates the Object table.
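As a rough illustration of checking which artwork item a file's structured data points to, the sketch below reads the P180 and P921 statements out of a MediaInfo record. It assumes the record has the JSON shape returned by the Commons `wbgetentities` API (a `statements` object keyed by property ID). The function name `linked_items`, the M-ID, and the Q-ID in the sample are placeholders, not real data.

```python
def linked_items(mediainfo, properties=("P180", "P921")):
    """Return {property ID: [Q-IDs]} for the given properties.

    `mediainfo` is assumed to be one entity from a wbgetentities response.
    """
    statements = mediainfo.get("statements", {})
    if not isinstance(statements, dict):
        # Guard against an empty, non-dict value for files with no statements.
        statements = {}
    result = {}
    for pid in properties:
        qids = []
        for statement in statements.get(pid, []):
            datavalue = statement.get("mainsnak", {}).get("datavalue", {})
            value = datavalue.get("value", {})
            # Item-valued statements carry the Q-ID under "id".
            if isinstance(value, dict) and "id" in value:
                qids.append(value["id"])
        result[pid] = qids
    return result


# Placeholder record: the IDs are invented for illustration only.
SAMPLE_MEDIAINFO = {
    "id": "M000000",
    "statements": {
        "P180": [{"mainsnak": {"datavalue": {"value": {"id": "Q12345"}}}}],
        "P921": [{"mainsnak": {"datavalue": {"value": {"id": "Q12345"}}}}],
    },
}

links = linked_items(SAMPLE_MEDIAINFO)
print(links)
```

A file whose P180 and P921 values agree on a single item would be a candidate for the pattern described above, where Wikidata supplies the Object table.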

baskaufs commented 2 years ago

See also https://commons.wikimedia.org/wiki/Commons:Structured_data/Modeling/Author for information about how to refer to Commons image creators of 3D works

baskaufs commented 2 years ago

Modified Commonsbot script to support 3D images in https://github.com/HeardLibrary/linked-data/commit/1a554cab9e707c1a609a8542f32889c96d6b3177

Work is still needed to support multiple images per 3D work.

baskaufs commented 2 years ago

There is still probably some work to do on the scraping side for works already in Commons, but I think I know how to deal with this now, so I'm closing it.