davanstrien / IIIF-ML-experiments

LCTGM Wikidata test set #15

Open glenrobson opened 3 years ago

glenrobson commented 3 years ago

Issue to investigate Wikidata as a test data source. There are 697 LCTGM terms on Wikidata (query), and they are attached to subject items such as screenplay, orangery and love.

Images are then related to these subjects through the Depicts relation.
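As a rough illustration of that join (not the query linked above), here is a minimal Python sketch that builds a SPARQL query for the Wikidata Query Service. It assumes `P180` is the Depicts property and `P18` the image property. The LCTGM identifier property is deliberately left as a parameter, because its ID is not given in this issue. The function name `build_depicts_query` and the `limit` parameter are hypothetical.

```python
def build_depicts_query(lctgm_property: str, limit: int = 100) -> str:
    """Build a SPARQL query for images whose depicted subject has an LCTGM ID.

    lctgm_property is the Wikidata property ID for the LCTGM identifier.
    It is an input on purpose: look it up on Wikidata rather than trusting
    a hard-coded value here.
    """
    if not (lctgm_property.startswith("P") and lctgm_property[1:].isdigit()):
        raise ValueError(f"not a Wikidata property ID: {lctgm_property!r}")
    if limit < 1:
        raise ValueError("limit must be positive")

    # The wdt:, wikibase: and bd: prefixes are predefined by the Wikidata Query Service.
    return f"""
SELECT ?item ?subject ?subjectLabel ?image WHERE {{
  ?subject wdt:{lctgm_property} ?lctgmId .  # subject carries an LCTGM identifier
  ?item wdt:P180 ?subject .                 # item depicts that subject (assumed: P180)
  ?item wdt:P18 ?image .                    # item has an image (assumed: P18)
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    # "P0000" is a placeholder, not the real LCTGM property ID.
    print(build_depicts_query("P0000", limit=5))
```

The printed query can be pasted into the Wikidata Query Service once the placeholder is replaced with the real property ID.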

glenrobson commented 3 years ago

Limiting the query to items whose subjects have LCTGM headings and that have an image link (query), we get the following 29 headings:

Subject Count
portrait 10709
chair 762
hug 388
river 289
tree 225
military uniform 182
bridge 157
sport 155
poet 114
railway 113
award 91
school 65
dam 63
beach 44
fortification 29
stained glass 29
column 23
bride 18
bridegroom 18
agricultural worker 18
monastery 17
pew 17
tower 16
balustrade 14
advertising 13
cargo ship 12
Serliana 11
parking lot 11
stream 11

Query to retrieve example images

glenrobson commented 3 years ago

Limiting this to IIIF images in Wikidata (query), we get 16 results:

Subject Count
sport 155
river 146
poet 102
award 91
tree 86
bridge 75
dam 53
stained glass 29
school 26
bride 18
agricultural worker 18
bridegroom 18
beach 17
pew 17
chair 11
railway 11

Query to retrieve example images

glenrobson commented 3 years ago

Questions:

glenrobson commented 3 years ago

Here is the query to get all of the Image URLs for items that can be linked to LCTGM:

https://w.wiki/37xy

Note that in the summaries above I limited the results to subjects with over 10 images each. The query above doesn't have this limit, so quite a few subjects will have just a single image linked.
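If the unfiltered results are downloaded, the same cut-off can be reapplied locally. The following is a minimal sketch, assuming the results have already been parsed into `(subject, image_url)` pairs. The function name `subjects_over_threshold` and the row layout are hypothetical, not something defined by the query itself.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def subjects_over_threshold(
    rows: Iterable[Tuple[str, str]], min_images: int = 10
) -> List[Tuple[str, int]]:
    """Count images per subject and keep subjects with MORE than min_images.

    rows: (subject_label, image_url) pairs, one per result row.
    Returns (subject, count) pairs sorted by count descending, then by name.
    """
    counts = Counter(subject for subject, _image_url in rows)
    kept = [(subject, n) for subject, n in counts.items() if n > min_images]
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Made-up rows for illustration only; not data from the query.
    demo = [("river", f"https://example.org/r{i}.jpg") for i in range(11)]
    demo.append(("tower", "https://example.org/t0.jpg"))
    print(subjects_over_threshold(demo))
```

The comparison is strict (`>`), matching "over 10 images per subject", so a subject with exactly 10 images is dropped.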