acka47 closed this issue 4 years ago
The process
$ curl --header "Accept: application/x-jsonlines" "http://lobid.org/gnd/search?q=dateOfDeath%3A%5B*+TO+1949-12-31%5D" | jq -r .gndIdentifier > gndPersonsDeadBefore1949.txt
@dr0i, over to you. The list of hbz IDs comprises 556,223 titles and is at https://github.com/acka47/scripts/blob/master/gemeinfrei.txt.
I just noticed that I made an error in the script: I only pulled 15 titles (the default) per person. However, I also noticed that we don't actually need a separate index for this, as the death date of a contributing person is included in lobid-resources. Thus, I can easily get the list of titles with contributors who died more than 70 years ago and can also filter out online resources (i.e. those that have already been scanned):
And this is the query that also filters out those resources that have another contributor who has not yet been dead for >= 70 years: contribution.agent.dateOfDeath:[* TO 1949] AND NOT contribution.agent.dateOfDeath:[1950 TO *] AND NOT medium.id:"http://rdaregistry.info/termList/RDACarrierType/1018"
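To run that query from the command line, something like the following sketch may help. It mirrors the lobid-gnd call under "The process". The helper name `fetch_public_domain_titles`, the assumption that lobid-resources accepts the same `application/x-jsonlines` header, and the `hbzId` field name in the usage comment are not confirmed in this thread.

```shell
# The filter from the comment above, kept in single quotes so the shell
# leaves the brackets, asterisks and double quotes untouched.
PUBLIC_DOMAIN_FILTER='contribution.agent.dateOfDeath:[* TO 1949] AND NOT contribution.agent.dateOfDeath:[1950 TO *] AND NOT medium.id:"http://rdaregistry.info/termList/RDACarrierType/1018"'

# Hypothetical helper: request all matching records as JSON lines.
# Assumption: lobid-resources honours the same Accept header as lobid-gnd.
fetch_public_domain_titles() {
  # --get appends the URL-encoded q parameter to the URL as a query string,
  # so the filter does not have to be percent-encoded by hand.
  curl --silent --get \
    --header "Accept: application/x-jsonlines" \
    --data-urlencode "q=$PUBLIC_DOMAIN_FILTER" \
    "http://lobid.org/resources/search"
}

# Usage (needs network access and jq; the field name hbzId is an assumption):
# fetch_public_domain_titles | jq -r .hbzId > gemeinfrei.txt
```

Letting curl do the encoding avoids hand-written sequences like `%3A%5B*+TO+` and keeps the query readable.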
@dr0i, please check whether we can add a rewrite rule that applies this filter.
The rewrite rule has been added and it works, but so far only partially: http://publicdomain.lobid.org/resources/search?q=hut Most annoying: the facets are not working. I assume they are mandatory?
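For illustration only, the effect such a rule would need to have on the `q` parameter can be sketched as below. This is not the actual server configuration, which is not shown in this thread; `with_public_domain_filter` is a hypothetical helper name.

```shell
# Public-domain filter as given earlier in this thread.
PUBLIC_DOMAIN_FILTER='contribution.agent.dateOfDeath:[* TO 1949] AND NOT contribution.agent.dateOfDeath:[1950 TO *] AND NOT medium.id:"http://rdaregistry.info/termList/RDACarrierType/1018"'

# Hypothetical helper: combine a user's query with the filter.
# The user query is wrapped in parentheses so that its own AND/OR
# operators cannot change how the appended filter is applied.
with_public_domain_filter() {
  if [ -z "$1" ]; then
    # No user query: search with the filter alone.
    printf '%s\n' "$PUBLIC_DOMAIN_FILTER"
  else
    printf '(%s) AND %s\n' "$1" "$PUBLIC_DOMAIN_FILTER"
  fi
}

# Example with the query from the URL above.
with_public_domain_filter 'hut'
```

Any facet links the server generates would have to carry the same combined query, which may be one reason facets are harder to get right than the plain search.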
> facets not working. This is mandatory to work, I assume?
Yes, I suggested just using lobid and writing a blog post on how to search for public domain resources held by a specific institution. I think that should be sufficient.
The blog post was published today, see http://blog.lobid.org/2020/04/27/gemeinfreie-titel-finden.html. I am also in contact with UB Wuppertal, which is apparently the most interested in this, and will help out when they have questions. Closing this issue.
For libraries to check for out-of-copyright material to be scanned & published on the web. Very similar to #411.