bigscience-workshop / lam

Libraries, Archives and Museums (LAM)
Apache License 2.0

Add dataset: nls_chapbooks_illustrations #59

Closed by davanstrien 2 years ago

davanstrien commented 2 years ago

A URL for this dataset

https://gitlab.com/vgg/nls-chapbooks-illustrations

Dataset description

This proposed dataset consists of a few components:

This is an excellent dataset for training or evaluating object detection models on historical material. Identifying visual content in digitised material is very useful for both LAM institutions and researchers. The dataset is also relatively large compared with many LAM object detection training datasets, which are sometimes only large enough for evaluation and not for training (particularly without transfer learning).

Dataset modality

Image

Dataset licence

Creative Commons Public Domain Dedication and Certification

Other licence

No response

How can you access this data

As a download from a repository/website

Confirm the dataset has an open licence

Contact details for data custodian

No response

davanstrien commented 2 years ago

Initial data loading script here: https://huggingface.co/datasets/biglam/nls_chapbook_illustrations (still needs some tidying). In contact with the authors/NLS about working on a datacard.
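
For anyone wanting to try the script, a minimal sketch of how it could be used, assuming the Hugging Face `datasets` library is installed and the Hub repo id matches the URL above. The helper names `hub_repo_id` and `load_chapbooks` are hypothetical, and the configuration name and the "train" split are assumptions that should be checked against the dataset page.

```python
from urllib.parse import urlparse

# URL of the loading script, as given in the comment above.
HUB_URL = "https://huggingface.co/datasets/biglam/nls_chapbook_illustrations"


def hub_repo_id(url: str) -> str:
    """Extract the 'namespace/name' repo id from a Hub dataset URL (hypothetical helper)."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hub dataset URL: {url!r}")
    return "/".join(parts[1:])


def load_chapbooks(config_name=None, split="train"):
    """Load the dataset from the Hub (hypothetical helper).

    Assumptions: a "train" split exists, and config_name (if needed) is one of
    the configurations listed on the dataset page. Neither is confirmed here.
    """
    # Imported lazily so hub_repo_id() works without the library installed.
    from datasets import load_dataset

    return load_dataset(hub_repo_id(HUB_URL), config_name, split=split)


if __name__ == "__main__":
    print(hub_repo_id(HUB_URL))
```

Calling `load_chapbooks()` needs network access. If the dataset defines several configurations, a configuration name has to be passed explicitly. Depending on the installed `datasets` version, script-based datasets may need extra opt-in arguments or may not be supported at all.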

davanstrien commented 2 years ago

Closing this one for now