bigscience-workshop / lam

Libraries, Archives and Museums (LAM)
Apache License 2.0

Add dataset: french_fiction_16_18th_century #86

Open davanstrien opened 2 years ago

davanstrien commented 2 years ago

A URL for this dataset

https://zenodo.org/record/5770866

Dataset description

A corpus containing all digitized French novels from the beginning of print (the first entry is from 1473) to the 18th century.

French novels of the period have been identified using the Y2 shelfmark ("cote") of the French National Library catalog, which was used to classify past and present collections of novels in France from 1730 to 1996. Combining digitized sources from Gallica, Google Books, Archive.org and other digital libraries made it possible to attain high representativeness: 78% of the novels from 1450-1600 and 68% of the novels from 1600-1700 have been retrieved.

Dataset modality

Text

Dataset licence

Creative Commons Attribution 4.0 International

Other licence

No response

How can you access this data

As a download from a repository/website

Size of dataset

500MB-2GB

Confirm the dataset has an open licence

Contact details for data custodian

No response