Open lintangsutawika opened 10 months ago
https://openstax.org
Probably very small compared to other datasets, but the textbooks are licensed under Creative Commons Attribution 4.0.
https://huggingface.co/datasets/crumb/openstax-text
This is only 75 textbooks and seems to need a substantial amount of processing. We can do it, but I would deprioritize it.