Create dataset loader for thai-tnhc2-books

Dataset	thai_tnhc2_books
Description	This dataset collects all 353 books from the Thai National Historical Corpus 2 (TNHC2). The text has been cleaned for use in pretraining models and other NLP tasks. TNHC2 is a corpus of old Thai books, all of which are out of copyright under Thai law (copyright expires 50 years after the author's death). More information on the corpus can be found here: https://www.arts.chula.ac.th/chulaseal/tnhc2/.
Subsets	-
Languages	tha
Tasks	Language Modeling
License	Creative Commons Zero v1.0 Universal (cc0-1.0)
Homepage	https://www.arts.chula.ac.th/chulaseal/tnhc2/
HF URL	https://huggingface.co/datasets/pythainlp/thai-tnhc2-books
Paper URL	-
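
As a starting point for the loader, here is a minimal sketch of the example-generation step for the Language Modeling task. It is not a finished implementation. It assumes each record exposes the book text under a `text` key and that the target schema is a flat `id`/`text` pair. Both should be checked against the actual HF dataset and the datahub schema definitions. The function name `generate_ssp_examples` and the `text_key` parameter are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple


def generate_ssp_examples(
    rows: Iterable[Dict[str, Any]],
    text_key: str = "text",  # assumed column name; verify against the HF dataset
) -> Iterator[Tuple[int, Dict[str, str]]]:
    """Yield (key, example) pairs with a flat id/text layout.

    The id/text layout is an assumption about the target pretraining
    schema, not something confirmed by this issue.
    """
    for idx, row in enumerate(rows):
        yield idx, {
            "id": str(idx),              # running index used as a string id
            "text": str(row[text_key]),  # full cleaned text of one book
        }
```

In a real loader, `rows` would likely come from `datasets.load_dataset("pythainlp/thai-tnhc2-books")`. The split name and column names are not stated here and need to be confirmed from the dataset card.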
