NationalLibraryOfNorway / meteor

A python module and REST API for automatic extraction of metadata from PDF files
Apache License 2.0

TT-1042: Read language codes for initialized files from .env #9

Closed fredrikmonsen closed 1 year ago

fredrikmonsen commented 1 year ago

This commit makes the application read, from .env, which language codes to load from the metadata_extract.data.txt JSON files.
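As a rough illustration of the idea, the sketch below reads a comma-separated list of language codes from the environment and loads one JSON file per code. The variable name `LANGUAGES`, the `<code>.json` file layout, and both function names are assumptions made for this example and are not taken from the meteor code base. It also assumes the contents of .env have already been loaded into the process environment (for instance by a tool such as python-dotenv).

```python
import json
import os
from pathlib import Path


def configured_languages(env_var: str = "LANGUAGES", default: str = "und") -> list[str]:
    """Return the language codes listed in a comma-separated environment variable.

    The variable name and the default value are hypothetical.
    """
    raw = os.environ.get(env_var, default)
    # Split on commas, trim whitespace, and drop empty entries.
    return [code.strip().lower() for code in raw.split(",") if code.strip()]


def load_keywords(data_dir: Path, codes: list[str]) -> dict[str, dict]:
    """Load one JSON file per configured code, skipping codes without a file.

    The '<code>.json' naming scheme is an assumption for illustration only.
    """
    loaded = {}
    for code in codes:
        path = data_dir / f"{code}.json"
        if path.is_file():
            with path.open(encoding="utf-8") as handle:
                loaded[code] = json.load(handle)
    return loaded
```

With `LANGUAGES=nob,nno,und` in the environment, `configured_languages()` would yield the three codes in that order, and `load_keywords` would only initialize the files for those languages.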

Additionally, this commit changes the language codes to ISO 639-2 and separates 'nob' (Norwegian Bokmål) and 'nno' (Norwegian Nynorsk). Furthermore, the Norwegian Nynorsk word lists have been expanded.

The undetermined language code 'und' is added, with universally used keywords such as 'issn' and 'isbn'.
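One way such a language-neutral set could be used is to merge it with the keywords of whichever language is being processed. The sketch below is only an assumption about that usage: the function name, the dictionary layout, and the 'nob' and 'nno' entries are illustrative, and only the 'und' keywords 'issn' and 'isbn' come from the description above.

```python
def keywords_for(language: str, keywords: dict[str, set[str]]) -> set[str]:
    """Combine language-specific keywords with the language-neutral 'und' set."""
    return keywords.get(language, set()) | keywords.get("und", set())


KEYWORDS = {
    "und": {"issn", "isbn"},  # universally used keywords, as named in the description
    "nob": {"utgiver"},       # illustrative Bokmål entry, not from the real word lists
    "nno": {"utgjevar"},      # illustrative Nynorsk entry, not from the real word lists
}
```

A language without its own word list would then still match the 'und' keywords, since `keywords_for` falls back to an empty language-specific set.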