Language information and the associated vernacular titles are useful for people working in NLP who want to extract documents in languages other than English. I added this feature to the code, along with the associated description and tests.
Here is the fine-grained description of the changes:
I added the extraction of Language XML elements from PubMed/MEDLINE citations. Defaults to an empty string when unavailable.
I added the extraction of the VernacularTitle element, which is available when the language is not English. Defaults to an empty string when unavailable.
I updated the README to reflect these changes.
I updated the medline_parser test to reflect these changes. All tests pass.
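As a rough illustration of the extraction described above, here is a minimal sketch using only the standard library's `xml.etree.ElementTree`. It is not the code from this PR: the function name `extract_language_and_vernacular_title`, the sample records, and the choice to join multiple Language elements with "; " are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down MEDLINE citations used only for illustration.
FRENCH_SAMPLE = """
<MedlineCitation>
  <Article>
    <ArticleTitle>[An example title.]</ArticleTitle>
    <Language>fre</Language>
    <VernacularTitle>Un titre d'exemple.</VernacularTitle>
  </Article>
</MedlineCitation>
""".strip()

ENGLISH_SAMPLE = """
<MedlineCitation>
  <Article>
    <ArticleTitle>An example title.</ArticleTitle>
    <Language>eng</Language>
  </Article>
</MedlineCitation>
""".strip()


def extract_language_and_vernacular_title(citation):
    """Return (languages, vernacular_title) for a MedlineCitation element.

    Both values fall back to an empty string when the element is missing.
    Joining several Language elements with "; " is an assumption of this sketch.
    """
    article = citation.find("Article")
    if article is None:
        return "", ""

    # A citation may list more than one Language element.
    languages = [el.text.strip() for el in article.findall("Language") if el.text]

    vernacular = article.find("VernacularTitle")
    if vernacular is None:
        title = ""
    else:
        # itertext() also collects text nested in inline markup such as <i>.
        title = "".join(vernacular.itertext()).strip()

    return "; ".join(languages), title


if __name__ == "__main__":
    for sample in (FRENCH_SAMPLE, ENGLISH_SAMPLE):
        print(extract_language_and_vernacular_title(ET.fromstring(sample)))
```

Returning empty strings rather than `None` mirrors the default behaviour described above, so downstream code can treat both fields as plain strings.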