Renamed output directory from output_xmls to output_teis for clarity and consistency.
Updated grobid_service.py to save the TEI output of processed PDFs to the new output directory.
Added a new module tei_to_json.py for converting TEI XML documents to JSON format. This module includes functions for recursive XML parsing and saving the JSON output.
Expanded development dependencies in dev.txt to include lxml.
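
For reviewers, the recursive conversion described for tei_to_json.py can be pictured roughly as follows. This is a minimal sketch, not the module's actual code: the function names element_to_dict and tei_to_json_string and the output keys (tag, attributes, text, children) are hypothetical, and it uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra dependencies.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Recursively convert an XML element into a plain dict (hypothetical layout)."""
    # ElementTree stores namespaced tags as "{namespace}tag"; keep only the local name.
    node = {"tag": elem.tag.split("}", 1)[-1]}
    if elem.attrib:
        node["attributes"] = {
            key.split("}", 1)[-1]: value for key, value in elem.attrib.items()
        }
    text = (elem.text or "").strip()
    if text:
        node["text"] = text
    # Recurse into child elements; tail text after a child is ignored in this sketch.
    children = [element_to_dict(child) for child in elem]
    if children:
        node["children"] = children
    return node


def tei_to_json_string(xml_string):
    """Parse a TEI XML string and return it as a JSON string."""
    root = ET.fromstring(xml_string)
    return json.dumps(element_to_dict(root), indent=2, ensure_ascii=False)


if __name__ == "__main__":
    sample = (
        '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
        '<teiHeader><title level="a">Demo</title></teiHeader>'
        "</TEI>"
    )
    print(tei_to_json_string(sample))
```

Dropping namespaces and tail text keeps the output compact, but it loses mixed content (text that follows an inline element), so a real converter for TEI body text would likely need to preserve it.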