When parsing the bibliographic information, we simply insert each child element's tag and text as a key-value pair into the document dictionary:
invention_title = root_tree.find(invention_title_path)
document_data = {}
# Copy each child of the publication reference as tag -> text.
if publication_info is not None:
    publication_reference_info = {element.tag: element.text for element in publication_info}
    document_data = {**document_data, **publication_reference_info}
# Do the same for the application reference; keys already present are overwritten.
if application_info is not None:
    application_reference_info = {element.tag: element.text for element in application_info}
    # .get() avoids a KeyError when the 'appl-type' attribute is absent.
    application_type = application_info.attrib.get('appl-type')
    if application_type:
        application_reference_info['application_type'] = application_type
    document_data = {**document_data, **application_reference_info}
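Merging with `{**a, **b}` keeps only the later value for any key that appears in both dictionaries, so a tag shared by the publication and application references survives only once. The following is a minimal sketch of that behaviour, not the parser itself: the XML fragment, its element names and values, and the helpers `flatten` and `flatten_prefixed` are all hypothetical and chosen only for illustration. Prefixing the keys is one possible way to keep both values.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element names and values are assumptions, not real patent data.
SAMPLE_XML = """
<bibliographic-data>
  <publication-reference>
    <country>US</country>
    <doc-number>PUB-0001</doc-number>
  </publication-reference>
  <application-reference appl-type="utility">
    <country>US</country>
    <doc-number>APP-0002</doc-number>
  </application-reference>
</bibliographic-data>
"""


def flatten(element):
    """Map each direct child's tag to its text."""
    return {child.tag: child.text for child in element}


def flatten_prefixed(element, prefix):
    """Same as flatten, but prefix every key so merged dictionaries cannot collide."""
    return {f"{prefix}_{child.tag}": child.text for child in element}


root = ET.fromstring(SAMPLE_XML)
publication_info = root.find("publication-reference")
application_info = root.find("application-reference")

# Plain merge: 'doc-number' exists in both, so the application value wins.
merged = {**flatten(publication_info), **flatten(application_info)}

# Prefixed merge: both document numbers are preserved under distinct keys.
prefixed = {
    **flatten_prefixed(publication_info, "publication"),
    **flatten_prefixed(application_info, "application"),
}

print(merged)
print(prefixed)
```

In the plain merge the publication's `doc-number` is silently replaced, whereas the prefixed variant keeps both at the cost of longer key names.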
An example patent might look like this (xml4)
The resulting dictionary now lacks the patent ID and contains only the application ID: