Open victorconan opened 2 years ago
Hi @victorconan, thank you for reporting this issue. I will investigate your request in the coming days and get back to you. I am sure the patent number is in the document and can be extracted.
I looked at the XML file, and it seems they used `doc-number` and didn't distinguish whether it is a patent number or something else :/ But the surrounding section wraps it in either `publication-reference` or `application-reference`:
<us-bibliographic-data-grant>
  <publication-reference>
    <document-id>
      <country>US</country>
      <doc-number>D0939807</doc-number>
      <kind>S1</kind>
      <date>20220104</date>
    </document-id>
  </publication-reference>
  <application-reference appl-type="design">
    <document-id>
      <country>US</country>
      <doc-number>29667332</doc-number>
      <date>20181019</date>
    </document-id>
  </application-reference>
My guess is that `D0939807` from `publication-reference` is the patent number with an extra `0` after the `D` (not sure why, it is weird), and `29667332` from `application-reference` is the application number. I think the parser only returns the latter one?
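To check that guess against the XML directly, here is a minimal sketch that reads the two `doc-number` values separately using only the standard library. The sample string is the fragment above with a closing root tag added so that it parses; `read_doc_numbers` and the returned key names are hypothetical and not part of the parser discussed here.

```python
import xml.etree.ElementTree as ET

# The fragment quoted above, closed so that it is well-formed XML.
SAMPLE = """
<us-bibliographic-data-grant>
  <publication-reference>
    <document-id>
      <country>US</country>
      <doc-number>D0939807</doc-number>
      <kind>S1</kind>
      <date>20220104</date>
    </document-id>
  </publication-reference>
  <application-reference appl-type="design">
    <document-id>
      <country>US</country>
      <doc-number>29667332</doc-number>
      <date>20181019</date>
    </document-id>
  </application-reference>
</us-bibliographic-data-grant>
""".strip()


def read_doc_numbers(xml_text):
    """Return the publication and application doc-numbers under separate keys.

    Hypothetical helper: the key names are chosen for illustration only.
    """
    root = ET.fromstring(xml_text)
    return {
        "publication_doc_number": root.findtext(
            "publication-reference/document-id/doc-number"
        ),
        "application_doc_number": root.findtext(
            "application-reference/document-id/doc-number"
        ),
    }


numbers = read_doc_numbers(SAMPLE)
print(numbers)
```

Both values are present in the file, so the publication number is recoverable as long as the two `document-id` blocks are read separately.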
I think the bug is here:
def get_patent_identification_data(root_tree):
    publication_info = root_tree.find(publication_info_base_path)
    application_info = root_tree.find(application_info_base_path)
    term_of_grant_info = root_tree.find(us_term_of_grant_path)
    term_of_grant_length = root_tree.find(us_term_of_grant_length)
    term_of_grant_extension = root_tree.find(us_term_of_grant_extension)
    us_term_of_grant_disclaimer = root_tree.find(us_term_of_grant_disclaimer_text)
    invention_title = root_tree.find(invention_title_path)
    document_data = {}
    if publication_info != None:
        publication_reference_info = {element.tag: element.text for element in list(publication_info)}
        document_data = {**document_data,**publication_reference_info}
    if application_info !=None:
        application_reference_info = {element.tag: element.text for element in list(application_info)}
        if application_info.attrib and application_info.attrib['appl-type']:
            application_reference_info['application_type'] = application_info.attrib['appl-type']
        document_data = {**document_data,**application_reference_info}
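Here, if a patent has application info, the publication info gets overwritten: both dictionaries use the same element tags as keys (`country`, `doc-number`, `date`), so the second merge replaces the publication values with the application values.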
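The overwrite can be reproduced with plain dictionaries, without the parser. This is only a sketch of one possible fix, assuming that prefixing the keys is acceptable to downstream code; `with_prefix` and the resulting key names are hypothetical and not what the library currently returns.

```python
# Values taken from the XML fragment quoted earlier in this thread.
publication = {"country": "US", "doc-number": "D0939807", "kind": "S1", "date": "20220104"}
application = {"country": "US", "doc-number": "29667332", "date": "20181019"}

# Same merge pattern as in the function above: on shared keys the later dict wins.
merged = {**publication, **application}
print(merged["doc-number"])  # the application number; the publication number is lost


def with_prefix(prefix, fields):
    """Return a copy of `fields` with every key prefixed (hypothetical helper)."""
    return {f"{prefix}_{key}": value for key, value in fields.items()}


# Prefixing each source keeps both sets of values after the merge.
document_data = {
    **with_prefix("publication", publication),
    **with_prefix("application", application),
}
print(document_data["publication_doc-number"], document_data["application_doc-number"])
```

Only `kind` survives from the publication block in the unprefixed merge, because the application block has no `kind` element to overwrite it.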
Hi all, sorry for jumping into the conversation. Maybe a workaround is to rely on the Google Patents API to convert the doc-number to a patent number. Cheers
I noticed the parser returned the `doc-number` rather than the `patent number` for the patents. Although one can search for a patent using the `doc-number`, I cannot find a mapping from `doc-number` to `patent number`. Do you know how to get the patent number? Thanks!
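If the guess made elsewhere in this thread holds (that the publication `doc-number` is the patent number with zero padding after the letter prefix), the conversion could be a small string cleanup. This is an unverified sketch under that assumption only; `normalize_doc_number` is a hypothetical helper, not part of the parser, and it has not been checked against other patent kinds.

```python
import re


def normalize_doc_number(doc_number):
    """Strip zero padding that follows an optional letter prefix.

    Assumption (from this thread, not from any specification): a publication
    doc-number such as "D0939807" is the patent number "D939807" padded with
    zeros. Input that does not look like letters followed by digits is
    returned unchanged.
    """
    match = re.fullmatch(r"([A-Za-z]*)0*(\d+)", doc_number.strip())
    if match is None:
        return doc_number
    prefix, digits = match.groups()
    return f"{prefix.upper()}{digits}"


print(normalize_doc_number("D0939807"))  # D939807
```

Note that this only makes sense for the value under `publication-reference`; the application number under `application-reference` is a different identifier and should not be passed through it.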