lfoppiano / grobid-superconductors

Grobid module for superconductor material and properties extraction
Apache License 2.0

Tc classification breaking use case #53

Open lfoppiano opened 2 years ago

lfoppiano commented 2 years ago

Apparently, the combination of this document (document2.pdf) and the superconductors model using SciBERT makes a mess of the Tc classification:

Jul 11 12:47:07 falcon docker[11065]: Traceback (most recent call last):
Jul 11 12:47:07 falcon docker[11065]: File "/opt/service/venv/lib/python3.7/site-packages/bottle.py", line 870, in _handle
Jul 11 12:47:07 falcon docker[11065]: return route.call(**args)
Jul 11 12:47:07 falcon docker[11065]: File "/opt/service/venv/lib/python3.7/site-packages/bottle.py", line 1750, in wrapper
Jul 11 12:47:07 falcon docker[11065]: rv = callback(*a, **ka)
Jul 11 12:47:07 falcon docker[11065]: File "/opt/service/grobid_superconductors/service.py", line 118, in process_link
Jul 11 12:47:07 falcon docker[11065]: result.append(self.process_single_sentence(sentence_input, link_types_as_list, skip_classification))
Jul 11 12:47:07 falcon docker[11065]: File "/opt/service/grobid_superconductors/service.py", line 143, in process_single_sentence
Jul 11 12:47:07 falcon docker[11065]: marked_tc_paragraph = self.temperature_classifier.mark_temperatures_paragraph(paragraph_input)
Jul 11 12:47:07 falcon docker[11065]: File "/opt/service/grobid_superconductors/linking/linking_module.py", line 561, in mark_temperatures_paragraph
Jul 11 12:47:07 falcon docker[11065]: return self.mark_temperatures(text_, tokens_, spans_)
Jul 11 12:47:07 falcon docker[11065]: File "/opt/service/grobid_superconductors/linking/linking_module.py", line 543, in mark_temperatures
Jul 11 12:47:07 falcon docker[11065]: doc = self.init_doc(words, spaces, spans_remapped)
Jul 11 12:47:07 falcon docker[11065]: File "/opt/service/grobid_superconductors/linking/linking_module.py", line 68, in init_doc
Jul 11 12:47:07 falcon docker[11065]: span = Span(doc=doc, start=s['token_start'], end=s['token_end'], label=s['type'])
Jul 11 12:47:07 falcon docker[11065]: File "spacy/tokens/span.pyx", line 99, in spacy.tokens.span.Span.__cinit__
Jul 11 12:47:07 falcon docker[11065]: IndexError: [E035] Error creating span with start 9 and end 6 for Doc of length 24.
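
The traceback shows `init_doc` passing `token_start=9` and `token_end=6` to spaCy's `Span`, which fails because the start is greater than the end. As a possible stopgap while the root cause of the inverted offsets is tracked down, the spans could be checked before they reach `Span`. The sketch below is only an illustration: `split_valid_spans` is a hypothetical helper, not part of grobid-superconductors, the `type` values in the example are made up, and it assumes spaCy accepts a span only when `0 <= start <= end <= len(doc)`, which is consistent with the E035 message above.

```python
def split_valid_spans(spans, doc_length):
    """Partition span dicts into (valid, invalid) by their token offsets.

    Hypothetical helper. Assumes each span is a dict with 'token_start',
    'token_end' and 'type' keys, as seen in the traceback, and that a span
    is usable only when 0 <= token_start <= token_end <= doc_length.
    """
    valid, invalid = [], []
    for span in spans:
        start = span.get("token_start")
        end = span.get("token_end")
        is_ok = (
            isinstance(start, int)
            and isinstance(end, int)
            and 0 <= start <= end <= doc_length
        )
        (valid if is_ok else invalid).append(span)
    return valid, invalid


# Example input: the second span reproduces the offsets from the error message.
# The 'type' values are illustrative placeholders.
spans = [
    {"token_start": 2, "token_end": 4, "type": "material"},
    {"token_start": 9, "token_end": 6, "type": "tcValue"},
]
valid, invalid = split_valid_spans(spans, doc_length=24)

for span in invalid:
    # In the service this would more likely go to a logger than to stdout.
    print(f"Skipping span with inverted or out-of-range offsets: {span}")
```

Skipping the bad span would keep the request from failing, but it hides the underlying problem. The inverted offsets most likely come from the step that remaps character offsets to token indices, so logging the rejected spans along with the sentence text would help reproduce it.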