I noticed that .get_entity_spans sometimes fails, whereas .sample works well. The failures seem to be related to trailing spaces.
Example 1:
from genre.fairseq_model import GENRE
from genre.entity_linking import get_end_to_end_prefix_allowed_tokens_fn_fairseq as get_prefix_allowed_tokens_fn
from genre.utils import get_entity_spans_fairseq as get_entity_spans
model = GENRE.from_pretrained("../models/fairseq_e2e_entity_linking_aidayago").eval()
sentences = ['This transition consists of moving from an energy system mainly based on fossil fuels to an energy system based on low-carbon sources, especially renewable ones.']
get_entity_spans(model, sentences)
Output error (no "text" in the list):
output_sentences = get_entity_spans_post_processing(
    [e[0]["text"] for e in output_sentences]
)
IndexError: list index out of range
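The IndexError on e[0] means that at least one element of output_sentences is an empty list, i.e. no hypothesis came back for that sentence. Below is a minimal sketch of a guarded version of that comprehension, run on made-up data rather than real model output. first_texts is a hypothetical helper and not part of GENRE; it only makes the failure visible and does not fix the underlying cause.

```python
def first_texts(output_sentences):
    """Return the text of the first hypothesis per sentence, or None if there is none."""
    # An empty inner list is what makes e[0] raise IndexError in the traceback above.
    return [e[0]["text"] if e else None for e in output_sentences]


# Made-up data for illustration: the second sentence has no hypotheses.
fake_output = [[{"text": "some annotated text"}], []]
print(first_texts(fake_output))
```

A None entry then marks the sentences for which nothing was generated, instead of the whole batch crashing.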
If I modify this line so that it does not add trailing spaces, Example 1 works.
Example 2 (less critical):
# a sentence with a trailing space at the end
sentences = ['The legal environment is adapting to keep up with the evolution of technologies and our societies (increased use of digital technology, growth of online commerce, etc.). ']
get_entity_spans(model, sentences)
Output error (sent is a list):
in get_entity_spans_post_processing
    sent = re.sub(r"{.*?", "{ ", sent)
File "/usr/lib/python3.8/re.py", line 210, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object
This example produces a different error. If the trailing space is removed from the end of the sentence, the example works.
I use the fairseq version as described in the readme.
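As a stopgap for Example 2, the input can be stripped of trailing whitespace before it is passed to get_entity_spans. This is a minimal sketch: strip_trailing_whitespace is a hypothetical helper, not part of GENRE, and it does not help with Example 1, where the trailing space appears to be added inside the library.

```python
def strip_trailing_whitespace(sentences):
    """Return a copy of the sentences with trailing whitespace removed."""
    return [s.rstrip() for s in sentences]


# Shortened stand-in for the sentence from Example 2 (note the trailing space).
sentences = ["The legal environment is adapting (growth of online commerce, etc.). "]
cleaned = strip_trailing_whitespace(sentences)
# cleaned would then be passed on as: get_entity_spans(model, cleaned)
```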
get_entity_spans was not used for the experiments in the paper. I wrote that function to show how to use .sample. I did use trailing spaces for the end-to-end experiments as well as for training.