Open BelenSantamaria opened 2 years ago
This feels related to https://github.com/RasaHQ/rasa-3.x-component-examples/pull/5. Will dive in.
This was an issue with documentation, super sorry! This was my bad.
The docs still listed the old version from Rasa 2.0. The new version no longer uses lookups; it uses a path
variable to point to a file instead. I just pushed a new version of the docs. Could you confirm whether the issue persists?
Hi, I just tried it and it keeps giving me the same error. Here is the configuration file with the changes:
recipe: default.v1
language: es
pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: rasa_nlu_examples.extractors.FlashTextEntityExtractor
    case_sensitive: False
    path: data/countries.txt
    entity_name: pais
  - name: DIETClassifier
    epochs: 100
    constrain_similarities: true
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
    constrain_similarities: true
  - name: FallbackClassifier
    threshold: 0.2
    ambiguity_threshold: 0.01
policies:
  - name: MemoizationPolicy
  - name: RulePolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
    constrain_similarities: true
I also attach my countries.txt file
Could you give the full traceback including the commands that you ran before the error appeared?
This is the rasa train and rasa shell output
Could you give the traceback when starting rasa shell nlu --debug? This should provide more details about the error.
I attach the output as a file because it is very long: output.txt
Thanks, the relevant part is:
'run_rasa_nlu_examples.extractors.FlashTextEntityExtractor5' loading 'FlashTextEntityExtractor.load' and kwargs: '{}'.
2022-01-27 16:33:42 DEBUG urllib3.connectionpool - Starting new HTTPS connection (1): o251570.ingest.sentry.io:443
Traceback (most recent call last):
  File "C:\Users\johndoe\miniconda3\envs\company\lib\site-packages\rasa\engine\graph.py", line 393, in _load_component
    self._component: GraphComponent = constructor( # type: ignore[no-redef]
  File "C:\Users\johndoe\miniconda3\envs\company\lib\site-packages\rasa\engine\graph.py", line 220, in load
    return cls.create(config, model_storage, resource, execution_context)
  File "C:\Users\johndoe\miniconda3\envs\company\lib\site-packages\rasa_nlu_examples\extractors\flashtext_entity_extractor.py", line 85, in create
    return cls(config, execution_context.node_name, model_storage, resource)
  File "C:\Users\johndoe\miniconda3\envs\company\lib\site-packages\rasa_nlu_examples\extractors\flashtext_entity_extractor.py", line 66, in __init__
    words = pathlib.Path(self.path).read_text().split("\n")
  File "C:\Users\johndoe\miniconda3\envs\company\lib\pathlib.py", line 1237, in read_text
    return f.read()
  File "C:\Users\johndoe\miniconda3\envs\company\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 641: character maps to <undefined>
There seems to be an encoding error in the file you're trying to read into your custom component. It's assuming cp1252. Is that correct?
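One way to check which encoding Python falls back to when read_text() is called without an encoding argument is sketched below. This assumes a standard CPython install; the variable names are illustrative only, and the printed values depend on the machine (on many Western-European Windows setups the default is cp1252, which matches the traceback above).

```python
import locale
import sys

# Encoding used by open() / Path.read_text() when no encoding is passed
# (unless Python's UTF-8 mode is switched on).
default_encoding = locale.getpreferredencoding(False)

# 1 if Python was started in UTF-8 mode (python -X utf8 or PYTHONUTF8=1), else 0.
utf8_mode = sys.flags.utf8_mode

print(f"default text encoding: {default_encoding}")
print(f"UTF-8 mode enabled: {bool(utf8_mode)}")
```

If the reported default is not the encoding the file was saved in, reading it without an explicit encoding can fail exactly as in the traceback.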
Based on the error you pointed out, I think the problem is that the country names in the file contain special characters, so the file can't be read with words = pathlib.Path(self.path).read_text().split("\n"). I ran that line myself and got the same error.
If I instead run words = pathlib.Path(r'..\data\countries.txt').read_text(encoding='utf-8').split("\n"), it reads the file correctly.
Is it possible to add the encoding as an argument to the extractor?
Thank you! :)
That makes sense to me! Do you want to open a PR for it?
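For anyone picking this up, here is a minimal sketch of the idea, not the component's actual code. The helper name load_keywords and its encoding parameter are hypothetical; in the extractor itself the value would presumably come from a new config key. The demo writes a small UTF-8 file containing "Á" (bytes 0xC3 0x81) to reproduce the 0x81 failure under cp1252.

```python
import pathlib
import tempfile


def load_keywords(path, encoding="utf-8"):
    """Read one keyword per line with an explicit encoding (hypothetical helper)."""
    text = pathlib.Path(path).read_text(encoding=encoding)
    # splitlines() also handles Windows line endings; blank lines are skipped.
    return [line.strip() for line in text.splitlines() if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    path = pathlib.Path(tmp) / "countries.txt"
    path.write_text("España\nEmiratos Árabes Unidos\n", encoding="utf-8")

    # Reading with the encoding the file was saved in works.
    words = load_keywords(path, encoding="utf-8")

    # Decoding the same bytes as cp1252 fails on byte 0x81, as in the traceback.
    try:
        load_keywords(path, encoding="cp1252")
        cp1252_failed = False
    except UnicodeDecodeError:
        cp1252_failed = True

print(words)
print("cp1252 failed:", cp1252_failed)
```

Defaulting to utf-8 rather than the platform encoding is a design choice in this sketch; it keeps behaviour the same across Windows and Linux, while the parameter still lets users with legacy files pass something else.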
Hi, I am developing a bot with rasa and I wanted to include the component rasa_nlu_examples.extractors.FlashTextEntityExtractor.
I have added it to my configuration file which is as follows:
I have also added a lookup table with some countries to my nlu file. I include the beginning of it below as an example:
When using the command rasa train, it trains a model and saves it in the models folder, but when using rasa shell or rasa interactive it gives me the following error:
ERROR rasa.core.agent - Could not load model due to Error initializing graph component for node 'run_rasa_nlu_examples.extractors.FlashTextEntityExtractor5'..