RasaHQ / rasalit

Visualizations and helpers to improve and debug machine learning models for Rasa Open Source
Apache License 2.0

Using live-nlu with models with Custom Components #28

Closed digitalWestie closed 3 years ago

digitalWestie commented 3 years ago

I'm using Rasa 2.1.2, with the following pipeline in config:

pipeline:
  - name: SpacyNLP
    model: "en_core_web_sm"
  - name: SpacyTokenizer
  - name: custom_components.SimpleNameExtractor
  - name: SpacyEntityExtractor
    dimensions: ["PERSON"] #https://spacy.io/api/annotation#section-named-entities
  - name: SpacyFeaturizer
    pooling: mean
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100

Running rasalit live-nlu doesn't work since it doesn't know where to find custom_components.SimpleNameExtractor. I get the following error:

ComponentNotFoundException: Failed to load the component 'custom_components.SimpleNameExtractor'. Failed to find module 'custom_components'. Either your pipeline configuration contains an error or the module you are trying to import is broken (e.g. the module is trying to import a package that is not installed).
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/rasa/nlu/registry.py", line 121, in get_component_class
    return rasa.shared.utils.common.class_from_module_path(component_name)
File "/usr/local/lib/python3.6/dist-packages/rasa/shared/utils/common.py", line 19, in class_from_module_path
    m = importlib.import_module(module_name)
File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 994, in _gcd_import
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'custom_components'

Is there a way right now for me to point rasalit at my config or my custom class?

koaning commented 3 years ago

Interesting!

From which path are you running this? If your project root has a file named custom_components.py that contains a class SimpleNameExtractor, I think it should work. Could you confirm the paths? I just want to check that before I start investigating.

Also, and this is more of a curiosity, what does your SimpleNameExtractor component do?

digitalWestie commented 3 years ago

I'm running this in my project folder where I have a custom_components.py. The SimpleNameExtractor class has some unsophisticated code to extract names from an utterance e.g. 'My name is X' and so on.

Just to test, I copied custom_components.py into the rasalit lib folder, using: sudo cp custom_components.py /usr/local/lib/python3.6/dist-packages/rasalit/apps/livenlu/

This isn't a great solution, but it works for now! It would be good to know if there's another way I can point it.
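
One alternative to copying the file into dist-packages, sketched here as an assumption rather than a confirmed rasalit feature: the ModuleNotFoundError means Python could not find custom_components on sys.path, so making the project folder visible on that path should let the import succeed. The snippet below demonstrates the mechanism on a throwaway folder. The module name demo_custom_components and both helper functions are hypothetical and exist only for this illustration.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate the module on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


def add_project_dir(project_dir: str) -> None:
    """Put the project folder at the front of sys.path (no-op if already there)."""
    if project_dir not in sys.path:
        sys.path.insert(0, project_dir)


# Throwaway folder standing in for the Rasa project directory.
demo_dir = tempfile.mkdtemp()
# Hypothetical module name, chosen so it cannot clash with a real module.
module_name = "demo_custom_components"
Path(demo_dir, f"{module_name}.py").write_text(
    "class SimpleNameExtractor:\n    pass\n"
)

found_before = is_importable(module_name)  # folder not yet on sys.path
add_project_dir(demo_dir)
importlib.invalidate_caches()
found_after = is_importable(module_name)   # folder now on sys.path

module = importlib.import_module(module_name)
extractor_cls = getattr(module, "SimpleNameExtractor")
```

Setting the PYTHONPATH environment variable to the project folder before launching a Python process adds the same kind of entry to sys.path. Whether that alone is enough for rasalit live-nlu in this setup was not tested in this thread.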

digitalWestie commented 3 years ago

Now I'm getting this error instead:

InvalidModelError: Model 'en_core_web_sm' is not a linked spaCy model. Please download and/or link a spaCy model, e.g. by running:
python -m spacy download en_core_web_md
python -m spacy link en_core_web_md en

Any chance it's related? I've tried following the advice.

digitalWestie commented 3 years ago

Ah, I see it asks for en_core_web_sm and it recommends downloading and linking en_core_web_md instead. I've now gone for

python -m spacy download en_core_web_sm; python -m spacy link en_core_web_sm en

I've overcome that error, and it appears to have helped.

I can provide input to Text Input for Model - but I always get this now:

TypeError: 'NoneType' object is not callable
Traceback:
File "/usr/local/lib/python3.6/dist-packages/streamlit/script_runner.py", line 332, in _run_script
    exec(code, module.__dict__)
File "/usr/local/lib/python3.6/dist-packages/rasalit/apps/livenlu/app.py", line 32, in <module>
    interpreter=interpreter, text_input=text_input
File "/usr/local/lib/python3.6/dist-packages/rasalit/apps/livenlu/common.py", line 29, in fetch_info_from_message
    element.process(msg)
File "/usr/local/lib/python3.6/dist-packages/rasa/nlu/extractors/spacy_entity_extractor.py", line 33, in process
    doc = spacy_nlp(message.get(TEXT))

koaning commented 3 years ago

Could you share your spaCy version, just in case? I'm wondering if you're using spaCy-nightly, which we don't support yet.

koaning commented 3 years ago

@digitalWestie just to mention, we're collecting namelists for your name detection use-case over at rasa nlu examples. I'll also add a new component there that will be able to do string matching like you are, but in an optimised fashion using flashtext.
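
To make the name-list idea concrete, here is a minimal sketch of list-based name matching. It is neither the SimpleNameExtractor from this thread nor the flashtext-based component mentioned above: it swaps in a plain set lookup over regex word matches, and find_names and the three-name list are hypothetical. The output mimics the entity dictionaries Rasa prints (entity, value, start, end).

```python
import re
from typing import Dict, List, Set, Union

Entity = Dict[str, Union[str, int]]


def find_names(text: str, names: Set[str]) -> List[Entity]:
    """Return one entity dict per word in `text` that appears in `names`.

    Matching is case-insensitive and works on single words only.
    """
    lowered = {name.lower() for name in names}
    entities: List[Entity] = []
    for match in re.finditer(r"[A-Za-z']+", text):
        if match.group().lower() in lowered:
            entities.append(
                {
                    "entity": "PERSON",
                    "value": match.group(),
                    "start": match.start(),
                    "end": match.end(),
                }
            )
    return entities


# Illustrative name list; a real one would be loaded from a file.
names = {"George", "Isla", "Olivia"}
found = find_names("hello my name is George", names)
```

A set lookup per word is fine for short utterances. A trie-based matcher such as flashtext becomes attractive when the list is large or contains multi-word names, which this sketch does not handle.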

digitalWestie commented 3 years ago

Nope, pip says I'm using spaCy 2.3.4

Interesting you mention the names, we could easily put together a UK-centric list using the Babies First Names' datasets offered by National Records - https://www.nrscotland.gov.uk/statistics-and-data/statistics/statistics-by-theme/vital-events/names/babies-first-names

koaning commented 3 years ago

I'll try to look into this bug tomorrow, but in the meantime -> if you're interested in submitting a diverse name-list for the UK we'd welcome any PR 👍

koaning commented 3 years ago

Interesting. I'm also getting the same error.

File "/Users/vincent/Development/rasalit/venv/lib/python3.7/site-packages/streamlit/script_runner.py", line 324, in _run_script
    exec(code, module.__dict__)
File "/Users/vincent/Development/rasalit/rasalit/apps/livenlu/app.py", line 32, in <module>
    interpreter=interpreter, text_input=text_input
File "/Users/vincent/Development/rasalit/rasalit/apps/livenlu/common.py", line 29, in fetch_info_from_message
    element.process(msg)
File "/Users/vincent/Development/rasalit/venv/lib/python3.7/site-packages/rasa/nlu/extractors/spacy_entity_extractor.py", line 33, in process
    doc = spacy_nlp(message.get(TEXT))

Will investigate. Removing the SpacyEntityExtractor makes everything work again, but I'm curious to find out if there's a bug on Rasa's side or on the side of this repo.
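
The TypeError at doc = spacy_nlp(message.get(TEXT)) is what Python raises when spacy_nlp is None at that call. One plausible cause, offered as an assumption and not verified against the rasalit or Rasa source, is that the loop calling element.process(msg) does not hand the component the shared spaCy model it expects. The toy classes below are hypothetical stand-ins, not Rasa's API; they only reproduce the failure mode and show how forwarding a context dictionary avoids it.

```python
from typing import Any, Dict, List


class ToyEntityExtractor:
    """Stand-in for a component that needs a shared resource passed via kwargs."""

    def process(self, message: Dict[str, Any], **kwargs: Any) -> None:
        nlp = kwargs.get("spacy_nlp")
        # If no caller supplied `spacy_nlp`, nlp is None and the call below
        # raises TypeError: 'NoneType' object is not callable.
        message["tokens"] = nlp(message["text"])


def run_without_context(components: List[Any], text: str) -> Dict[str, Any]:
    """Call each component with the message only (no kwargs forwarded)."""
    message: Dict[str, Any] = {"text": text}
    for component in components:
        component.process(message)
    return message


def run_with_context(
    components: List[Any], text: str, context: Dict[str, Any]
) -> Dict[str, Any]:
    """Call each component and forward the shared context as kwargs."""
    message: Dict[str, Any] = {"text": text}
    for component in components:
        component.process(message, **context)
    return message


pipeline = [ToyEntityExtractor()]
# Toy "model": a whitespace tokenizer standing in for a loaded spaCy pipeline.
context = {"spacy_nlp": str.split}

try:
    run_without_context(pipeline, "hello my name is George")
    error_text = ""
except TypeError as exc:
    error_text = str(exc)

result = run_with_context(pipeline, "hello my name is George", context)
```

In this toy setup the first call fails with the same message as the traceback, while the second succeeds because the shared object reaches the component.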

koaning commented 3 years ago

It's on this side, because I'm able to run this without the error:

> rasa shell nlu
hello my name is George

{
...
"entities": [
    {
      "entity": "PERSON",
      "value": "George",
      "start": 17,
      "confidence": null,
      "end": 23,
      "extractor": "SpacyEntityExtractor"
    }
  ]
}

koaning commented 3 years ago

Working on a solution https://github.com/RasaHQ/rasalit/pull/29

koaning commented 3 years ago

@digitalWestie I think I've merged a fix now. Let me know if it doesn't work and I'll gladly re-open the issue.

digitalWestie commented 3 years ago

Fantastic. I'll try it out and let you know.

digitalWestie commented 3 years ago

Working great! Thanks for this 👍 🥇

bshivkumar15 commented 3 years ago

@koaning I have used the latest build but get the same error when using spaCy.

Error.txt

koaning commented 3 years ago

@bshivkumar15 could you share the entire traceback? It indeed seems like something is happening over at the spaCy interface, but the actual error itself doesn't seem to be listed. Also, could you share your config.yml? Are you using a custom spaCy model here? Could you also share your spaCy version as well as the output of rasa --version?

bshivkumar15 commented 3 years ago

@koaning Sharing requested details

config.txt

Rasa Version : 2.0.2
Rasa SDK Version : 2.0.0
Rasa X Version : 0.34.0
Python Version : 3.8.5
Operating System : Linux-5.8.0-41-generic-x86_64-with-glibc2.29
Python Path : /home/shiva/Documents/Shiva/venv/bin/python3

============================== Info about spaCy ==============================

spaCy version 3.0.3
Location /home/shiva/Documents/Shiva/venv/lib/python3.8/site-packages/spacy
Platform Linux-5.8.0-41-generic-x86_64-with-glibc2.29
Python version 3.8.5
Pipelines

koaning commented 3 years ago

Then I think I've found the problem. You're using spaCy version 3.0.3, which isn't supported by Rasa yet. There is an open pull request for it: https://github.com/RasaHQ/rasa/pull/7869. For now, the simplest fix would be to explicitly add "en_core_web_md" as the spaCy model in the config. That should fix most issues.

Also, I'd still appreciate a full traceback just to confirm that this is the actual issue.
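
A quick way to confirm which spaCy version is installed before debugging further, as a minimal sketch. It assumes Python 3.8+ (for importlib.metadata) and treats "major version 2" as the supported range purely on the basis of this thread. The helper names installed_version and is_spacy_2 are illustrative.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_spacy_2(version: str) -> bool:
    """Return True if a version string such as '2.3.4' has major version 2."""
    return version.split(".")[0] == "2"


spacy_version = installed_version("spacy")
if spacy_version is None:
    status = "spaCy is not installed in this environment"
elif is_spacy_2(spacy_version):
    status = f"spaCy {spacy_version}: 2.x series"
else:
    status = f"spaCy {spacy_version}: not 2.x, check Rasa compatibility"
print(status)
```

Run it with the same interpreter that launches rasalit, since a virtual environment and the system Python can have different spaCy versions installed.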

bshivkumar15 commented 3 years ago

@koaning Thanks a lot, adding en_core_web_md resolved the issue.

bshivkumar15 commented 3 years ago

Rasa NLU Model Playground.txt

Rasa Spelling Playground.txt

I still get the "en" error when using other options.

koaning commented 3 years ago

@bshivkumar15 I'm trying to understand what is going wrong here but I require more context. You sent me two files, what's the difference between the two? Also, could you share your config.yml?

bshivkumar15 commented 3 years ago

@koaning Apologies for being vague. The attached files are the output I get when I run the spelling and live-nlu options. The same config file works for gridresults after adding the model: "en_core_web_sm" line.

config-spacy.txt

koaning commented 3 years ago

Just to check something else then, since now I'm confused. Is this still an issue? You're using a spaCy version (3.0) that Rasa currently doesn't support but you seem to be able to fix it with a change to your config.yml. From my end, you're not describing an issue that is related to this library.

bshivkumar15 commented 3 years ago

@koaning

Issue #1

ModuleNotFoundError: No module named 'en' when running "gridresults"

Issue status: RESOLVED after adding the model: "en_core_web_sm" line to config.yml

Issue #2 / Issue #3

When using the modified config.yml (with the model: "en_core_web_sm" line):

Get ModuleNotFoundError: No module named 'en' when running

Hope this helps. If you are suggesting that issues #2/#3 are related to the spaCy version, I just wanted to understand whether they can be fixed like #1 or whether this is a limitation.

koaning commented 3 years ago

Just to check, after modifying the config.yml did you also retrain a new model and are you using that model in rasalit?

koaning commented 3 years ago

Considering the feedback in https://github.com/RasaHQ/rasalit/issues/40, I think this issue is fixed. If not, let's open a new issue that is specific to your problem.