synesthesiam / rhasspy

Rhasspy voice assistant for offline home automation
https://rhasspy.readthedocs.io
MIT License

Problem with training data created for rasa from sentences.ini #187

Closed paddsen closed 4 years ago

paddsen commented 4 years ago

I have been having problems with the creation of Rasa training data from sentences.ini since the last change to intent_train.py (fa24588). Only the first entity gets used by Rasa, most likely because the generated nlu.md only contains examples annotated with the first entity.

Example intent from my sentences.ini (simplified):

    [ChangeItemState]
    item_name = (wohnzimmerlampe | garagenlicht | wandlicht | deckenlicht)
    item_state = (ein | an | aus | aktiviere | deaktiviere)
    item_where = $where_type
    schalte (die | das) <item_name>{ChangeItemState_item} [(in|im)] [(der | dem )] <item_where>{ChangeItemState_where} <item_state>{ChangeItemState_state}

paddsen commented 4 years ago

The resulting nlu.md will be:

    ## intent:ChangeItemState
    - schalte die [deckenlampe](ChangeItemState_item) in dem kinderzimmer ein
    - schalte die [deckenlampe](ChangeItemState_item) in dem kinderzimmer an
    - schalte die [deckenlampe](ChangeItemState_item) in dem kinderzimmer aus

The slots/entities for ChangeItemState_state and ChangeItemState_where are missing.

paddsen commented 4 years ago

OK, I am pretty sure it has to do with the increment of "raw_index" in intent_train.py. If I decrement "raw_index" by one whenever the "token" is an entity, the resulting markdown looks OK:

    - schalte das [deckenlicht](ChangeItemState_item) im [kinderzimmer](ChangeItemState_where) [aus](ChangeItemState_state)
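To make the expected output format concrete, here is a minimal sketch of how tagged tokens could be rendered as a Rasa markdown training example. It is not the code from intent_train.py: the function name `to_rasa_markdown` and the `(word, entity)` input shape are assumptions made up for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical input shape (not taken from intent_train.py):
# each token is a (word, entity_name) pair, with None for plain words.
TaggedToken = Tuple[str, Optional[str]]


def to_rasa_markdown(tokens: List[TaggedToken]) -> str:
    """Render one training example, wrapping each entity run as [text](entity)."""
    parts: List[str] = []
    run_words: List[str] = []
    run_entity: Optional[str] = None

    def flush() -> None:
        # Emit the words collected so far, annotated if they belong to an entity.
        nonlocal run_words, run_entity
        if run_words:
            text = " ".join(run_words)
            parts.append(f"[{text}]({run_entity})" if run_entity else text)
        run_words, run_entity = [], None

    for word, entity in tokens:
        if entity != run_entity:
            # The entity changed, so close the previous run before starting a new one.
            flush()
            run_entity = entity
        run_words.append(word)
    flush()
    return "- " + " ".join(parts)


example = [
    ("schalte", None),
    ("das", None),
    ("deckenlicht", "ChangeItemState_item"),
    ("im", None),
    ("kinderzimmer", "ChangeItemState_where"),
    ("aus", "ChangeItemState_state"),
]
print(to_rasa_markdown(example))
```

Note that this sketch merges adjacent tokens carrying the same entity name into one annotation, so two directly neighbouring slots of the same type would need an extra boundary marker.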

paddsen commented 4 years ago

Here is what I've added to "intent_train.py":

    if entity:
        # Add to current entity
        entity_tokens.append(token)
        raw_index += -1

Now it works perfectly with complex sentences and even with multiple slots of the same type. I am not 100% sure whether that is the right if-statement, or whether the decrement has to go inside the "if new_entity" statement to handle some boundary conditions.

synesthesiam commented 4 years ago

Hi @paddsen, thanks for the feedback. This has been fixed in master by copying the token variable before it gets modified later on.

paddsen commented 4 years ago

Wonderful! Thanks!