RasaHQ / rasa

💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
https://rasa.com/docs/rasa/
Apache License 2.0
18.81k stars 4.62k forks

simple web app for creating training data #52

Closed amn41 closed 7 years ago

amn41 commented 7 years ago

I'm sure some users will want some kind of GUI for rasa NLU.

We currently have the visualizer, which can render training data. It would be cool to have a simple web app that can create formatted training data files, similar to the wit.ai / LUIS interfaces.

Just comment here if you would like to work on it, or have suggestions on how it should be done

dantodor commented 7 years ago

I would love to help, but my expertise is mainly in Scala/Java and Elixir on the server side ...

amn41 commented 7 years ago

thanks @dantodor - I think this will likely be a javascript app

dantodor commented 7 years ago

I would suggest then Elm :)

benbrown commented 7 years ago

I have 3/4 of this done in a Node app if that would be helpful! I built it as a wrapper to some Node-based NLP tools, but could easily be hooked to rasa!

amn41 commented 7 years ago

cool stuff @benbrown ! do you have a repo you can point to?

azazdeaz commented 7 years ago

Hi, I would be happy to do this one :)

I just started with NLP and chatbots a few months ago, so I'm not an expert, but if you tell me what the app has to do I will implement it with Node.js and React.

If I understand correctly, we need a command line tool that launches a web app on localhost which can edit the trainingdata.json files and maybe run some commands.

You can reach me by email to sort out the details if you like :)

amn41 commented 7 years ago

Hey @azazdeaz ! Sounds awesome 👍 . We had in mind something similar to the wit.ai / LUIS interfaces, where users can

  1. type in sentences to get labeled
  2. choose the intent that matches the sentence
  3. highlight particular words to define them as entities

This data should then get saved to a json file in the format of https://github.com/golastmile/rasa_nlu/blob/master/data/examples/rasa/demo-rasa.json
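The three steps above map fairly directly onto one record per sentence. Here is a minimal Python sketch of that mapping, not the actual implementation of any tool in this thread. The `rasa_nlu_data` / `entity_examples` wrapper and the `text`, `intent`, `entities`, `start`, `end`, `value`, `entity` field names are assumptions about the linked demo file's format, and `make_example` / `to_training_json` are hypothetical helper names.

```python
import json


def make_example(text, intent, highlights):
    """Build one labeled example (hypothetical helper).

    `highlights` maps each highlighted substring to its entity type.
    Field names below are assumed from the legacy rasa NLU JSON format.
    """
    entities = []
    for value, entity_type in highlights.items():
        start = text.find(value)  # first occurrence of the highlighted text
        if start == -1:
            raise ValueError(f"{value!r} does not occur in {text!r}")
        entities.append({
            "start": start,              # inclusive character offset
            "end": start + len(value),   # exclusive character offset
            "value": value,
            "entity": entity_type,
        })
    return {"text": text, "intent": intent, "entities": entities}


def to_training_json(examples):
    """Wrap examples in the assumed top-level structure and serialize."""
    return json.dumps({"rasa_nlu_data": {"entity_examples": examples}}, indent=2)


example = make_example(
    "show me chinese restaurants in the north",  # 1. the typed sentence
    "restaurant_search",                         # 2. the chosen intent
    {"chinese": "cuisine", "north": "location"}, # 3. highlighted words
)
print(to_training_json([example]))
```

A real GUI would take the offsets from the text selection itself rather than from `str.find`, which only locates the first occurrence of a repeated word.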

nmstoker commented 7 years ago

A proper web tool definitely looks like the way forward, but as a stop-gap measure, perhaps something along the lines of this simple spreadsheet would help people with quickly producing formatted training data?

https://docs.google.com/spreadsheets/d/1kZyi68cywCUdgE8P6L61u9SqnaXrKoUFi9F99ghP4Oc/edit?usp=sharing

I hope I've got the format okay (???) It should be accessible to everyone (I think you'll need to save a copy yourself to be able to edit that version)

There's a bit of an inconvenience currently when copying the JSON out to a text editor (it ends up surrounded with double-quotes and the double-quotes in the content get turned into double double-quotes, so this needs to be undone with some quick find-and-replace operations).
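The find-and-replace clean-up described above can also be scripted. This is a small Python sketch, assuming the pasted cell is wrapped in one pair of double quotes with inner quotes doubled, as described; `unquote_spreadsheet_cell` is a hypothetical helper name and the sample string is made up for illustration.

```python
import json


def unquote_spreadsheet_cell(pasted):
    """Undo the quoting a spreadsheet adds when a cell is copied out.

    Strips one pair of surrounding double quotes, then turns every
    doubled double-quote back into a single one.
    """
    pasted = pasted.strip()
    if len(pasted) >= 2 and pasted.startswith('"') and pasted.endswith('"'):
        pasted = pasted[1:-1].replace('""', '"')
    return pasted


# Made-up sample of what ends up in the text editor after copying a cell.
pasted = '"{""text"": ""hello"", ""intent"": ""greet""}"'
cleaned = unquote_spreadsheet_cell(pasted)
print(cleaned)

# Parsing the result is a quick check that the clean-up produced valid JSON.
data = json.loads(cleaned)
```

Text that is not wrapped in quotes is returned unchanged, so the function is safe to run on JSON that was already cleaned.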

ToferC commented 7 years ago

Hey all,

I put together a simple script to create training json data either through a CSV import or via text entry on the command line. It works fine on its own, but would also be pretty simple to convert over to a django app or other web GUI. I'll create a pull request shortly.

ToferC commented 7 years ago

Question: I've written the script in Python 3 to take advantage of starred expressions when reading CSVs. Is this worth putting up?
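For readers unfamiliar with starred expressions, here is a rough Python 3 sketch of how they can help when CSV rows have a variable number of columns. It is not the script mentioned above. The column layout (text, intent, then alternating entity value and entity type) and the output field names are assumptions made for illustration, and `rows_to_examples` is a hypothetical name.

```python
import csv
import io


def rows_to_examples(csv_file):
    """Convert CSV rows into labeled examples (assumed column layout)."""
    examples = []
    for row in csv.reader(csv_file):
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        # Starred unpacking (Python 3 only): the first two columns are
        # fixed, and any remaining columns are collected into `rest`.
        text, intent, *rest = row
        entities = []
        # Assumed layout: trailing columns alternate value, entity type.
        for value, entity_type in zip(rest[0::2], rest[1::2]):
            start = text.find(value)
            if start == -1:
                continue  # value not present in the sentence, ignore it
            entities.append({
                "start": start,
                "end": start + len(value),
                "value": value,
                "entity": entity_type,
            })
        examples.append({"text": text, "intent": intent, "entities": entities})
    return examples


sample = io.StringIO(
    "hello there,greet\n"
    "a table in the centre,restaurant_search,centre,location\n"
)
examples = rows_to_examples(sample)
for example in examples:
    print(example)
```

In Python 2 the same unpacking would need explicit slicing (`row[0]`, `row[1]`, `row[2:]`), which is the trade-off behind the question.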

azazdeaz commented 7 years ago

Hi, I'm done with the basics :)

repo: https://github.com/azazdeaz/rasa-nlu-trainer

online demo: https://azazdeaz.github.io/rasa-nlu-trainer/

For now, it only edits the examples in rasa_nlu_data.entity_examples. I don't know for sure how the JSON file should be structured, so some docs would be very helpful :) (issue)

I'm not using rasa_nlu at the moment, so let me know if you have an idea to make this a better tool :)

nmstoker commented 7 years ago

Looks excellent. The entity selection of text works well, even on mobile (am using Chrome on Android)

ToferC commented 7 years ago

That looks awesome! You can grab/check JSON generation from my script here.

It sets things up in rasa nlu readable format.

https://github.com/ToferC/rasa_nlu/blob/master/generate_rasa_json.py

amn41 commented 7 years ago

this is awesome! going to look into it more closely

ToferC commented 7 years ago

@amn41 - While playing with generating rasa JSON, it occurred to me that adding fields for intents and entity types to the data files would be helpful for organizing projects and architecting workflows. If you're open to this, I can create a new issue and start some testing.

amn41 commented 7 years ago

@ToferC do you mean adding some metadata fields to the json? like a list of all the intents and entities which occur?

ToferC commented 7 years ago

@amn41 - yes. Would help with consistent labelling and in automating training data creation. For example, in my script, I ask for all intents & entities up front to speed labelling as text is entered. In a Web GUI, this info could be pulled into drop downs or auto-suggest fields.

The other solution is to create a set and just add to it as you go through.

azazdeaz commented 7 years ago

Collecting the different intent/entity types from the examples and using them in autocomplete fields is working well in the web app. But if the intents/entities are going to have more complex properties (like synonyms for entity values), then dedicated metadata fields would be very useful.
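The collect-from-examples approach described here can be sketched in a few lines of Python. This is an illustration only, not the web app's code. The `intent`, `entities` and `entity` field names are assumed, and `collect_labels` is a hypothetical helper.

```python
def collect_labels(examples):
    """Gather every intent and entity type that occurs in the examples.

    Sets deduplicate as we go; sorted lists are returned so that a
    drop-down or autocomplete field shows a stable order.
    """
    intents = set()
    entity_types = set()
    for example in examples:
        if example.get("intent"):
            intents.add(example["intent"])
        for entity in example.get("entities", []):
            entity_types.add(entity["entity"])
    return sorted(intents), sorted(entity_types)


# Made-up examples using the assumed field names.
examples = [
    {"text": "hi", "intent": "greet", "entities": []},
    {
        "text": "chinese food in the north",
        "intent": "restaurant_search",
        "entities": [
            {"start": 0, "end": 7, "value": "chinese", "entity": "cuisine"},
            {"start": 20, "end": 25, "value": "north", "entity": "location"},
        ],
    },
    {"text": "hello", "intent": "greet", "entities": []},
]
intents, entity_types = collect_labels(examples)
print(intents)
print(entity_types)
```

Deriving the lists this way needs no extra metadata in the file, but it cannot carry per-label properties such as synonyms, which is where explicit fields would help.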

amn41 commented 7 years ago

so @azazdeaz 's awesome contribution means I think I can close this issue :) further discussions should happen at https://github.com/azazdeaz/rasa-nlu-trainer