Closed by visvamba 4 years ago
Hey @visvamba! Thanks for your question.
Chatette is meant to generate training data (i.e. the nlu.md or nlu.json files in Rasa) from templates, not the other way around. Generally, you'll have to either write your templates yourself (or use someone else's), or write the training data directly.
It is actually possible to turn a training data file into a template file automatically, but only without extracting the structure behind the examples, which wouldn't make much sense. For example, given the following nlu.md:
## intent:ask-food
- I want [food](food)
- I want a little bit of [fish](food)
- I would like [fish](food)
- I would like some [food](food)
you could automatically create this template file:
%[ask-food](4)
    I want @[food]
    I want a little bit of @[food]
    I would like @[food]
    I would like some @[food]

@[food]
    fish
    food
which I assume is not what you want. Given the same training data, the following template would make much more sense, but is not easy to automatically generate -- if possible at all:
%[ask-food](4)
    I want ~[some] @[food]
    I would like ~[some] @[food]

~[some]
    some
    a little bit of

@[food]
    food
    fish
This is obviously a very simple example, but I guess you see the point: making a template which contains each and every example that's in your training data doesn't make much sense.
That said, if you really need it, it shouldn't be too hard to write a small bash script that turns your nlu.md file into a "dumb" template file like the first one I showed you.
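To give an idea of what such a script could look like, here is a minimal sketch, written in Python rather than bash for readability. It is not part of Chatette: the function name `nlu_md_to_dumb_template` is made up for this example, and it assumes the Rasa Markdown layout shown above (`## intent:<name>` headers followed by `- ` example lines with `[value](entity)` annotations). It ignores synonym, regex and lookup sections, does not escape characters that have a special meaning in Chatette templates, and simply uses the number of examples as the number in parentheses after the intent name, as in the example above.

```python
import re

# Matches Rasa Markdown entity annotations such as [fish](food).
ENTITY_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def entity_name(annotation):
    """Drop an optional ':synonym' suffix, e.g. 'city:New York' -> 'city'."""
    return annotation.split(":")[0].strip()


def nlu_md_to_dumb_template(nlu_md):
    """Build a one-rule-per-example template from Rasa Markdown NLU data."""
    intents = {}  # intent name -> list of rules, in file order
    slots = {}    # entity name -> distinct values, in order of first appearance
    current = None

    for raw in nlu_md.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            kind, _, name = line[3:].partition(":")
            # Only intent sections are converted; other sections are skipped.
            current = name.strip() if kind.strip() == "intent" else None
            if current is not None:
                intents.setdefault(current, [])
        elif line.startswith("- ") and current is not None:
            example = line[2:].strip()
            # Remember every annotated value as a possible slot value.
            for value, annotation in ENTITY_RE.findall(example):
                values = slots.setdefault(entity_name(annotation), [])
                if value not in values:
                    values.append(value)
            # Replace each annotation with a reference to the slot.
            rule = ENTITY_RE.sub(
                lambda m: "@[" + entity_name(m.group(2)) + "]", example
            )
            intents[current].append(rule)

    blocks = []
    for name, rules in intents.items():
        if rules:
            header = "%[{}]({})".format(name, len(rules))
            blocks.append("\n".join([header] + ["    " + r for r in rules]))
    for name, values in slots.items():
        blocks.append("\n".join(["@[" + name + "]"] + ["    " + v for v in values]))
    return "\n\n".join(blocks) + "\n"
```

Calling `nlu_md_to_dumb_template(open("data/nlu.md").read())` on the four-example file above should give the same "dumb" template, with the slot values listed in order of first appearance. Note that it keeps one rule per example, so duplicate rules are not merged.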
If you're interested, this problem of extracting structure (i.e. templates) from a list of examples is actually still an active research topic.
All that being said, could you please tell me why you need the reverse of what Chatette does?
Cheers.
Thanks for your explanation. I can see why the reverse process would be better done by hand. I asked because I have a moderately-sized training data set already written for my Rasa model in Markdown, and I think the Chatette format is a lot cleaner and easier to organise.
Is it possible to convert my data/nlu.md or nlu.json into a Chatette file? The base file option only extracts the regex and lookup from Rasa files, from what I can tell.