💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
Issue:
I've been working on a chatbot platform for my company (internal use) and we are using Rasa NLU. We have gained some traction internally, and one of our next use cases to tackle is a Frequently Asked Question bot.
The issue is that they have over 1000 question-and-answer pairs. How do I scale Rasa up correctly to handle such a large, diverse dataset?
Scaling up in terms of what? You'd definitely have to go through those question/answer pairs and label them accordingly. I'd guess a lot of them overlap, so you'd end up with fewer than 1000 intents.
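One way to find that overlap before hand-labelling is to compare the questions pairwise. A minimal sketch (not part of Rasa itself; the questions and threshold are hypothetical) using TF-IDF cosine similarity to flag pairs that likely belong to the same intent:

```python
# Sketch: surface overlapping FAQ questions before hand-labelling intents.
# Pairs with high TF-IDF cosine similarity are candidates to merge into
# one intent, shrinking 1000+ Q/A pairs to far fewer intents.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical questions; in practice, load all 1000+ from the FAQ export.
questions = [
    "How do I reset my password?",
    "I forgot my password, how can I reset it?",
    "What are the office opening hours?",
    "When does the office open?",
]

vectors = TfidfVectorizer().fit_transform(questions)
sim = cosine_similarity(vectors)  # pairwise similarity matrix

# The threshold is a tunable guess; inspect the flagged pairs manually.
THRESHOLD = 0.2
for i in range(len(questions)):
    for j in range(i + 1, len(questions)):
        if sim[i, j] > THRESHOLD:
            print(f"merge candidates: {questions[i]!r} / {questions[j]!r}")
```

Bag-of-words TF-IDF only catches lexical overlap; paraphrases with no shared words would need embedding-based similarity (e.g. the spaCy vectors already in the pipeline) instead.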
Rasa NLU version: 0.12.2
Operating system (windows, osx, ...): Windows Server 2012 R2
Content of model configuration file:

Server YML:

```yaml
language: "en"
pipeline:
- name: "nlp_spacy"
- name: "tokenizer_spacy"
- name: "intent_featurizer_spacy"
- name: "ner_crf"
  features: [["low", "title"], ["bias", "word3"], ["upper", "pos", "pos2"]]
- name: "ner_synonyms"
- name: "intent_classifier_sklearn"
- name: "intent_entity_featurizer_regex"
- name: "ner_duckling"
  dimensions: ["time", "number", "duration"]
```

Training YML:

```yaml
language: "en"
pipeline:
- name: "nlp_spacy"
- name: "tokenizer_spacy"
- name: "intent_featurizer_spacy"
- name: "ner_crf"
  features: [["low", "title"], ["bias", "word3"], ["upper", "pos", "pos2"]]
- name: "ner_synonyms"
- name: "intent_classifier_sklearn"
- name: "intent_entity_featurizer_regex"
- name: "ner_duckling"
  dimensions: ["time", "number", "duration"]
data: {Data Goes Here}
```
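For reference, the training data referenced above is supplied in the Rasa NLU 0.x JSON format, roughly as below (a small hypothetical fragment; the intent name and example texts are illustrative, each FAQ answer would map to one labelled intent):

```json
{
  "rasa_nlu_data": {
    "common_examples": [
      {"text": "how do I reset my password", "intent": "faq_password_reset", "entities": []},
      {"text": "I forgot my password", "intent": "faq_password_reset", "entities": []}
    ]
  }
}
```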