OctoConsulting / octobot

Octo's artificial intelligence chatbot
https://octoconsulting.github.io/octobot/webapp/
Apache License 2.0

Research Tensorflow, Microsoft Azure, and Google Cloud Services #8

Open juliandduque opened 6 years ago

juliandduque commented 6 years ago

Find the following for each:

- Level of complexity
- Compatibility with node.js
- Ability to create chat bots
- What format our knowledge bases need to be in

davidbzhao commented 6 years ago

Notes: Let's not try to compete with Google's or Microsoft's machine learning teams. I don't think we'll win.

Tensorflow

TF is just a Google-developed library mainly used for machine learning.

Complexity

The syntax itself is not hard to pick up, but understanding the underlying concepts and how to design ML algorithms "correctly" does not come with learning TF. A great, basic course in machine learning is Andrew Ng's Coursera course. That's where I started ML.

Chat bots

[Summary: Getting data is harder than writing the chatbot code] I don't think writing TensorFlow-specific code for a chat bot is all that difficult. My first thought is to just use an RNN, or its extension, the LSTM, on user input (and past user inputs if we want contextual answers, though that's 100% a reach goal). However, the hard part of using TF, and machine learning in general, is getting good, clean data. Deep learning models such as LSTMs especially need lots of data to train the network well. Given a list of question/answer pairs, we'd probably need to find a way to programmatically expand our data, using some thesaurus API for example.
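To make the data-expansion idea concrete, here is a minimal sketch in plain Python. It assumes a tiny hand-written `SYNONYMS` table standing in for whatever thesaurus API we end up using; the table, the function names, and the sample pair are all hypothetical. Note that blind substitution can produce awkward variants ("what are the office schedule"), so real data would need some filtering.

```python
from itertools import product

# Hypothetical synonym table; in practice this might be filled from a thesaurus API.
SYNONYMS = {
    "hours": ["hours", "schedule"],
    "office": ["office", "workplace"],
}

def expand_question(question, synonyms=SYNONYMS):
    """Return every variant of `question` obtained by swapping in synonyms."""
    words = question.lower().split()
    # For each word, list its alternatives (or just the word itself).
    options = [synonyms.get(word, [word]) for word in words]
    return [" ".join(choice) for choice in product(*options)]

def expand_pairs(pairs, synonyms=SYNONYMS):
    """Expand (question, answer) pairs; each variant keeps the original answer."""
    return [
        (variant, answer)
        for question, answer in pairs
        for variant in expand_question(question, synonyms)
    ]

pairs = [("what are the office hours", "We are open 9 to 5.")]
expanded = expand_pairs(pairs)
```

With two words that each have two alternatives, the single pair above grows into four pairs that all share the same answer.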

Data format

[Summary: question-answer pairs are fine] We can still use question-answer pairs. Answers can be represented as one-hot vectors. Questions can be preprocessed in whatever manner and then run against a dictionary to generate a one-hot vector for each word.
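A rough sketch of that encoding, again in plain Python with made-up sample pairs. The helper names are hypothetical, preprocessing here is just lowercasing and splitting on whitespace, and skipping words that are missing from the dictionary is one possible choice rather than a requirement.

```python
def build_vocab(questions):
    """Map each distinct word to an integer index, in order of first appearance."""
    vocab = {}
    for question in questions:
        for word in question.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def one_hot(index, size):
    """Return a list of `size` zeros with a single 1 at `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec

def encode_question(question, vocab):
    """One one-hot vector per known word; words not in the vocab are skipped."""
    return [
        one_hot(vocab[word], len(vocab))
        for word in question.lower().split()
        if word in vocab
    ]

pairs = [("where is the office", "answer A"), ("when is lunch", "answer B")]
vocab = build_vocab(q for q, _ in pairs)

# Each distinct answer gets its own one-hot label.
answers = sorted({a for _, a in pairs})
answer_vectors = {a: one_hot(i, len(answers)) for i, a in enumerate(answers)}

encoded = encode_question("where is lunch", vocab)
```

Here `encoded` is a sequence of three 6-dimensional vectors (one per word), which is the kind of input an RNN or LSTM would consume step by step.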

Node.JS

[Summary: Compatible] Google's Cloud Machine Learning Engine can run any tensorflow code we want and we can query it for predictions.

The model could be trained on an AWS EC2 instance with a GPU. The good thing about TensorFlow is that it can be accelerated with a GPU. Temporarily, I can also train it on my laptop; I've got a 960M. From there, the trained model can be pickled and zipped into a Lambda function, which can be called from Node.JS.
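For a sense of what the Lambda side might look like, here is a hedged sketch of a Python handler. The `predict` function is a placeholder for the trained model (a real version would load the serialized model once at import time), `ANSWERS` is invented sample data, and the event shape assumes the function sits behind an API Gateway proxy integration, which is an assumption on my part and not something we have decided.

```python
import json

# Invented sample answers; index i corresponds to answer class i.
ANSWERS = ["We are open 9 to 5.", "The office is on the 3rd floor."]

def predict(question):
    """Placeholder for the trained model: returns the index of the best answer."""
    return 0 if "hours" in question.lower() else 1

def handler(event, context):
    """Lambda entry point: read a question from the request body, return an answer."""
    body = json.loads(event.get("body") or "{}")
    question = body.get("question", "")
    answer = ANSWERS[predict(question)]
    return {"statusCode": 200, "body": json.dumps({"answer": answer})}

# Local smoke test with a hand-built event.
response = handler({"body": json.dumps({"question": "What are your hours?"})}, None)
```

The Node.JS app would then only need to send a JSON body with a `question` field and read `answer` back out of the response.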

Google Cloud Services

Essentially, AWS but Google.

Complexity

[Summary: easy peasy] GCP, like AWS, has products for storage and serverless computing, but also Google has a lot of machine learning and natural language processing APIs that we can utilize. All are well-maintained and well-documented for easy use.

Chat bots

There are two main ways we could integrate GCP into a chat bot.

Data format

We define the intents. Oh gosh, it's all just so pretty.

Node.JS

Here's the Node.JS SDK

Microsoft Azure

A complete framework that supports several programming languages for creating chat bots. Bots can be embedded in websites [via embedded JavaScript generated by their tools] or deployed into various chat applications (GroupMe, Slack...).

Complexity

It is all web based, and bots can be created using node.js. There is a user interface for the tools, and it seems very user friendly. It also has a ton of support.

Chat bots

The service we used at RamHacks, QnAMaker, utilized the Bot Framework, Cognitive Services, and Azure Bot Service.

Data Format

The data explicitly has to be question-and-answer pairs. Azure does intent classification for us when the user inputs a question.

Node.JS

Here is the Node.JS SDK.

Useful links

Introduction to Azure

Building an intelligent chatbot with Azure