JeremyARussell / Lucida

Speech and Vision Based Intelligent Personal Assistant

Lucida

Lucida is a speech- and vision-based intelligent personal assistant inspired by Sirius. Visit our website for a tutorial, and Lucida-users for help. The project is released under the BSD license, except that certain submodules carry their own licensing information. We would love your help improving Lucida; see CONTRIBUTING for details.

Overview

Lucida Local Development

If you want to make contributions to Lucida, please build it locally:

Lucida Docker Deployment

If you want to use Lucida as a web application, please deploy using Docker and Kubernetes:

REST API for command center

The REST API is in active development and may change drastically. It currently supports only infer and learn; other features may be added later. An example client for botframework is available. Information on how to use the API can be found in the wiki.

Design Notes -- How to Add Your Own Service into Lucida?

Back-end Communication

Thrift is an RPC framework that is efficient and language-neutral. It was originally developed by Facebook and is now developed by both the open-source community (Apache Thrift) and Facebook. We use both Apache Thrift and Facebook Thrift because Facebook Thrift has a fully asynchronous C++ server but does not support Java very well. Apache Thrift also seems to be more popular. Therefore, we recommend using Apache Thrift for services written in Python and Java, and Facebook Thrift for services written in C++. However, you can choose either one for your own service as long as you follow the steps below.

One disadvantage of Thrift is that the interface has to be predefined and implemented by each service. If the interface changes, all services have to re-implement it. We try to avoid changing the interface through careful design. If you really need to adapt the interface, feel free to modify it, but make sure that all services implement and use the new interface.

Detailed Instructions

You need to configure the command center (CMD) besides implementing the Thrift interface in order to add your own service into Lucida. Let's break it down into two steps:

1. Implement the Thrift interface jointly defined in lucida/lucidaservice.thrift and lucida/lucidatypes.thrift.

1. lucida/lucidaservice.thrift

  include "lucidatypes.thrift"
  service LucidaService {
      void create(1:string LUCID, 2:lucidatypes.QuerySpec spec);
      void learn(1:string LUCID, 2:lucidatypes.QuerySpec knowledge);
      string infer(1:string LUCID, 2:lucidatypes.QuerySpec query);
  }

The basic functionalities that your service needs to provide are called create, learn, and infer. They all take in the same type of parameters, a string representing the Lucida user ID (LUCID), and a QuerySpec defined in lucida/lucidatypes.thrift. The command center invokes these three procedures implemented by your service, and services can also invoke these procedures on each other to achieve communication. Thus the typical data flow looks like this:

Command Center (CMD) -> Your Own Service (YOS)

But it also can be like this:

Command Center (CMD) -> Your Own Service 0 (YOS0) -> Your Own Service 1 (YOS1) -> Your Own Service 2 (YOS2)

In this scenario, make sure to implement the asynchronous Thrift interface. If YOS0 implements the asynchronous Thrift interface, it won't block while waiting for the response from YOS1. If YOS0 implements the synchronous Thrift interface, it cannot make progress until YOS1 returns the response, so the operating system performs a thread context switch and puts the current thread to sleep until YOS1 returns. See section 3 of step 1 for implementation details.
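The non-blocking behavior described above can be illustrated with a minimal Python sketch. It uses asyncio rather than Thrift, and the functions yos0_infer and yos1_infer are hypothetical stand-ins for two chained services, so treat it as an illustration of the idea and not as Lucida's implementation.

```python
import asyncio
from typing import List


async def yos1_infer(lucid: str, text: str) -> str:
    # Hypothetical downstream service (YOS1); the sleep simulates network
    # and processing delay.
    await asyncio.sleep(0.01)
    return "YOS1({})".format(text)


async def yos0_infer(lucid: str, text: str) -> str:
    # YOS0 awaits YOS1. While it waits, the event loop is free to serve
    # other requests on the same thread instead of putting it to sleep.
    downstream = await yos1_infer(lucid, text)
    return "YOS0({})".format(downstream)


async def main() -> List[str]:
    # Two requests are in flight at once; gather keeps the results in
    # the order the calls were passed in.
    return await asyncio.gather(
        yos0_infer("user_a", "q1"),
        yos0_infer("user_b", "q2"),
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

A synchronous version would handle the second request only after the first chain had fully returned, or would need one thread per request.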

create: create an intelligent instance based on the supplied LUCID. It gives services a chance to warm up the pipeline, but our current services do not need that, so the command center does not currently send a create request. If your service needs to warm up for each user, make sure to modify the command center, as detailed in step 2.

learn: tell the intelligent instance to learn new knowledge based on the data supplied in the query, which usually means the training process. Although it has to be implemented, you can choose to do nothing in the function body if your service cannot learn new knowledge. For example, it may be hard to retrain a DNN model, so the facial recognition service simply prints a message when it receives a learn request. Otherwise, consider using a database system to store the new knowledge. Currently, we use MongoDB to store the text and image knowledge. You need to tell the command center whether to send a learn request to your service, as detailed in step 2.

infer: ask the intelligence to infer using the data supplied in the query, which usually means the predicting process.

Notice that all three functions take a QuerySpec as their second parameter, so let's see what QuerySpec means for each function.

2. lucida/lucidatypes.thrift:

  struct QueryInput {
      1: string type;
      2: list<string> data;
      3: list<string> tags;
  }
  struct QuerySpec {
      1: string name;
      2: list<QueryInput> content;
  }

A QuerySpec has a name, which is create for create, knowledge for learn, and query for infer. A QuerySpec also has a list of QueryInput called content, which is the data payload. A QueryInput consists of a type, a list of data, and a list of tags.
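To make the payload shape concrete, here is a small Python sketch that mirrors the two structs with dataclasses and builds one QuerySpec per procedure. The dataclasses are stand-ins, not the Thrift-generated types. The helper make_spec, the "text" type value (borrowed from the configuration example), the sample strings, and the empty content for create are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QueryInput:
    # Mirrors struct QueryInput: a type, a list of data, a list of tags.
    type: str
    data: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)


@dataclass
class QuerySpec:
    # Mirrors struct QuerySpec: a name and a list of QueryInput.
    name: str
    content: List[QueryInput] = field(default_factory=list)


def make_spec(name: str, input_type: str, data: List[str],
              tags: Optional[List[str]] = None) -> QuerySpec:
    """Hypothetical helper: wrap a single QueryInput in a named QuerySpec."""
    query_input = QueryInput(type=input_type, data=list(data),
                             tags=list(tags or []))
    return QuerySpec(name=name, content=[query_input])


# The name follows the procedure being called:
# create -> "create", learn -> "knowledge", infer -> "query".
create_spec = QuerySpec(name="create")  # empty content is an assumption
knowledge = make_spec("knowledge", "text", ["Lucida is inspired by Sirius."])
query = make_spec("query", "text", ["What is Lucida inspired by?"])
```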

3. Here are the code examples that you can use for your own service:

If it is written in C++, refer to the code in lucida/imagematching/opencv_imm/server/. Look at the Makefile for how to generate the Thrift stubs, which are the abstract base classes your handlers need to inherit from. Notice that the interface is implemented in IMMHandler.h and IMMHandler.cpp, and the entry point (which uses a multi-threaded server provided by Thrift) is in IMMServer.cpp.

If it is written in Java, refer to the code in lucida/calendar/src/main/java/calendar/ and lucida/calendar/. Look at the Makefile for how to generate the Thrift stubs, which are the interfaces your handlers need to implement. Notice that the interface is implemented in CAServiceHandler.java, and the entry point (which uses a multi-threaded server provided by Thrift) is in CalendarDaemon.java.

If it is written in other programming languages, please refer to the official tutorial.
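For orientation, the shape of a handler can be sketched in plain Python without any Thrift-generated code. EchoHandler and the spec helper below are hypothetical, and the QuerySpec objects are duck-typed stand-ins built with SimpleNamespace. A real service would use the generated stubs and a Thrift server instead.

```python
from types import SimpleNamespace
from typing import Dict, List


class EchoHandler:
    """Hypothetical text service exposing create, learn, and infer."""

    def __init__(self) -> None:
        # Maps a LUCID to the strings learned for that user. A real
        # service might use a database here instead of memory.
        self._knowledge: Dict[str, List[str]] = {}

    def create(self, LUCID: str, spec) -> None:
        # Nothing to warm up; just make sure the user has a store.
        self._knowledge.setdefault(LUCID, [])

    def learn(self, LUCID: str, knowledge) -> None:
        # Append every data item of every QueryInput to the user's store.
        store = self._knowledge.setdefault(LUCID, [])
        for query_input in knowledge.content:
            store.extend(query_input.data)

    def infer(self, LUCID: str, query) -> str:
        # Report how much is known and echo the question back.
        known = self._knowledge.get(LUCID, [])
        asked = [item for query_input in query.content
                 for item in query_input.data]
        return "known={}; asked={}".format(len(known), " | ".join(asked))


def spec(name: str, data: List[str]) -> SimpleNamespace:
    """Build a duck-typed stand-in for a QuerySpec with one text input."""
    query_input = SimpleNamespace(type="text", data=data, tags=[])
    return SimpleNamespace(name=name, content=[query_input])


handler = EchoHandler()
handler.learn("alice", spec("knowledge", ["fact one", "fact two"]))
answer = handler.infer("alice", spec("query", ["hello"]))
print(answer)  # known=2; asked=hello
```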

4. Here is a list of what you need to do for step 1:

2. Configure the command center.

lucida/commandcenter/controllers/Config.py is the only file you must modify, but you may also need to add sample queries to lucida/commandcenter/data/ as training data for the query classifier.

1. Modify the configuration file lucida/commandcenter/controllers/Config.py.

  SERVICES = { 
    'IMM' : Service('IMM', 8082, 'image', 'image'),  # image matching
    'QA' : Service('QA', 8083, 'text', 'text'), # question answering
    'CA' : Service('CA', 8084, 'text', None), # calendar
    }

  CLASSIFIER_DESCRIPTIONS = { 
    'text' : { 'class_QA' :  Graph([Node('QA')]),
                'class_CA' : Graph([Node('CA')]) },
    'image' : { 'class_IMM' : Graph([Node('IMM')]) },
    'text_image' : { 'class_QA': Graph([Node('QA')]),
                      'class_IMM' : Graph([Node('IMM')]), 
                      'class_IMM_QA' : Graph([Node('IMM', [1]), Node('QA')]) } 
    }
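When editing this file, it is easy to reference a service that is not registered or to point a node at an index that does not exist. The following sketch checks the sample configuration for both mistakes. Service, Node, and Graph here are simplified stand-ins and not the real classes from the command center. Their field names (input_type, learn_type, to_indices, node_list) and the reading of a node's list as indices of other nodes in the same graph are assumptions.

```python
from collections import namedtuple
from typing import Dict, List

# Stand-in for the command center's Service class (field names assumed).
Service = namedtuple("Service", ["name", "port", "input_type", "learn_type"])


class Node:
    def __init__(self, service_name, to_indices=None):
        self.service_name = service_name
        self.to_indices = to_indices or []  # assumed: indices of next nodes


class Graph:
    def __init__(self, node_list):
        self.node_list = node_list


SERVICES = {
    'IMM': Service('IMM', 8082, 'image', 'image'),  # image matching
    'QA': Service('QA', 8083, 'text', 'text'),      # question answering
    'CA': Service('CA', 8084, 'text', None),        # calendar
}

CLASSIFIER_DESCRIPTIONS = {
    'text': {'class_QA': Graph([Node('QA')]),
             'class_CA': Graph([Node('CA')])},
    'image': {'class_IMM': Graph([Node('IMM')])},
    'text_image': {'class_QA': Graph([Node('QA')]),
                   'class_IMM': Graph([Node('IMM')]),
                   'class_IMM_QA': Graph([Node('IMM', [1]), Node('QA')])},
}


def check_config(services: Dict[str, Service],
                 descriptions: Dict[str, Dict[str, Graph]]) -> List[str]:
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    for input_type, classes in descriptions.items():
        for class_name, graph in classes.items():
            for node in graph.node_list:
                if node.service_name not in services:
                    problems.append("{}/{}: unknown service {}".format(
                        input_type, class_name, node.service_name))
                for index in node.to_indices:
                    if not 0 <= index < len(graph.node_list):
                        problems.append("{}/{}: bad node index {}".format(
                            input_type, class_name, index))
    return problems


print(check_config(SERVICES, CLASSIFIER_DESCRIPTIONS))
```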

2. Add training data for your own query class.

We have already prepared some sample training data in lucida/commandcenter/data/, but if you need to define a custom type of query that your service can handle, create the following file in that directory:

  class_<NAME_OF_YOUR_QUERY_CLASS>.txt

The file should contain at least 40 pieces of text, each being one way to ask the same question.
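The 40-example minimum can be checked with a short script before retraining the classifier. This sketch assumes one example per non-empty line, which the text above does not state explicitly. The function names are hypothetical.

```python
from pathlib import Path

MIN_EXAMPLES = 40  # minimum number of examples stated above


def count_examples(path) -> int:
    """Count non-empty lines, assuming one example question per line."""
    text = Path(path).read_text(encoding="utf-8")
    return sum(1 for line in text.splitlines() if line.strip())


def has_enough_examples(path, minimum: int = MIN_EXAMPLES) -> bool:
    """Return True if the file holds at least `minimum` examples."""
    return count_examples(path) >= minimum
```

For example, has_enough_examples("lucida/commandcenter/data/class_<NAME_OF_YOUR_QUERY_CLASS>.txt") would report whether your new file meets the threshold once the placeholder is replaced with your class name.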