JpEncausse / SARAH-Client-Windows

14 stars 16 forks source link

S.A.R.A.H.

S.A.R.A.H. is an OpenSource client/server framework to control Internet of Things using Voice, Gesture, Face, QRCode recognition. It is heavily bound to Kinect v1 and v2 SDK.

This project contains C# Client for SARAH. And will communicate with NodeJS Server for SARAH.

License

            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
                    Version 2, December 2004

 Copyright (C) 2012 S.A.R.A.H. <sarah.project@encausse.net>

 Everyone is permitted to copy and distribute verbatim or modified
 copies of this license document, and changing it is allowed as long
 as the name is changed.

            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

  0. You just DO WHAT THE FUCK YOU WANT TO.
 This program is free software. It comes without any warranty, to
 the extent permitted by applicable law. You can redistribute it
 and/or modify it under the terms of the Do What The Fuck You Want
 To Public License, Version 2, as published by S.A.R.A.H. See
 http://www.wtfpl.net/ for more details.

Be Warned: some dependencies may require fees for commercial uses. Kinect XBox 360 can ONLY be used for development purpose

Description

SARAH v4.0 Client is built on an AddOns Architecture sharing common properties. Each AddOn:

Client's Addons are stored in AddOns folder and have no relation with SARAH plugins. The idea is to sandbox features and library into subprojects.

This page describe Client's Core/AddOns specification. SARAH developers should see :

AddOns: Blob

UNDER CONSTRUCTION

This AddOn use EmguCV to Detect and Draw blobs in camera frame according to color, shape, features, ...

AddOns: Body Engine

SARAH provides NBody: an abstract representation of Body / Skeleton implemented by Kinect 1, Kinect 2, and other computer vision tools.

This AddOn draw body parts, joints and relevant areas.

AddOns: Debug

This AddOn implements logging features to output relevant data about SARAH:

Properties

; Path to log file
log-file=AddOns/debug/${shortdate}.log

; Path to recognized command
dump=AddOns/debug/dump/${date}.wav

; Path to audio response
voice=AddOns/debug/voice/${date}.wav"

AddOns: Face Recognition

This AddOn use EmguCV to perform Face Detection and Face Recognition. In a Nutshell:

Properties

Choose face detection and recognition algorithms. See also CodeProject article.

; Cascade for Face detection
HaarCascade=AddOns/face/haarcascades/haarcascade_frontalface_default.xml

; Algorithm confidence
Eigen_Threshold=0
Fisher_Threshold=0
LBP_Threshold=30

AddOns: Google Speech

This AddOn use GoogleSpeech API to perform Speech2Text on speech recognitions.

Since 2014 Google narrow it API v2 to 50 req/day ! An API Key MUST be provided to the plugin (See G+ tutorial and explanations).

AddOns: HTTP Engine

This plugin handle HTTP communication with SARAH Server. It provides to other AddOns the ability to send GET/POST/FILE request with a custom token.

Properties

[http.remote]

; Remote address of NodeJS server
server=http://127.0.0.1:8080

[http.local]

; Local HTTP server port
port=8888

temp=AddOns/http/temp/

FIXME

AddOns: IP Camera

UNDER CONSTRUCTION

This AddOn use EmguCV or nVLC to process frames of IP Camera

AddOns: Kinect v1

This AddOn use Kinect SDK v1.8 to start all connected Kinects v1 XBox360 or Windows.

The motion detection stops as much as possible stream when there is no motion. Properties allow control of streams: Audio stream only or custom fps.

FIXME

AddOns: Kinect v2

This AddOn use Kinect SDK v2.0 to start one connected Kinects v2 for Windows.

The motion detection stops as much as possible stream when there is no motion. Properties allow audio stream only.

FIXME

AddOns: Microphone

This AddOn use Microphone to start a SpeechEngine with given confidence.

Properties

; Confidence of recognition (from 0 to 1)
confidence=0.7

; Define the audio device index (0 is the default device, see logs)
device=0

FIXME

AddOns: OS Manager

This AddOn handle all OS specific actions

AddOns: Pitch Tracker

This AddOn use the speech audio stream to compute voice pitch and update ProfileManagers.

AddOns: Profile Engine

This AddOn manage profile for people playing with SARAH.

FIXME

AddOns: Prominent Color

This AddOn compute color frame of each device to draw most prominent color.

FIXME

AddOns: QRCode

UNDER CONSTRUCTION

AddOns: RTP Client

This AddOn start an RTP Client on a given Thread to listen to inbound stream

Can be called on RaspberryPi using ffmpeg

ffmpeg -ac 1 -f alsa -i hw:1,0 -ar 16000 -acodec pcm_s16le -f rtp rtp://192.168.0.8:7887
avconv -f alsa -ac 1 -i hw:0,0 -acodec mp2 -b 64k -f rtp rtp://{IP of your laptop}:1234

Or using Kinect on windows

ffmpeg  -f dshow  -i audio="Réseau de microphones (Kinect U"   -ar 16000 -acodec pcm_s16le -f rtp rtp://127.0.0.1:7887

Properties

; Confidence of recognition (from 0 to 1)
confidence=0.6

; Define the RTP Port
port=7887

AddOns: Speaker Engine

This AddOn manage speaker to play audio stream:

Properties

; The device index to play on (use -1 for default device)
device=-1

; Volume level of device
volume=50

; Timeout in seconds to stop the player
timeout=480

FIXME

AddOns: Speech Engine

This AddOn build a Speech Engine for each provided audio stream (Kinect, Microphone, ...).

FIXME

AddOns: Voice Engine

This AddOn perform text to speech according to given actions:

FIXME

AddOns: WebSocket

UNDER CONSTRUCTION

AddOns: Window Manager

This AddOn provide a window (and a menu item) to display other AddOns data.