# Let's build a conversational engine so we can talk to our computers!

Demo with audio
Is this project useful to you? Give it a star ⭐!
So far we have only tested this on Linux + CUDA. The project is still at an early stage and requires a lot of elbow grease to get running. We'll keep making it better as time goes on!
Wed Jun 21 2023
In its current state, the intended audience for this project is people who are comfortable with hacking things together.
## Setup

```sh
chmod 775 build.sh
./build.sh
```
If you would like to install piper automatically (this downloads the piper binaries and the default TTS model):

```sh
source install_piper.sh true $([ -n "$BASH" ] && echo 1 || echo 2)
```
**WARNING:** the script will move any existing `config.json` to `config.json.bkp` and create a new one in its place.
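Here's a quick simulation of that backup behaviour in a throwaway directory, including how to get your old settings back (the JSON contents are made up for the demo):

```shell
# Simulate the installer's backup step in a scratch directory.
tmp=$(mktemp -d) && cd "$tmp"
echo '{"note": "my old settings"}' > config.json
mv config.json config.json.bkp   # what install_piper.sh does to the old file
echo '{}' > config.json          # a fresh default takes its place
cp config.json.bkp config.json   # restore the previous settings if you need them
cat config.json                  # prints {"note": "my old settings"}
```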
After installation, running `piper` from anywhere on your system should work.

## Build

```sh
npm install
```
```sh
git submodule init && git submodule update --recursive
```

Run `npm install` in `whisper.cpp/examples/addon.node`, then build both submodules:

```sh
(cd whisper.cpp && make)
(cd llama.cpp && make)
```

Compile the whisper addon and move it into place:

```sh
npx cmake-js compile --CDWHISPER_CUBLAS="ON" -T whisper-addon -B Release
mv build/Release/* ../bindings/whisper/
```
## Configure

Edit `config.json`:

- Point it to the models you downloaded.
- Set the audio source to `record_audio.sh` to listen from the mic, or to `sample_audio.sh` to use the bundled audio examples.

If `record_audio.sh` is selected, make sure the `sox` package is installed on your system:

```sh
apt install sox libsox-fmt-all
```
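For orientation, a `config.json` along these lines is what the steps above describe. The key names below are illustrative guesses, not the real schema — check the file that `install_piper.sh` generates for the actual keys:

```json
{
  "whisper_model": "./models/ggml-base.en.bin",
  "llama_model": "./models/llama/nous-hermes-13b.ggmlv3.q4_K_S.bin",
  "audio_source": "./record_audio.sh"
}
```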
## Run

Start the llama.cpp server:

```sh
./llama.cpp/build_server/bin/server -m models/llama/nous-hermes-13b.ggmlv3.q4_K_S.bin -c 2048
```
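Once the server is up, you can smoke-test it with llama.cpp's HTTP completion endpoint before starting the client (this assumes the server's default port 8080; adjust if you changed it):

```shell
# Ask the llama.cpp server for a short completion.
# The "|| echo" keeps the command from aborting your shell if the server isn't up.
curl -s http://127.0.0.1:8080/completion \
  -H 'Content-Type: application/json' \
  -d '{"prompt": "Hello", "n_predict": 8}' || echo "server not reachable"
```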
Then, in another terminal, start the client:

```sh
npm run start
```
A Graphviz file, `talk.dot`, will be created when you press Ctrl-C. You can view the graph by running `npm run graph`, which renders an SVG and opens it.
Want to hack on the native addons? Edit:

```sh
vim ./${llama/whisper}/examples/addon.node/addon.cpp
```