tessel / tessel-av

USB Camera, Microphone, MP3 Player and Text Speaker support for Tessel 2.

text-to-speech? #6

Closed dudleyjosh closed 8 years ago

dudleyjosh commented 8 years ago

Any possibility of supporting text-to-speech with npm modules such as say?

rwaldron commented 8 years ago

Once I have Microphone support, you'll be able to pipe the input stream to any processing you want

dudleyjosh commented 8 years ago

I noticed on the npmjs page for say that it requires festival to be installed on Linux to run... I poked around and it seems there is no such opkg package... will this be another hurdle? Sorry for all of the questions; I am just trying to learn while figuring out how to accomplish my end goal. Thanks for all of the help thus far.

rwaldron commented 8 years ago

I don't know why I said "Microphone" earlier... I had it backwards.

I just installed espeak on my T2 and it "just worked", so that's pretty awesome. I think a "feature" upgrade for this module would be to allow creating a non-file based instance of Speaker that allows passing arbitrary strings to playback.
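
As a rough illustration of "passing arbitrary strings to playback", here is a minimal sketch that shells out to espeak from Node. The helper names `buildEspeakArgs` and `speak` are hypothetical and are not part of tessel-av; the only things assumed to exist are the `espeak` binary on the PATH and its standard `-a`, `-p`, `-s` and `-v` flags.

```javascript
const { spawn } = require("child_process");

// Hypothetical helper: turn a phrase plus options into espeak CLI arguments.
// -a (amplitude), -p (pitch), -s (speed) and -v (voice) are standard espeak flags.
function buildEspeakArgs(phrase, options = {}) {
  const args = [];
  for (const flag of ["a", "p", "s", "v"]) {
    if (options[flag] !== undefined) {
      args.push(`-${flag}`, String(options[flag]));
    }
  }
  // The phrase is passed as one argument, so no shell quoting is involved.
  args.push(phrase);
  return args;
}

// Hypothetical helper: speak a string by spawning espeak.
// Resolves with the exit code once espeak has finished.
function speak(phrase, options) {
  return new Promise((resolve, reject) => {
    const child = spawn("espeak", buildEspeakArgs(phrase, options));
    child.on("error", reject); // for example, espeak is not installed
    child.on("close", code => resolve(code));
  });
}
```

On a device with espeak installed, `speak("Hello from Tessel", { s: 100 })` would be one way to try it.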

rwaldron commented 8 years ago

But seriously, this is fun as hell. https://twitter.com/rwaldron/status/709848175922823169

dudleyjosh commented 8 years ago

I built a few custom Wi-Fi PA speakers for a manufacturing environment. Right now I am using pre-loaded MP3 files to make timer-based announcements for one of our processes, but it would be awesome if I could just send it a text string and have it speak accordingly. This makes it super easy to share a common speaker with many computers or have one computer use many speakers.

rwaldron commented 8 years ago

I can work on this tomorrow :)

dudleyjosh commented 8 years ago

I really appreciate it... thanks.

rwaldron commented 8 years ago

@io2work I'm pretty pleased with how this came out. I was even able to figure out a way to enforce installation of binary deps via opkg. The first time you run a program that uses Player or Speaker*, the program will attempt to make sure everything is installed; after that it's all good. I didn't get a chance to enforce a network connection, so please make sure your Tessel 2 is connected to an outside network the first time you test a Player or Speaker program. OK, here's the good stuff. I've had a lot of fun with this today:

var os = require("os");
var av = require("tessel-av");

var alphabet = "abcdefghijklmnopqrstuvwxyz".split("");
var speaker = new av.Speaker();

speaker.say(`
  Hello, this is ${os.hostname()}. I'm going to say my A-B-C's now
`);

alphabet.forEach(letter => speaker.say(letter));

speaker.on("lastword", function() {
  this.say("And now I know my A-B-C's");
});

Here it is: https://www.youtube.com/watch?v=NsHsrAUs2CY

* The API formerly at Speaker is now called Player, and Speaker is a text-to-speech API. Old code will still work though, because I care :D
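
The first-run dependency check described above could look something like the following. This is a sketch under stated assumptions, not the code tessel-av actually uses: `planInstall`, `onPath` and `ensureInstalled` are hypothetical names, the package is assumed to share its name with the binary, and `which` is assumed to be available on the device. `opkg update` and `opkg install` are real opkg subcommands.

```javascript
const { spawnSync } = require("child_process");

// Hypothetical helper: list the opkg commands needed for any missing packages.
// `isInstalled` is injected so the decision logic can be exercised off-device.
function planInstall(packages, isInstalled) {
  const missing = packages.filter(name => !isInstalled(name));
  if (missing.length === 0) {
    return [];
  }
  return [["opkg", "update"], ["opkg", "install", ...missing]];
}

// Assumption: a binary counts as installed if `which` finds it on the PATH.
function onPath(binary) {
  return spawnSync("which", [binary]).status === 0;
}

// Hypothetical helper: run the plan. Needs network access if anything is missing.
function ensureInstalled(packages) {
  for (const [cmd, ...args] of planInstall(packages, onPath)) {
    const result = spawnSync(cmd, args, { stdio: "inherit" });
    if (result.status !== 0) {
      throw new Error(`${cmd} ${args.join(" ")} failed`);
    }
  }
}
```

Calling `ensureInstalled(["espeak"])` once at startup would then be a no-op on every run after the first successful install.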

dudleyjosh commented 8 years ago

@rwaldron Man, I really appreciate this... I am going to roll my speakers out to production tomorrow with this implemented. I just tested it and it worked perfectly :)

rwaldron commented 8 years ago

Yesssss!

dudleyjosh commented 8 years ago

Been using it all day!!!

I have a suggestion for the API that would make it a little more consistent between speaker and player.

Instead of...

var sound = new av.Player(mp3);
sound.play();

maybe...

var player = new av.Player();
player.play(mp3);

I will have 4 of my setups deployed by the end of next week... thanks to your efforts!

rwaldron commented 8 years ago

I also spent some time thinking about that both yesterday and today, here's the rationale that I came up with:

However... since I'm now not the only one that's thought about this, I will revisit.

rwaldron commented 8 years ago

Also...

Been using it all day!!!

AMAZING :D

I would love a video or something that catches it in action

rwaldron commented 8 years ago

player.play(mp3);

Well... I just hit a snag, probably still something we can do, but play already accepts a time code to start from.

I'll keep thinking about this.
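
One way around the snag is to tell the two call shapes apart by argument type. The sketch below is hypothetical and is not how tessel-av resolves it; it assumes a file is always passed as a string and a time code is always a number of seconds. If time codes can also be strings, the two cases can no longer be separated this way.

```javascript
// Hypothetical argument normalizer for play().
// Assumption: a string is a file path, a number is a start time in seconds.
function normalizePlayArgs(fileOrSeconds, seconds) {
  if (typeof fileOrSeconds === "string") {
    // play(file) or play(file, seconds)
    return {
      file: fileOrSeconds,
      at: typeof seconds === "number" ? seconds : 0
    };
  }
  if (typeof fileOrSeconds === "number") {
    // play(seconds): keep whatever file the player already has
    return { file: null, at: fileOrSeconds };
  }
  // play(): current file, from the start
  return { file: null, at: 0 };
}
```

Here `file: null` stands for "use the file given to the constructor", so existing `new av.Player(mp3)` code would keep working.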

dudleyjosh commented 8 years ago

I would love a video or something that catches it in action

I was just waiting until I have it properly mounted... I will certainly take a video of it in production and share it with you :)

The other engineers and I had way too much fun with it today sending it "inappropriate" text strings. By the way, the queue feature rocks... I was planning to create that capability myself, but it worked just as I needed it to from the get-go :)
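
For readers curious what such a queue involves, here is a minimal stand-in built on Node's EventEmitter. It is not tessel-av's implementation: `PhraseQueue` and the injected `speakOne` function are hypothetical, and the "empty" event name is borrowed from later in this thread.

```javascript
const { EventEmitter } = require("events");

// Stand-in for a queued speaker. `speakOne` is a hypothetical async function
// (for example, one that spawns espeak) that resolves when a phrase finishes.
class PhraseQueue extends EventEmitter {
  constructor(speakOne) {
    super();
    this.speakOne = speakOne;
    this.queue = [];
    this.busy = false;
  }

  // Queue a phrase; start draining if nothing is being spoken right now.
  say(phrase) {
    this.queue.push(phrase);
    if (!this.busy) {
      this.drain().catch(err => this.emit("error", err));
    }
    return this;
  }

  // Speak queued phrases one at a time, in order.
  async drain() {
    this.busy = true;
    try {
      while (this.queue.length > 0) {
        await this.speakOne(this.queue.shift());
      }
    } finally {
      this.busy = false;
    }
    this.emit("empty"); // every queued phrase has been spoken
  }
}
```

Because phrases are only ever started from inside `drain`, calls to `say` made while something is playing simply wait their turn.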

dudleyjosh commented 8 years ago

Here is rev1 code to run it...

var os = require('os');
var path = require('path');

var getmac = require('getmac');
var macAddress = '';
getmac.getMac(function(err, value) {
    macAddress = value;
});

var ip = require('ip');
var ipAddress = ip.address();

var port = 8080;

var router = require('tiny-router');
var av = require('tessel-av');

var speaker = new av.Speaker();
var wordsPerMinute = 100;

router
    .get('/', function(req, res) {   // this is my lazy way of reminding me of the API
        res.send({
            description: 'Simple WiFi Speaker API',
            hostname: os.hostname(),
            url: os.hostname().toLowerCase() + '.[your web address].com',
            ip: ipAddress,
            port: port,
            macAddress: macAddress,
            API: {
                play: {
                    route: '/play/{options}',
                    options: {
                        path: 'path to file (notifications/)',
                        filename: 'file.mp3 (with or without the .mp3 extension)'
                    }
                },
                notifications: {
                    route: '/notifications/{filename}',
                    filename: 'with or without .mp3 extension'
                },
                speak: {
                    route: '/speak/{text}',
                    text: 'text string to be spoken'
                },
                abc: {
                    route: '/abc'
                }
            }
        });
    })
    .get('/play/{options}', function(req, res){

        var options = JSON.parse(decodeURIComponent(req.body.options));

        if ( options.filename.indexOf('.mp3') == -1 ) {
            options.filename += '.mp3';
        }

        var mp3 = path.join(__dirname, options.path + options.filename);
        //console.log('pathMP3', mp3);

        var player = new av.Player(mp3);

        player.on('end', function() {
            res.send({"pathMP3": mp3});
        });

        player.play();

    })
    .post('/play/file/', function(req, res) {
        //handle POST here
    })
    .get('/notifications/{filename}', function(req, res){

        var filename = decodeURIComponent(req.body.filename);

        if ( filename.indexOf('.mp3') == -1 ) {
            filename += '.mp3';
        }

        var mp3 = path.join(__dirname, 'notifications/' + filename);
        //console.log('pathMP3', mp3);

        var player = new av.Player(mp3);

        player.on('end', function() {
            res.send({"pathMP3": mp3});
        });

        player.play();

    })
    .get('/speak/{text}', function(req, res) {

        var text = decodeURIComponent(req.body.text);
        //console.log(text);

        speaker.say({
            phrase: text,
            s: wordsPerMinute
        });

        res.send(text);

    })
    .get('/abc', function(req, res) {

        speaker.say("Hello, this is " + os.hostname() + ". I'm going to say my A-B-C's now.");

        var alphabet = 'abcdefghijklmnopqrstuvwxyz'.split('');
        alphabet.forEach(letter => speaker.say(letter));

        speaker.on('lastword', function() {
            this.say('And now I know my A-B-Cs');
            res.send('Finished with A-B-Cs');
        });

    });

router.listen(port);
console.log('router listening at ip: ' + ipAddress + ' port: ' + port);

speaker.say({
    phrase: 'Hello, ' + os.hostname() + ' is now online!',
    s: wordsPerMinute
});
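
A client for the /speak/{text} route above might look like the sketch below. The helper names `buildSpeakUrl` and `speakRemote` are hypothetical, and the host name in the usage note is a placeholder; only the route shape and port 8080 come from the server code.

```javascript
const http = require("http");

// Hypothetical helper: build the URL for the /speak/{text} route.
// encodeURIComponent mirrors the decodeURIComponent call on the server side.
function buildSpeakUrl(host, port, text) {
  return `http://${host}:${port}/speak/${encodeURIComponent(text)}`;
}

// Hypothetical helper: send the request and resolve with the response body.
function speakRemote(host, port, text) {
  return new Promise((resolve, reject) => {
    http
      .get(buildSpeakUrl(host, port, text), res => {
        let body = "";
        res.setEncoding("utf8");
        res.on("data", chunk => (body += chunk));
        res.on("end", () => resolve(body));
      })
      .on("error", reject);
  });
}
```

For example, `speakRemote("speaker-1.local", 8080, "Timer 2 is done!")` would ask one speaker to make an announcement; calling it against several hosts covers the "one computer, many speakers" case.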

rwaldron commented 8 years ago

I like how you included the "ABCs" :D

rwaldron commented 8 years ago

@io2work I just released some updates that meet the needs you described above. Can you try it all out? The readme has all the info on how to use the new features

rwaldron commented 8 years ago

Of course, you can also just ask me anything directly :)

dudleyjosh commented 8 years ago

I've already tested it and deployed it to all three speakers! Thanks!!!

I should have them mounted in the labs this week and I will get you a video of them working in a live production environment once they are mounted.

rwaldron commented 8 years ago

Do the new capabilities improve the program in the way you'd hoped? Just trying to make sure the APIs are as "ideal" as they can be :)

dudleyjosh commented 8 years ago

I feel like I'm just being picky at this point but the one thing I see that would be an improvement is if you broke up the phrase/file and the options when calling say/play.

Instead of...

speaker.say({
  phrase: 'Hello!',
  a: 10,
  r: 50,
});

player.play({
  file: 'foo.mp3',
  a: 10,
  p: 2,
});

maybe this...

speaker.say(phrase, {options});

player.play(file, {options});

dudleyjosh commented 8 years ago

FYI, I changed my username from @io2work to @dudleyjosh so it would be consistent between here, slack and elsewhere.

rwaldron commented 8 years ago

Thanks for the heads up :)

rwaldron commented 8 years ago

I feel like I'm just being picky at this point but the one thing I see that would be an improvement is if you broke up the phrase/file and the options when calling say/play.

Let me think about this for a bit ;)
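
Supporting the suggested signature does not have to break the existing one. A hypothetical normalizer (not taken from tessel-av) could accept both shapes and reduce them to a single options object:

```javascript
// Hypothetical normalizer: accepts either
//   say("Hello!", { a: 10 })   or   say({ phrase: "Hello!", a: 10 })
// and always returns one options object with a `phrase` key.
function normalizeSayArgs(phraseOrOptions, options = {}) {
  if (typeof phraseOrOptions === "string") {
    return Object.assign({}, options, { phrase: phraseOrOptions });
  }
  // Copy so the caller's object is never mutated.
  return Object.assign({}, phraseOrOptions);
}
```

The same approach would apply to play, with `file` in place of `phrase`.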

dudleyjosh commented 8 years ago

I've got a couple more thoughts about the API.

  1. Can the Player() have the same queue functionality as Speaker()?
  2. speaker.on('lastword') doesn't seem to be working (at least not consistently)? I was able to achieve the same thing with speaker.once('empty') since it is set with each http request. In fact, I discovered that all of my events should have been set with .once instead of .on since I was setting them with each http request.
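
The difference between .on and .once when a listener is registered inside every request handler can be shown with a plain EventEmitter standing in for the shared speaker. Everything here is illustrative; `handleRequest` and `simulate` are hypothetical names.

```javascript
const { EventEmitter } = require("events");

// Stand-in for the shared speaker; only the event behavior matters here.
const speaker = new EventEmitter();

// Simulate a request handler that registers a listener on every call.
// `register` is either "on" or "once".
function handleRequest(register, log, id) {
  speaker[register]("empty", () => log.push(id));
}

// Two requests in a row, with the queue draining after each one.
function simulate(register) {
  speaker.removeAllListeners("empty");
  const log = [];
  handleRequest(register, log, "request 1");
  speaker.emit("empty");
  handleRequest(register, log, "request 2");
  speaker.emit("empty");
  return log;
}
```

With "on", the listener from the first request is still attached when the second request finishes, so it runs again; in a real handler that would mean calling res.send on a response that has already been sent. With "once", each listener runs exactly one time.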

dudleyjosh commented 8 years ago

Here are a couple of videos of the system in action...

TimerBoard with audio in background

Speaker

dudleyjosh commented 8 years ago

I think I finally arrived at my default eSpeak settings.

var eSpeakOptions = {
        amplitude: 120, // eSpeak default = 100 (0 to 200)
        pitch: 50, // eSpeak default = 50 (0 to 99)
        speed: 120, // eSpeak default = 175 (80 to 450)
        voice: 'en+f2'
};
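
Since the settings arrive over HTTP in this setup, it may be worth clamping them before they reach eSpeak. The sketch below uses the ranges quoted in the comments above; `RANGES`, `clamp` and `toEspeakFlags` are hypothetical names, and the mapping to `-a`, `-p`, `-s` and `-v` follows the standard espeak command-line flags rather than anything tessel-av documents.

```javascript
// Ranges copied from the comments in the settings above.
const RANGES = {
  amplitude: { min: 0, max: 200, flag: "-a" },
  pitch: { min: 0, max: 99, flag: "-p" },
  speed: { min: 80, max: 450, flag: "-s" }
};

// Limit a number to the inclusive range [min, max].
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Hypothetical helper: map long option names to espeak flags,
// clamping numeric values to their allowed range.
function toEspeakFlags(options) {
  const args = [];
  for (const [name, { min, max, flag }] of Object.entries(RANGES)) {
    if (typeof options[name] === "number") {
      args.push(flag, String(clamp(options[name], min, max)));
    }
  }
  if (options.voice) {
    args.push("-v", options.voice);
  }
  return args;
}
```

Out-of-range values are pulled back to the nearest limit instead of being rejected, which keeps a mistyped request from producing unintelligible speech.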

rwaldron commented 8 years ago

These videos are AMAZING!

Are we ready to close this?

dudleyjosh commented 8 years ago

Yeah, this can certainly be closed.

rwaldron commented 8 years ago

Thanks again for filing this and for your patience as we worked through it :)