Open mpogue2 opened 1 year ago
Nice idea. Keep in mind that we usually call over music. That means a lot of noise for the speech recognition system, which could badly degrade the results. If you plan to use an iPad with its built-in microphone, the room will add reverb and delays that the system has to deal with as well. Recognition results will be much better if you get a direct connection/bypass to the caller's microphone. Not to mention callers with strange accents ('squirrel thru').
It's not, however, that hard to get a feed of just the mic vocals out of most sound systems. Hilton has a mic-only output, and most callers I know who use non-Hilton setups have a mixing channel just for vocals, so that they can feed hearing enhancement systems, if nothing else.
@Gero5 I am pretty familiar with the challenges around speech recognition and word spotting. I worked for 10 years on the Alexa series of products from Amazon, which make extensive use of both. :-) :-)
In my ideal world, I think I'd have my iPad listening for calls and automatically checking them off, while I am running SquareDesk on my MacBook. I sight call at the SSD and Plus levels, and it would be super helpful to maintain that list while I am calling...
Right now I do have a checklist that runs in a browser on my iPad (located here: https://zenstarstudio.com/helper/pwa1.13/#/check, and source code located here: https://github.com/mpogue2/helper), if you ever want to try it out.
@danlyke I currently use a Yamaha MG10XU for a mixer, and I currently output L and R with an external balanced mixer cable, so that stereo music inputs (e.g. from the round dance cuer, or from another caller's Windows laptop running SquareView -- both are stereo outputs) are mixed down to two mono balanced outputs that go to two places: 1) to the QSC K8 (over a balanced line), and 2) to the Williams hearing assist transmitter (which needs both music and voice). So, essentially I don't currently have a way to get a voice-only channel out of the mixer (I suppose I could make one, with a different output cable, and by forcing guest callers/cuers to use mono mode). Can you think of a better way?
I need to open up my case and play with my little Yamaha MG06 (I think), but it has XLR and 1/4" jacks for the stereo out, and a 1/4" headphone jack. I use the dual inputs into the K8 speaker as my second-stage mixer, so the Yamaha is entirely voice, and the headphone jack feeds the hearing-impaired transmitter.
A quick search suggests that at least one person is using both the XLR and 1/4" outputs of the MG10XU, with the issue that gain into the back of the K8 would be the same as whatever was sent to the computer, but since there's an additional adjustable gain stage on the K8, there may be a way to work around that.
@mpogue2 I do not know your mixer. From the pictures it looks like the Yamaha MG10XU has an AUX channel. Is it possible to put only your voice on that channel and use the AUX output for your voice recognition purposes? On my mixer that channel is independent of the mixed output, AFAIK.
@Gero5 The AUX OUT is replaced by FX OUT on this mixer.
@danlyke Maybe there's a way to do what you suggest, and use the external balanced mixer cable to mix for the Williams transmitter and the K8 powered speaker (mixes L and R), while taking just the L channel via the L phone jack out (with the pan control for my mic channel set entirely to L). I could then route that to my iPad somehow (via mic in, perhaps?).
I really want a way for the computer to keep track of which calls I haven't used, so that I don't have to do it manually from memory. Square Desk Tip Planner has a page, for example, that allows me to "check off" calls on the Plus list (and some from the MS list) that I've used. But, this takes time during the break, and I have to remember what I called.
I would like a "word spotter" (this is a technical term in the Automated Speech Recognition community) that spots calls, and checks them off the list automatically when I've used them. Optionally, removes them from the list, and just shows the calls that I have NOT used yet. Or maybe it just moves them to the bottom of the list.
That way, I can try to meet my goal of calling a wide variety of Plus (and MS) calls each dance. For example, I keep forgetting to call dopaso, and so dancers are probably very rusty on that one.
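To make the check-off idea a bit more concrete, here is a rough Python sketch of just the list-keeping half. It assumes some recognizer (Whisper or anything else) is already handing over transcript text; that part is not shown. The names `normalize`, `spot_calls`, and `CallChecklist` are made up for illustration, and the matching is plain exact-phrase lookup, so a real version would presumably need aliases for spelling variants and misrecognitions.

```python
import re


def normalize(text):
    """Lowercase and reduce every run of non-alphanumerics to a single space."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def spot_calls(transcript, call_list):
    """Return the calls whose words appear contiguously, in order, in the transcript."""
    padded = f" {normalize(transcript)} "
    return [call for call in call_list if f" {normalize(call)} " in padded]


class CallChecklist:
    """Hypothetical checklist: tracks which calls have been heard so far."""

    def __init__(self, calls):
        self.used = {call: False for call in calls}

    def update(self, transcript):
        """Check off every call spotted in this chunk of transcript."""
        for call in spot_calls(transcript, list(self.used)):
            self.used[call] = True

    def unused(self):
        """Calls not heard yet, in original list order."""
        return [call for call, seen in self.used.items() if not seen]

    def ordered(self):
        """Whole list with used calls moved to the bottom (stable sort)."""
        return sorted(self.used, key=lambda call: self.used[call])


if __name__ == "__main__":
    checklist = CallChecklist(["Tag the Line", "Dopaso", "Relay the Deucey"])
    checklist.update("heads square thru four, tag the line, face in")
    print(checklist.unused())
    print(checklist.ordered())
```

`unused()` covers the "just show what I have NOT called" option, and `ordered()` covers the "move them to the bottom" option, so either display could be driven from the same state.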
It could also keep stats on how many of each call I tend to use, to help flag calls that other callers use more often than I do (Tag the Line, I'm lookin' at YOU).
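The stats idea could be sketched along the same lines. This assumes the transcript text for a dance is already available, and that there is some reference table of how often other callers use each call; no such table exists in this issue, so the numbers below are invented purely for the example. `count_calls`, `underused`, and the `ratio` threshold are hypothetical names.

```python
import re
from collections import Counter


def _norm(text):
    """Lowercase and reduce every run of non-alphanumerics to a single space."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def count_calls(transcript, call_list):
    """Count non-overlapping whole-phrase occurrences of each call in the transcript."""
    text = _norm(transcript)
    counts = Counter()
    for call in call_list:
        pattern = rf"\b{re.escape(_norm(call))}\b"
        counts[call] = len(re.findall(pattern, text))
    return counts


def underused(my_counts, reference_counts, ratio=0.5):
    """Calls I used less than `ratio` times as often as the reference count."""
    return sorted(
        call
        for call, ref in reference_counts.items()
        if ref > 0 and my_counts.get(call, 0) < ratio * ref
    )


if __name__ == "__main__":
    calls = ["Tag the Line", "Swing Thru", "Dopaso"]
    transcript = "Tag the line, face right. Swing thru, then swing thru again. Tag the Line!"
    mine = count_calls(transcript, calls)
    # Invented reference numbers, for illustration only.
    others = {"Tag the Line": 6, "Swing Thru": 3, "Dopaso": 1}
    print(dict(mine))
    print(underused(mine, others))
```

Comparing against a ratio rather than a fixed count keeps rarely used calls from being flagged just because everybody calls them rarely.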
This might be something that belongs in the Tip Planner on an iPad (where I could see it more easily). I dunno. I just wanted to record the idea here. I think word spotter technology is good enough and iPads powerful enough now that it could be done in real time on an iPad with something like Whisper.