machenmusik opened 6 years ago
It looks like it uses WebRTC. That would make this blocked on https://github.com/webmixedreality/exokit/issues/218.
Looks to me as though it polyfills from the old to the new gUM API, and uses microsoft-speech-browser-sdk, which uses getUserMedia, not all of WebRTC...
The SDK depends on WebRTC APIs to get access to the microphone and read the audio stream. Most of today's browsers (Edge/Chrome/Firefox) support this. For more details about supported browsers refer to navigator.getUserMedia#BrowserCompatibility
Note: The SDK currently depends on the navigator.getUserMedia API. However, this API is in the process of being dropped as browsers are moving towards the newer MediaDevices.getUserMedia instead. The SDK will add support for the newer API soon.
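The shim direction described here (exposing the legacy callback-style navigator.getUserMedia on top of the newer promise-based MediaDevices.getUserMedia) can be sketched roughly as follows; this is an illustrative sketch with a made-up helper name, not the SDK's actual code:

```javascript
// Sketch of a legacy-gUM shim: exposes the old callback-style
// navigator.getUserMedia on top of the newer promise-based
// navigator.mediaDevices.getUserMedia. Returns false when there is
// nothing to do (or nothing to shim onto).
function installLegacyGumShim(nav) {
  if (!nav || nav.getUserMedia || !nav.mediaDevices || !nav.mediaDevices.getUserMedia) {
    return false;
  }
  nav.getUserMedia = function (constraints, onSuccess, onError) {
    // Bridge the promise-based API back to the old callback signature.
    nav.mediaDevices.getUserMedia(constraints).then(onSuccess, onError);
  };
  return true;
}

// Guarded so this is a no-op outside a browser-like environment.
if (typeof navigator !== 'undefined') {
  installLegacyGumShim(navigator);
}
```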
How does ExoKit currently expose the microphone? Is there a simple sample page? I thought it was providing navigator.mediaDevices.getUserMedia for audio, but not for video...
Exokit exposes navigator.mediaDevices.getUserMedia, for video and audio, and this is not related to WebRTC.

WebRTC is on the roadmap, but it's separate. You can put a media stream into WebRTC, but that's not the same thing. That Microsoft speech service appears to require WebRTC, so it won't work until https://github.com/webmixedreality/exokit/pull/346 is merged.
The home environment has microphone support: https://github.com/webmixedreality/exokit-home/blob/7f1247b45cbf98d370b0a6b3228617c7a3b0ef9f/index.html#L381
And there are video/webcam examples in this repo as well: https://github.com/webmixedreality/exokit/blob/master/examples/webcam.html
It would be nice to get a microphone example in this repo though.
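A minimal microphone example along the lines suggested could look like this; it is a sketch against the standard getUserMedia/Web Audio surface, not code from the repo, and the helper name is made up:

```javascript
// Minimal microphone sketch: request an audio-only stream and log a
// rough input level using an AnalyserNode. Hypothetical helper name.
async function startMicLevelMeter() {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: false });
  const ctx = new AudioContext();
  const source = ctx.createMediaStreamSource(stream);
  const analyser = ctx.createAnalyser();
  source.connect(analyser);

  const data = new Uint8Array(analyser.frequencyBinCount);
  setInterval(() => {
    analyser.getByteFrequencyData(data);
    const level = data.reduce((a, b) => a + b, 0) / data.length;
    console.log('mic level:', level.toFixed(1));
  }, 500);
}

// Only run where the API surface actually exists.
if (typeof navigator !== 'undefined' && navigator.mediaDevices) {
  startMicLevelMeter().catch((err) => console.error('getUserMedia failed:', err));
}
```

Something like this could double as the microphone example page mentioned above.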
I guess I am not understanding where you think the underlying Microsoft SDK requires WebRTC. My understanding is that it sends audio via a standard WebSocket; and to get the microphone, it is doing something similar to your xrmp, meaning getUserMedia and then the Web Audio API:
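That flow (getUserMedia into Web Audio, then audio frames out over a WebSocket, no WebRTC involved) can be sketched like this; the function name and endpoint are illustrative, not taken from the SDK:

```javascript
// Sketch of getUserMedia -> Web Audio -> WebSocket, the non-WebRTC
// audio path described above. wsUrl is a placeholder endpoint.
function streamMicOverWebSocket(nav, wsUrl) {
  return nav.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
    const ws = new WebSocket(wsUrl);
    const ctx = new AudioContext();
    const source = ctx.createMediaStreamSource(stream);
    // ScriptProcessorNode is deprecated but widely supported;
    // an AudioWorklet would be the modern equivalent.
    const processor = ctx.createScriptProcessor(4096, 1, 1);
    processor.onaudioprocess = (e) => {
      const samples = e.inputBuffer.getChannelData(0);
      if (ws.readyState === WebSocket.OPEN) {
        ws.send(samples.buffer.slice(0)); // copy of raw float32 PCM frames
      }
    };
    source.connect(processor);
    processor.connect(ctx.destination);
    return ws;
  });
}
```

In a browser you would call it as `streamMicOverWebSocket(navigator, 'wss://example.invalid/audio')`; a real speech service would also want a specific sample rate and encoding.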
Is there a better way to debug more precisely what is failing inside ExoKit?
I should mention / ask - does ExoKit prevent use of getUserMedia() unless over https, or does it support localhost exception?
> I should mention / ask - does ExoKit prevent use of getUserMedia() unless over https, or does it support localhost exception?
There are no checks. http or even file will have all features.
> Is there a better way to debug more precisely what is failing inside ExoKit?
The way I would do it is console.log() to see which part -- if anything -- is not meeting the spec. If there was a stack trace or a repro I could maybe help more.
If the intent is to get the speech polyfill working (without WebRTC -- I can't speak to how it's implemented, it just mentions WebRTC in the readme), then I would probably be logging events inside the speech api lib to bisect what's apparently not working.
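As a first bisecting step, a quick feature-detection log shows what the polyfill would actually find at runtime; detectMediaApis is a hypothetical helper, not part of Exokit or the SDK:

```javascript
// Report which media-related APIs a "window-like" globals object
// exposes, to bisect what the speech polyfill sees at runtime.
function detectMediaApis(globals) {
  const nav = globals.navigator || {};
  return {
    modernGUM: !!(nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === 'function'),
    legacyGUM: typeof nav.getUserMedia === 'function',
    audioContext: typeof globals.AudioContext === 'function',
    webSocket: typeof globals.WebSocket === 'function',
    rtcPeerConnection: typeof globals.RTCPeerConnection === 'function',
  };
}

// In a page, log the result for the real globals.
if (typeof window !== 'undefined') {
  console.log(detectMediaApis(window));
}
```

If modernGUM comes back true but the polyfill still reports getUserMedia as missing, the problem is in how the polyfill probes for the API rather than in Exokit's surface.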
(by the way, this would be an awesome contribution if we got it running; having the speech api is probably something we want in the core 💪)
Agree :-) Is there no way to set breakpoints or use a remote inspector somehow? That is how I would (and did) do it with other browsers...
It's just node; you can use any node debug tool -- such as Chrome DevTools, which supports connecting to node. You could also try ndb.
When I try to have ExoKit allow the inspector:

```
09-15 23:46:58.807 1968 6 I exokit : Inspector support is not available with this Node.js build
09-15 23:46:58.807 1968 6 I exokit : node: bad option: --inspect
09-15 23:46:58.807 1968 6 I exokit : node: bad option: --debug-brk
```
Looks like an ML1 device? I was assuming from the above that it was desktop 😅.

In that case, that is probably a legit bug: the libnode is not built with debugger support. Build repo.

It could be built with debugger support, but I think it was disabled for some silly reason.
Ignoring the debugger, I see:

```
09-15 23:53:15.055 2174 6 I exokit : RtApiDummy: This class provides no functionality.
09-15 23:53:15.055 2174 6 I exokit :
09-15 23:53:15.055 2174 6 I exokit :
09-15 23:53:15.055 2174 6 I exokit : RtApi::openStream: output device parameter value is invalid.
```
so perhaps the microphone isn't hooked up either?
Yeah, it's not hooked up in Magic Leap (it is on desktop). Same for Audio.
Should probably open tracking issues for these.
With ML1 (not tested on other platforms) using 0.0.485 or 0.0.486, things are close to working; incremental results are generated, but recognition.stop() is called when a final result is made available, and never returns.
See #537
0.0.494 still has the issue, need to reopen.
Still seems broken in 0.0.516?!?
@machenmusik could you provide more details?
Sure - speech-polyfill.azurewebsites.net doesn't work; no interim or final transcriptions are seen.
That is not an XR site, so it's not expected to boot into anything. However I suspect it's trying to use WebRTC audio channels, which are not hooked in. Previously we were going over WebSocket if I recall.
It uses getUserMedia to get the microphone, I think, and I'm not sure that is working as expected. It should work in 2D mode, right? You do see the "Just talk" prompt appear in 2D...
Yeah the GUM part should be fine, but the part of connecting that to RTC would not be since MediaStream is not hooked to RTC yet.
I think previously there was a fallback to feed that GUM to WebSockets, which was ok, but if it sees RTC it might not be going that route anymore.
Just curious, would the other direction work? I only have a test of it with form elements; in the test, type in something & click say. I just unplugged my router power supply after page load, and it worked on both Firefox & Chrome, so I am pretty sure it is not using a server.
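Assuming "the other direction" means speech synthesis, that form-elements test would be along these lines; the element ids and the dependency-injected style are illustrative, not the actual test page:

```javascript
// Minimal speech-synthesis test: speak the contents of a text field
// when a "say" button is clicked. Runs entirely locally -- no server.
// doc, synth, and Utterance are passed in so the wiring is testable.
function wireSayButton(doc, synth, Utterance) {
  const input = doc.getElementById('text'); // hypothetical ids
  const button = doc.getElementById('say');
  button.addEventListener('click', () => {
    synth.speak(new Utterance(input.value));
  });
}

// In a browser, wire it up to the real Web Speech API objects.
if (typeof document !== 'undefined' && typeof speechSynthesis !== 'undefined') {
  wireSayButton(document, speechSynthesis, SpeechSynthesisUtterance);
}
```

Since speechSynthesis.speak runs against locally installed voices, this matches the observation that it keeps working with the router unplugged.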
Is there an equivalent of chrome://flags to try turning off WebRTC? If not, what is the latest non-RTC build to try?
No, there is not; I believe the last without WebRTC was two releases ago.
Ok thx - will try older build a bit later.
No luck; exokit 0.0.514 through 0.0.516 share the same behavior, namely (as a 2D tab) you see "Just talk" and then no interim or final results ever appear. So it appears not to be WebRTC related?
In that case this is most likely a Lumin initialization order refactor. @machenmusik if you're doing custom builds, might be worth bisecting to the commit that broke it. I suspect that has a good chance of making it clear what the fix would be.
To be fair, I think that making it work has always required some changes (since 0.0.484), and I'm not sure the branch was merged. Due to changes required to get older builds running on newer LuminOS, I'm not sure bisection is practical, but I can probably point you at the older changes needed; what I don't recall is whether specific LabSound changes were also required, which would of course complicate things.
We had everything working in master, including the necessary LabSound updates. If you could point to a changeset that worked and one that's broken, we can narrow down what broke it.
This is what I am currently using https://github.com/chenzlabs/exokit/commits/mic-hack-perceptionstartup which is 0.0.494 plus one patch that was needed at the time, plus manifest update and the Perception startup change now needed.
Note that version predates 2D support and reality tabs, so you would only see things working via mldb log exokit.
The oldest release version that renders on current LuminOS is 0.0.504 - and the good news is, if you look at mldb logs, 0.0.504 works with speech-polyfill.azurewebsites.net! However, when used with our experience (which I'm sorry we cannot share), liveview is broken, and trying to touch it breaks the mic and ultimately all audio.
I will see what the newest release version that works with speech-polyfill.azurewebsites.net is, and hopefully you can help to take it from there... stay tuned.
0.0.508 has reality tabs, and when tried in 2D, speech-polyfill.azurewebsites.net throws this...

```
02-22 16:32:15.898 2111 5 I exokit : parent got console { jsString:
02-22 16:32:15.898 2111 5 I exokit : '\'Unhandled callback error: Error: \'Unhandled callback error: \'Unhandled callback error: Error occurred processing the user media stream. Error: getUserMedia is not implemented in this browser\'\'. InnerError: \'Unhandled callback error: \'Unhandled callback error: Error occurred processing the user media stream. Error: getUserMedia is not implemented in this browser\'\'\'',
02-22 16:32:15.898 2111 5 I exokit : scriptUrl: '',
02-22 16:32:15.898 2111 5 I exokit : startLine: 0 }
```
Note that 0.0.506 works (although via mldb log exokit, since there are no reality tabs), and 0.0.507 has no corresponding MPK, so hopefully that narrows it down (it may have been broken once reality tabs were introduced).
Wait, I may have to take that back: 0.0.508 does work in 3D mode, using mldb log exokit to see the results.
Summary: No versions work in 2D tab; 3D tab works the same as direct mldb URL launch. The last release version that works from mldb launch is 0.0.511; 0.0.512 is broken.
I think we should clarify the 2D/3D divide.
The "2D" mode for reality tabs has almost nothing to do with Exokit -- Exokit forks out to a 2D rendering engine (Servo or Chromium) to draw the page to a texture. It is only intended to draw a texture to the 3D scene, and not intended to work with any media APIs -- which it does not.
Additionally, reality tabs are themselves not released, so nothing should be tested against that environment yet. Luckily, reality tabs is just an HTML page that happens to load by default -- loading another page at the top level (instead of realitytabs.html) should work on all releases, including current ones, via the command line/mldb.
Hopefully that clarifies things.
https://speech-polyfill.azurewebsites.net/
I understand that ExoKit currently doesn't implement the Web Speech API, so I tried using a polyfill that uses getUserMedia() to get the audio, which works on Firefox - https://github.com/anteloe/speech-polyfill - but that doesn't seem to work with ExoKit.
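For context, a page exercises such a polyfill through the standard SpeechRecognition surface, roughly like this; this is a sketch of the standard Web Speech API shape, not the polyfill's internals:

```javascript
// Drive a Web Speech API recognition implementation (native or
// polyfilled) and forward interim/final transcripts to a callback.
function startRecognition(Recognition, onResult) {
  const recognition = new Recognition();
  recognition.interimResults = true; // request interim as well as final results
  recognition.onresult = (event) => {
    const result = event.results[event.results.length - 1];
    onResult(result[0].transcript, result.isFinal);
  };
  recognition.start();
  return recognition;
}

// In a browser, use whichever constructor is available.
if (typeof window !== 'undefined' && (window.SpeechRecognition || window.webkitSpeechRecognition)) {
  startRecognition(window.SpeechRecognition || window.webkitSpeechRecognition, (text, isFinal) => {
    console.log(isFinal ? 'final:' : 'interim:', text);
  });
}
```

If the onresult handler never fires under Exokit, the breakage is upstream of this surface, in how the polyfill captures and ships the audio.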