speechmatics / speechmatics-js-sdk

JavaScript and TypeScript SDK for Speechmatics
MIT License

Draft: Add microphone input demo #6

Closed · TudorCRL closed this 9 months ago

TudorCRL commented 10 months ago

Add a microphone input demo. Includes a Next.js app showcasing how to use microphone input to send data to the Speechmatics WebSocket.
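
For context, the core of the flow looks roughly like the sketch below. This is not the demo's code: the `speechmatics` import name is an assumption, the `RealtimeSession` calls mirror the hook suggested later in this thread, and microphone capture uses the browser's standard `getUserMedia`/`MediaRecorder` APIs (error handling omitted).

```ts
import { RealtimeSession } from 'speechmatics';

// Capture microphone audio in the browser and stream it to the Speechmatics
// realtime WebSocket. Assumes a JWT has already been obtained.
async function transcribeFromMicrophone(jwt: string) {
  const session = new RealtimeSession(jwt);

  // Log transcript chunks as they arrive
  session.addListener('AddTranscript', (res) => {
    console.log(res.results);
  });

  // Standard browser APIs: request the microphone and chunk audio into Blobs
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  recorder.ondataavailable = (event) => session.sendAudio(event.data);

  // Start the websocket session, then start emitting audio chunks
  await session.start({
    transcription_config: { max_delay: 2, language: 'en' },
    audio_format: { type: 'file' },
  });
  recorder.start(500); // emit a Blob roughly every 500 ms
}
```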

penx commented 9 months ago

Can I add prettier config to this repo?

penx commented 9 months ago

It seems npx run check from the root is not checking these files (which is what runs on CI).

Please could you format the files added/changed in this PR with rome? It would make it a lot easier for me to pick out my changes and make a PR.

Perhaps use the VS Code extension if you aren't already (assuming you use VS Code).

I think it would be worthwhile committing this to the repo, as it would have prevented my Prettier formatter from kicking in by default:

.vscode/settings.json

```json
{
  "[javascript]": {
    "editor.defaultFormatter": "rome.rome"
  },
  "[typescript]": {
    "editor.defaultFormatter": "rome.rome"
  },
  "[typescriptreact]": {
    "editor.defaultFormatter": "rome.rome"
  }
}
```
TudorCRL commented 9 months ago

I've run the formatter and added the .vscode changes, so hopefully all the formatting issues are resolved now.

penx commented 9 months ago

I had some thoughts about moving code that could be reused across demos into its own hooks, but this is a tweak to code structure and doesn't need to hold up the example going in or the publishing of the blog post.

Something like this:

```ts
const useAudioRecorder = (jwt: string) => {
  const rtSessionRef = useRef(new RealtimeSession(jwt));

  // sendAudio is used as a wrapper for the websocket to check the socket is
  // finished init-ing before sending data
  const sendAudio = (data: Blob) => {
    if (
      rtSessionRef.current.rtSocketHandler &&
      rtSessionRef.current.isConnected
    ) {
      rtSessionRef.current.sendAudio(data);
    }
  };

  return useMemo(
    () => [new AudioRecorder(sendAudio), rtSessionRef.current] as const,
    [],
  );
};

const useTranscription = (jwt: string) => {
  const [transcription, setTranscription] = useState<
    RealtimeRecognitionResult[]
  >([]);
  const [sessionState, setSessionState] = useState('configure');

  // Memoise AudioRecorder so it doesn't get recreated on re-render
  const [audioRecorder, rtSession] = useAudioRecorder(jwt);

  // Attach our event listeners to the realtime session
  rtSession.addListener('AddTranscript', (res) => {
    setTranscription([...transcription, ...res.results]);
  });

  // start audio recording once the websocket is connected
  rtSession.addListener('RecognitionStarted', async () => {
    setSessionState('running');
  });

  rtSession.addListener('EndOfTranscript', async () => {
    setSessionState('configure');
    await audioRecorder.stopRecording();
  });

  rtSession.addListener('Error', async () => {
    setSessionState('error');
    await audioRecorder.stopRecording();
  });

  // Call the start method on click to start the websocket
  const startTranscription = async (deviceId: string) => {
    setSessionState('starting');
    await audioRecorder
      .startRecording(deviceId)
      .then(async () => {
        setTranscription([]);
        await rtSession.start({
          transcription_config: { max_delay: 2, language: 'en' },
          audio_format: {
            type: 'file',
          },
        });
      })
      .catch((err) => setSessionState('blocked'));
  };

  // Stop the transcription on click to end the recording
  const stopTranscription = async () => {
    await audioRecorder.stopRecording();
    await rtSession.stop();
  };

  return [
    transcription,
    sessionState,
    startTranscription,
    stopTranscription,
  ] as const;
};
```
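
For illustration, a component consuming these hooks might look roughly like the sketch below (the component and prop names are hypothetical, not part of this PR; the result shape is assumed):

```tsx
function TranscriptionDemo({ jwt, deviceId }: { jwt: string; deviceId: string }) {
  const [transcription, sessionState, startTranscription, stopTranscription] =
    useTranscription(jwt);

  return (
    <div>
      <button
        onClick={() => startTranscription(deviceId)}
        disabled={sessionState !== 'configure'}
      >
        Start
      </button>
      <button onClick={stopTranscription} disabled={sessionState !== 'running'}>
        Stop
      </button>
      {/* assuming each result exposes its text via alternatives[0].content */}
      <p>{transcription.map((r) => r.alternatives?.[0]?.content).join(' ')}</p>
    </div>
  );
}
```
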
penx commented 9 months ago

Also, since this may serve as the best starting point for a Node/Next.js project, it may be good to implement sessions with JWT token refresh at some point (rather than getting a JWT on every page load), but again it's not something we need right now, and perhaps it would only complicate the example.
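
As a rough illustration (everything here is hypothetical: the `/api/jwt` route, the response shape, and the refresh margin are placeholders, not part of this PR or the SDK):

```ts
// Hypothetical client-side helper: cache a short-lived JWT fetched from a
// Next.js API route and refresh it shortly before it expires, instead of
// requesting a new token on every page load.
let cached: { jwt: string; expiresAt: number } | undefined;

async function getJwt(): Promise<string> {
  const now = Date.now();
  // Refresh 30 seconds before expiry (arbitrary safety margin for this sketch)
  if (!cached || cached.expiresAt - 30_000 < now) {
    // The route itself would call the Speechmatics key endpoint server-side
    const res = await fetch('/api/jwt');
    const { jwt, ttlSeconds } = await res.json(); // response shape assumed
    cached = { jwt, expiresAt: now + ttlSeconds * 1000 };
  }
  return cached.jwt;
}
```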

nickgerig commented 3 weeks ago

@Box333 can you start a discussion here: https://github.com/orgs/speechmatics/discussions instead? Or open an issue in the JS SDK issues section here: https://github.com/speechmatics/speechmatics-js-sdk/issues

Box333 commented 3 weeks ago

Hi nickgerig, I have removed my comment here and opened an issue as a feature request: https://github.com/speechmatics/speechmatics-js-sdk/issues/48. Thank you.