Closed hassannaftabb closed 1 year ago
I edited your key out of the post, but you should rotate it because it is still visible in the edit history.
Nothing stands out as being wrong. Would it be possible to get a copy of the audio file?
Thank you for editing the key. I am using the audio file generated by ffmpeg, in WAV format; the video is just 15 s long and in MP4 format. Here is the video: https://drive.google.com/file/d/1vLYOM4710f3OUsNVNUojS3yRfVhg3QoN/view?usp=sharing and here is the audio file: https://drive.google.com/file/d/1SPzZm-DNXIfj5LchiLeLEXctibc9SFLa/view?usp=sharing ... I am wondering, could this be an issue in NestJS only? Also, if there is anything I need to change, please let me know.
Thanks for the wave file, it was the key to solving this.
The audio file is a wav container with an MP3 codec used for the audio stream. The JS SDK can only handle PCM audio as file input. The wave header parser didn't error out when it found audio in the wav file that wasn't PCM. I'll open a bug in our internal system to respond better when finding an unexpected codec.
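One way to catch this case before handing a file to the SDK is to read the format tag from the WAV header yourself. The sketch below is only an illustration, not the SDK's own parser: `readWavFormatTag` and `isPcmWav` are hypothetical helper names, and it assumes a plain RIFF/WAVE layout in which format tag 0x0001 means PCM and 0x0055 means MP3. It does not handle WAVE_FORMAT_EXTENSIBLE (0xFFFE) files, which can also carry PCM.

```typescript
// Format tags stored in the WAVE "fmt " chunk.
const WAVE_FORMAT_PCM = 0x0001;
const WAVE_FORMAT_MPEGLAYER3 = 0x0055;

// Returns the format tag from the "fmt " chunk. Throws if the bytes are not
// a RIFF/WAVE file or if no "fmt " chunk is present.
function readWavFormatTag(bytes: Uint8Array): number {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);

  // Reads a 4-character ASCII chunk id at the given offset.
  const tagAt = (offset: number): string => {
    let s = '';
    for (let i = 0; i < 4; i++) {
      s += String.fromCharCode(view.getUint8(offset + i));
    }
    return s;
  };

  if (bytes.byteLength < 12 || tagAt(0) !== 'RIFF' || tagAt(8) !== 'WAVE') {
    throw new Error('Not a RIFF/WAVE file');
  }

  // Walk the chunk list: 4-byte id, 4-byte little-endian size, then payload.
  let offset = 12;
  while (offset + 8 <= bytes.byteLength) {
    const id = tagAt(offset);
    const size = view.getUint32(offset + 4, true);
    if (id === 'fmt ') {
      if (offset + 10 > bytes.byteLength) {
        throw new Error('Truncated "fmt " chunk');
      }
      return view.getUint16(offset + 8, true); // first field of the payload
    }
    offset += 8 + size + (size % 2); // chunks are padded to an even length
  }
  throw new Error('No "fmt " chunk found');
}

// True only when the file declares plain PCM audio.
function isPcmWav(bytes: Uint8Array): boolean {
  return readWavFormatTag(bytes) === WAVE_FORMAT_PCM;
}
```

In Node, a `Buffer` is a `Uint8Array`, so something like `isPcmWav(fs.readFileSync(outputPath))` should work as a guard before creating the audio config. A file like the one in this issue would report `WAVE_FORMAT_MPEGLAYER3` instead.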
You can use pcm_s16le as the ffmpeg output codec to get 16-bit PCM:
ffmpeg.exe -i <in> -vn -acodec pcm_s16le -ar 44100 -ac 2 <out>
Also, you could convert the audio to 16-bit, 16 kHz mono:
ffmpeg.exe -i <in> -vn -acodec pcm_s16le -ar 16000 -ac 1 <out>
Speech recognition quality isn't improved by sending higher-definition audio.
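For a Node service, the 16 kHz mono conversion above could be run roughly as follows. This is a sketch under assumptions, not a recommended implementation: `buildPcmArgs`, `convertToPcmWav`, and `pcmBytesPerSecond` are hypothetical helper names, ffmpeg is assumed to be on the PATH, and the `-y` flag (overwrite the output file) is an addition not present in the commands above. Passing the arguments as an array avoids building a shell command string from an uploaded file name.

```typescript
import { spawn } from 'child_process';

// Argument list for the 16-bit, 16 kHz mono PCM conversion.
function buildPcmArgs(inputPath: string, outputPath: string): string[] {
  return [
    '-y', // overwrite outputPath if it already exists (added assumption)
    '-i', inputPath,
    '-vn', // drop the video stream
    '-acodec', 'pcm_s16le',
    '-ar', '16000',
    '-ac', '1',
    outputPath,
  ];
}

// Runs ffmpeg and resolves once it exits with code 0.
function convertToPcmWav(inputPath: string, outputPath: string): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    // stderr is inherited so ffmpeg's messages stay visible and no pipe fills up.
    const ffmpeg = spawn('ffmpeg', buildPcmArgs(inputPath, outputPath), {
      stdio: ['ignore', 'ignore', 'inherit'],
    });
    ffmpeg.on('error', reject); // e.g. ffmpeg could not be started
    ffmpeg.on('close', (code) => {
      if (code === 0) {
        resolve();
      } else {
        reject(new Error(`ffmpeg exited with code ${code}`));
      }
    });
  });
}

// Raw PCM data rate, useful for comparing the two conversion settings.
function pcmBytesPerSecond(sampleRate: number, channels: number, bitsPerSample: number): number {
  return sampleRate * channels * (bitsPerSample / 8);
}
```

By this arithmetic, 44.1 kHz stereo 16-bit PCM is 176,400 bytes per second and 16 kHz mono 16-bit PCM is 32,000 bytes per second, so the second form sends about 5.5 times less data for the same clip.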
Hi @hassannaftabb, were you able to solve your issue with @rhurey's suggestion?
Closed since there have been no further updates in a month. Please open a new issue if more support is needed.
    import { Injectable } from '@nestjs/common';
    import { CreateProjectDto } from './dto/create-project.dto';
    import * as path from 'path';
    import * as childProcess from 'child_process';
    import * as fs from 'fs';
    // eslint-disable-next-line @typescript-eslint/no-var-requires
    const sdk = require('microsoft-cognitiveservices-speech-sdk');

    @Injectable()
    export class ProjectsService {
      async create(createProjectDto: CreateProjectDto, video: Express.Multer.File) {
        const tempVideoPath = path.join(
          __dirname,
          '..',
          'temp',
          `${video.originalname?.trim()?.toLowerCase()}`,
        );
        const outputPath = path.join(__dirname, '..', 'audio.wav'); // Change the output path as per your requirement
        await fs.promises.writeFile(tempVideoPath, video.buffer);
        const ffmpegCommand = `ffmpeg -i ${tempVideoPath} -vn -acodec libmp3lame -ar 44100 -ac 2 ${outputPath}`;
        const ffmpeg = childProcess.spawn('bash', ['-c', ffmpegCommand]);
        ffmpeg.on('error', (err) => {
          console.error(err);
        });
        ffmpeg.on('close', async () => {
          console.log('Started');
          const subscriptionKey = 'key';
          const serviceRegion = 'eastus'; // e.g., "westus"
          const speechConfig = sdk.SpeechConfig.fromSubscription(
            subscriptionKey,
            serviceRegion,
          );
          speechConfig.speechRecognitionLanguage = 'en-US';
          function fromFile() {
            const audioConfig = sdk.AudioConfig.fromWavFileInput(
              fs.readFileSync(outputPath),
            );
            const speechRecognizer = new sdk.SpeechRecognizer(
              speechConfig,
              audioConfig,
            );
            speechRecognizer.recognizing = (s, e) => {
              console.log(`RECOGNIZING: Text=${e.result.text}`);
            };
          }
        }

This is the code I am using in NestJS. It logs 'Started' but then takes a very long time to proceed any further, and after a long while it just produces this error:

    <--- Last few GCs --->

    [45684:0x7fca5ee4d000] 173496 ms: Mark-sweep (reduce) 2046.5 (2083.5) -> 2045.8 (2083.8) MB, 20508.3 / 0.0 ms (average mu = 0.162, current mu = 0.000) allocation failure; scavenge might not succeed
    [45684:0x7fca5ee4d000] 198005 ms: Mark-sweep (reduce) 2047.0 (2084.0) -> 2046.1 (2084.3) MB, 24446.5 / 0.0 ms (average mu = 0.077, current mu = 0.003) allocation failure; scavenge might not succeed

    <--- JS stacktrace --->

    FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
     1: 0x10cb9b0a5 node::Abort() [/usr/local/bin/node]
     2: 0x10cb9b295 node::OOMErrorHandler(char const*, bool) [/usr/local/bin/node]
     3: 0x10cd20c3c v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
     4: 0x10cee54d5 v8::internal::Heap::FatalProcessOutOfMemory(char const*) [/usr/local/bin/node]
     5: 0x10cee3d70 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/local/bin/node]
     6: 0x10ced5bea v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/bin/node]
     7: 0x10ced6585 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/bin/node]
     8: 0x10ceb8e4e v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/usr/local/bin/node]
     9: 0x10d2e2c26 v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/usr/local/bin/node]
    10: 0x10d6d0ef9 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit [/usr/local/bin/node]

Please let me know if I am doing anything wrong. The audio file I am using is only about 15 seconds long, yet I still hit this issue.