dy opened this issue 3 years ago
I'm curious how you would display it for speech recognition?
Oh! That's interesting work you've done! I like `<ruby>` annotations; they seem to be the right solution for annotating audio data.
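For example, a rough sketch of the idea (the structure and class name are illustrative, not from the repo):

```ts
// Wrap a transcript word and its waveform glyphs in <ruby>, so the wave
// renders as a small annotation attached to the word.
function annotateWord(word: string, waveGlyphs: string): HTMLElement {
  const ruby = document.createElement('ruby');
  ruby.append(word);            // the transcript word as the ruby base
  const rt = document.createElement('rt');
  rt.className = 'wavefont';    // style this class with the wavefont face
  rt.textContent = waveGlyphs;  // glyph string covering this word's audio span
  ruby.appendChild(rt);
  return ruby;
}
```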
A little advice regarding 3.: wrapping each sample in a tag can be a problematic rendering task for the browser.
It's better to use the Selection API instead, as sketched below.
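A minimal sketch of the idea (the element id and single-text-node layout are assumptions):

```ts
// Highlight playback progress by selecting the first `played` glyphs
// of the waveform text, instead of wrapping each sample in its own tag.
function highlightProgress(played: number): void {
  const wave = document.getElementById('wave'); // assumed container element
  if (!wave || !wave.firstChild) return;
  const textNode = wave.firstChild as Text;
  const range = document.createRange();
  range.setStart(textNode, 0);
  range.setEnd(textNode, Math.min(played, textNode.length));
  const selection = window.getSelection();
  if (!selection) return;
  selection.removeAllRanges();
  selection.addRange(range);
}
```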
I was working on the audio rendering demo in the meantime.
Yeah, the Selection API would do it for highlighting progress. The demo is based on an older transcript player we have at hyper.audio, but we're working this year on a new version of the site/app/etc., and we want to use this wavefont to show sounds in between speech segments, and also to allow selection/replay and remixing.
Hi, could you increase the value of `--wght: 10;` to 100 or higher in the Usage section of Readme.md? I think it would make it easier to get started. I was confused that my result was a tiny stain.
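For anyone else confused, a minimal sketch of the workaround (the selector and value are illustrative):

```ts
// At --wght: 10 the bars render almost flat; raising the custom property
// on the element that shows the waveform makes it actually visible.
const wave = document.querySelector<HTMLElement>('.wavefont');
wave?.style.setProperty('--wght', '100');
```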
Does anyone know of a demo or example script for taking a .wav file (or any other audio file) and converting it to a string meant to be used with this font?
It would be very much appreciated.
Thank you.
@arthurwolf I used https://github.com/bbc/audiowaveform to get a JSON representation of the waveform; then you have to map the values to characters that represent the glyph you want.
I did this a long time ago for this demo https://hyperaudio-waveform.surge.sh using an older version of this font; check the page source for `String.fromCharCode`.
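Roughly, the mapping looks like this (`BASE_CODE` and `LEVELS` are placeholders; check the font's README for the actual unicode range your wavefont build uses):

```ts
// Sketch: map peaks from audiowaveform's JSON output to bar glyphs,
// assuming the font assigns a contiguous unicode range to bar heights.
const BASE_CODE = 0x100; // assumed first codepoint of the bar-glyph range
const LEVELS = 100;      // assumed number of distinct bar heights

function peaksToString(peaks: number[]): string {
  const max = Math.max(...peaks.map(Math.abs)) || 1; // avoid division by zero
  return peaks
    .map(p => Math.round((Math.abs(p) / max) * (LEVELS - 1))) // 0..LEVELS-1
    .map(level => String.fromCharCode(BASE_CODE + level))
    .join('');
}
```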
Ended up finding a full node/ts solution. It's a method of an `AudioFile` class (`Reader` is from the `wav` npm package, `Readable` from node's `stream` module):

```ts
// Assumed imports at the top of AudioFile.ts:
// import { Readable } from 'stream';
// import { Reader } from 'wav';

// Get the waveform from the audio data.
public async waveform(sampling_rate: number = 20): Promise<number[]> {
  // If the buffer is null, throw an error.
  if (this._wav_buffer === null) throw new Error('AudioFile.ts:waveform() The buffer is null.');

  // Convert a buffer to a stream.
  function buffer_to_stream(buffer: Buffer): Readable {
    const stream = new Readable();
    stream.push(buffer); // Push the buffer to the stream
    stream.push(null);   // End of the stream
    return stream;
  }

  // Extract waveform data from a WAV buffer.
  const get_waveform_from_wav_buffer = async (wav_buffer: Buffer): Promise<number[]> => {
    return new Promise((resolve, reject) => {
      const reader = new Reader();
      const waveform: number[] = [];

      // Handle the parsed WAV format.
      reader.on('format', () => {});

      // Process audio samples as 16-bit integers.
      reader.on('data', (chunk: any) => {
        const int16_array = new Int16Array(chunk.buffer, chunk.byteOffset, chunk.length / Int16Array.BYTES_PER_ELEMENT);
        for (let i = 0; i < int16_array.length; i++) {
          waveform.push(int16_array[i]);
        }
      });

      // Complete processing.
      reader.on('end', () => resolve(waveform));

      // Handle errors.
      reader.on('error', (err: any) => reject(err));

      // Convert the buffer to a stream and process it.
      const stream = buffer_to_stream(wav_buffer);
      stream.pipe(reader);
    });
  };

  // Convert this._wav_buffer to a waveform.
  const waveform = await get_waveform_from_wav_buffer(this._wav_buffer);

  type WaveformReductionResult = {
    reduced_waveform: number[];
    original_sampling_rate: number;
    reduced_sampling_rate: number;
  };

  // Reduce the sampling rate of a waveform to the target rate (sampling_rate Hz).
  const reduce_waveform_sampling_rate = (sampling_rate: number, waveform: number[], duration: number): WaveformReductionResult => {
    // Calculate the original sampling rate (samples per second).
    const original_sampling_rate = waveform.length / duration;
    // Calculate the number of samples in each output interval.
    const samples_per_interval = Math.floor(original_sampling_rate * (1 / sampling_rate));

    // Array to hold the reduced waveform.
    const reduced_waveform: number[] = [];

    // Process the waveform in chunks of samples_per_interval, averaging the values.
    for (let i = 0; i < waveform.length; i += samples_per_interval) {
      // Calculate the average for the current interval.
      let sum = 0;
      let count = 0;
      for (let j = i; j < i + samples_per_interval && j < waveform.length; j++) {
        sum += waveform[j];
        count++;
      }
      if (count > 0) {
        const average = sum / count;
        reduced_waveform.push(average);
      }
    }

    return {
      reduced_waveform,
      original_sampling_rate,
      reduced_sampling_rate: sampling_rate, // Hz
    };
  };

  // Collect the inputs for the reduction.
  const audio_data = {
    duration: this.duration, // Duration in seconds
    waveform: waveform,      // Raw waveform samples
  };

  // Get the reduced waveform, using absolute sample values.
  const result = reduce_waveform_sampling_rate(sampling_rate, audio_data.waveform.map(r => Math.abs(r)), audio_data.duration);

  // Simplify to integers.
  const simplified: number[] = result.reduced_waveform.map(r => Math.floor(r));

  // Get the maximum value (guarded so all-zero/silent input doesn't divide by zero).
  const max: number = Math.max(...simplified) || 1;

  // Normalize so that all values are between 0 and 1, with 2 digits after the dot.
  const normalized: number[] = simplified.map(r => Math.floor((r / max) * 100) / 100);

  // Return.
  return normalized;
}
```
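Hypothetical usage, assuming an `AudioFile` class that wraps the method above (the constructor and the glyph mapping are not shown here):

```ts
// Assumed: `new AudioFile(buffer)` stores the WAV buffer in this._wav_buffer.
const file = new AudioFile(wav_buffer);
const peaks = await file.waveform(20); // normalized 0..1 values at ~20 samples/sec
// Map each 0..1 value to a bar glyph, e.g. with a helper like the
// peaksToString sketch earlier in this thread.
```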
You can do it via wavearea https://dy.github.io/wavearea/. Just drop the file there. The only limitation is that it displays samples in blocks of 1024.
- download generated optimized static font (based on selected settings)
- select unicode ranges to cover