rochars / wavefile

Create, read and write wav files according to the specs. :star: :notes: :heart:
MIT License

Use with Node.js and huge Wave File (490MB) Heap Out of Memory #7

Closed chrisspiegl closed 6 years ago

chrisspiegl commented 6 years ago

Hi,

I finally found this library. I was looking for something like it for a long time. But sadly it appears to have pretty strong limitations (or I have not found the documentation for my problem):

In my scenario, I want to use the library to update the cue points from a script that runs locally in Node.js. The files I am editing are podcasts that are about an hour long, so the WAV files tend to be around 500 MB. The library seems to handle only smaller files: I get a "JavaScript heap out of memory" error with the big file, but not with a 1-minute file that is only 15 MB.

So my question is two-fold:

  1. Is the library not able to handle big files at all? Is it a limitation of Node.js or of the library? Or am I missing something?
  2. If the library is not able to handle big files by design, is there documentation of this limitation? I have not found one.

I'd really love to be able to use this lib. It's an amazing project and it would be a lifesaver for me. Sadly right now I can't really make much use of it.

Looking forward to working with WaveFile!

Chris

P.S.: I also tried setting node's --max-old-space-size to ridiculously high values (8192, for example). Nothing worked.
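
One thing worth ruling out with this flag is its position on the command line. Node.js only consumes options that come before the script name; anything after the script is handed to the script as an ordinary argument. The command shown in the log further down this thread has the flag after index.js, so the heap limit may not have been raised at all. A minimal sketch for checking the limit that is actually in effect (the helper names heapLimitMB and misplacedV8Flags are made up for illustration):

```javascript
const v8 = require('v8');

// Effective V8 heap limit for this process, in megabytes.
function heapLimitMB() {
  return Math.round(v8.getHeapStatistics().heap_size_limit / (1024 * 1024));
}

// Flags that ended up in the script's own arguments instead of being
// consumed by node, e.g. after `node index.js --max-old-space-size=8192`.
function misplacedV8Flags(argv) {
  return argv.slice(2).filter((arg) => arg.startsWith('--max-old-space-size'));
}

console.log('Heap limit (MB):', heapLimitMB());
const misplaced = misplacedV8Flags(process.argv);
if (misplaced.length > 0) {
  console.log('Passed to the script, not to node:', misplaced.join(' '));
}
```

Run as node --max-old-space-size=8192 index.js, this should report a clearly higher limit than when the flag follows index.js, in which case the default limit is printed together with the misplaced flag.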

chrisspiegl commented 6 years ago

As an addendum, here is the full readout of the error messages. I also attached the processed profile log to this comment.

It suggests that the failure happens inside the C++ part of the process rather than in the script itself.

My other assumption is that the script tries to read the full WAV file into memory instead of processing it in chunks. Maybe there is a way to make it process the buffer without storing all of it, keeping just the metadata? I am not that deep into this, obviously, but I hope for a solution 🤞.
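
To illustrate what "just the metadata" could look like: a WAV file is a RIFF container made of chunks, each with an 8-byte header (4-byte ID, 4-byte little-endian size). The sketch below walks those headers with positioned reads and never loads the chunk bodies, so the large data chunk stays on disk. It is not part of WaveFile's API; listRiffChunks is a hypothetical helper, and it assumes a plain RIFF/WAVE file (no RF64 or other variants).

```javascript
const fs = require('fs');

// Walk the top-level RIFF chunks of a WAV file, reading only the 8-byte
// chunk headers and skipping over the chunk bodies.
function listRiffChunks(filePath) {
  const fd = fs.openSync(filePath, 'r');
  try {
    const fileSize = fs.fstatSync(fd).size;
    const header = Buffer.alloc(12);
    if (fs.readSync(fd, header, 0, 12, 0) < 12) {
      throw new Error('File too short to be a WAV file');
    }
    if (header.toString('ascii', 0, 4) !== 'RIFF' ||
        header.toString('ascii', 8, 12) !== 'WAVE') {
      throw new Error('Not a RIFF/WAVE file');
    }
    const chunks = [];
    const chunkHeader = Buffer.alloc(8);
    let pos = 12; // first chunk starts right after "RIFF<size>WAVE"
    while (pos + 8 <= fileSize) {
      if (fs.readSync(fd, chunkHeader, 0, 8, pos) < 8) break;
      const id = chunkHeader.toString('ascii', 0, 4);
      const size = chunkHeader.readUInt32LE(4);
      chunks.push({ id, offset: pos + 8, size });
      pos += 8 + size + (size % 2); // chunk bodies are padded to an even length
    }
    return chunks;
  } finally {
    fs.closeSync(fd);
  }
}
```

Calling listRiffChunks('./test-cat-org.wav') would return one entry per chunk with the ID, the byte offset of its body, and its size, which is enough to locate a 'cue ' chunk and read only those bytes.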

Maybe it helps to better understand the issue at hand:

processed.txt: the profiling readout

  % node index.js --max-old-space-size=14192 --optimize-for-size                                                !1210
WaveFile processing started 2018-06-20T05:16:41.626Z
Loading WaveFile:  ./test-cat-org.wav
Reading WaveFile

<--- Last few GCs --->

[13448:0x104000000]      443 ms: Mark-sweep 17.7 (23.1) -> 14.1 (21.4) MB, 0.7 / 0.0 ms  (+ 4.3 ms in 6 steps since start of marking, biggest step 1.2 ms, walltime since start of marking 309 ms) finalize incremental marking via stack guard GC in old space
[13448:0x104000000]     5027 ms: Mark-sweep 770.1 (777.4) -> 605.1 (612.4) MB, 174.2 / 0.0 ms  (+ 18.0 ms in 4 steps since start of marking, biggest step 6.6 ms, walltime since start of marking 4489 ms) allocation failure GC in old space requested

<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x67fc5ca5ee1 <JSObject>
    2: SimpleSlice(aka SimpleSlice) [native array.js:1] [bytecode=0x67f3437e591 offset=41](this=0x67f1c702311 <undefined>,p=0x67f67c02751 <Uint8Array map = 0x67f6d241d91>,O=0,P=490490104,Q=490490104,R=0x67f67c02731 <JSArray[490490104]>)
    4: ArraySliceFallback [native array.js:1] [bytecode=0x67f3437dee9 offset=281](this=0x67f67c02751 <Uint8Array map = 0x67f6d241d91>,at=0x67f1c702311 <undefined>...

FATAL ERROR: invalid table size Allocation failed - JavaScript heap out of memory
 1: node::Abort() [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 2: node::FatalException(v8::Isolate*, v8::Local<v8::Value>, v8::Local<v8::Message>) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 3: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 4: v8::internal::HashTable<v8::internal::SeededNumberDictionary, v8::internal::SeededNumberDictionaryShape>::EnsureCapacity(v8::internal::Handle<v8::internal::SeededNumberDictionary>, int, v8::internal::PretenureFlag) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 5: v8::internal::Dictionary<v8::internal::SeededNumberDictionary, v8::internal::SeededNumberDictionaryShape>::Add(v8::internal::Handle<v8::internal::SeededNumberDictionary>, unsigned int, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyDetails, int*) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 6: v8::internal::(anonymous namespace)::DictionaryElementsAccessor::AddImpl(v8::internal::Handle<v8::internal::JSObject>, unsigned int, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, unsigned int) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 7: v8::internal::JSObject::AddDataElement(v8::internal::Handle<v8::internal::JSObject>, unsigned int, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::internal::Object::ShouldThrow) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 8: v8::internal::JSObject::DefineOwnPropertyIgnoreAttributes(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::internal::Object::ShouldThrow, v8::internal::JSObject::AccessorInfoHandling) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 9: v8::internal::JSObject::CreateDataProperty(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::internal::Object::ShouldThrow) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
10: v8::internal::Runtime_CreateDataProperty(int, v8::internal::Object**, v8::internal::Isolate*) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
11: 0x2e89153c6838
12: 0x2e8915475209
[1]    13448 abort      node index.js --max-old-space-size=14192 --optimize-for-size

rochars commented 6 years ago

Thank you for all this information. WaveFile 7 will have improvements in the way the samples are packed/unpacked, so dealing with large files should be easier soon.
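
For context on why unpacking matters here: the trace above shows a regular JSArray with 490490104 elements being built from the Uint8Array, and a regular array of that length is far more expensive than the raw bytes. One common alternative, sketched below, is a typed array view over the bytes that are already in memory. This is only an illustration and not necessarily what WaveFile 7 does; viewInt16Samples is a hypothetical helper, and it assumes 16-bit PCM samples and a little-endian host (the file's bit depth is not stated in this thread).

```javascript
// Expose the sample bytes of an in-memory WAV file as an Int16Array.
// Assumes 16-bit PCM; Int16Array uses the host byte order, which matches
// WAV's little-endian samples on little-endian machines.
function viewInt16Samples(fileBuffer, dataOffset, dataSize) {
  const sampleCount = Math.floor(dataSize / 2);
  const byteOffset = fileBuffer.byteOffset + dataOffset;
  if (byteOffset % 2 === 0) {
    // Aligned: create a view over the same memory, no per-sample copy.
    return new Int16Array(fileBuffer.buffer, byteOffset, sampleCount);
  }
  // Unaligned: typed array views need aligned offsets, so copy the bytes
  // once into a fresh buffer (still 2 bytes per sample).
  const copy = new Uint8Array(sampleCount * 2);
  copy.set(fileBuffer.subarray(dataOffset, dataOffset + sampleCount * 2));
  return new Int16Array(copy.buffer);
}
```

The dataOffset and dataSize arguments are the position and length of the data chunk body inside the buffer. Either branch keeps memory use proportional to the byte size of the audio rather than to a boxed array of numbers.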

Cheers! Rafael

chrisspiegl commented 6 years ago

That sounds very promising @rochars. Do you have an ETA for v7? No pressure, just curious 👍.

rochars commented 6 years ago

Version 7 should be out in a few days. You may try the alpha release at your own risk:

npm install wavefile@7.0.0-alpha.10

The alpha release is already much faster than v6.x, but I still need to run tests with big files.

Cheers!

chrisspiegl commented 6 years ago

That sounds pretty awesome.

I tested it with 7.0 Alpha 10 and got a very similar error message.

The program I am running literally just reads the file and prints the cue points:

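const fs = require('fs');
// The npm package name is lowercase; 'WaveFile' only resolves on
// case-insensitive file systems.
const wavefile = require('wavefile');

// Read the whole file into a Buffer, then let WaveFile parse it.
const filePath = './test-cat-org.wav';
const wavFileBuffer = fs.readFileSync(filePath);
const wav = new wavefile.WaveFile(wavFileBuffer);
console.log(wav.cue.points);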

Console Log:

WaveFile processing started 2018-06-22T10:39:33.323Z
Loading WaveFile:  ./test-cat-org.wav
Reading WaveFile

<--- Last few GCs --->

[29385:0x102800600]     2377 ms: Scavenge 391.6 (398.9) -> 387.7 (398.9) MB, 0.1 / 0.0 ms  allocation failure
[29385:0x102800600]     2384 ms: Scavenge 391.6 (398.9) -> 387.7 (398.9) MB, 0.0 / 0.0 ms  allocation failure
[29385:0x102800600]     2389 ms: Scavenge 391.6 (398.9) -> 387.7 (398.9) MB, 0.0 / 0.0 ms  allocation failure
[29385:0x102800600]     2395 ms: Scavenge 391.6 (398.9) -> 387.7 (398.9) MB, 0.0 / 0.0 ms  allocation failure

<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x2eb5445a5ee1 <JSObject>
    1: push(this=0x2eb5b04de449 <JSArray[50139473]>)
    2: unpackArrayFrom(aka unpackArrayFrom) [/Users/spieglio/Desktop/cue/node_modules/byte-data/dist/byte-data.cjs.js:~300] [pc=0x27bbf9a06e41](this=0x2eb5a7f02311 <undefined>,/* anonymous */=0x2eb5b04de3f9 <Uint8Array map = 0x2eb519d41d91>,/* anonymous */=0x2eb5b04de3c9 <Object map = 0x2eb519d54271>,/* anonymous */=46,/* anonymous */=49042074...

FATAL ERROR: invalid array length Allocation failed - JavaScript heap out of memory
 1: node::Abort() [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 2: node::FatalException(v8::Isolate*, v8::Local<v8::Value>, v8::Local<v8::Message>) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 3: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 4: v8::internal::Heap::AllocateUninitializedFixedDoubleArray(int, v8::internal::PretenureFlag) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 5: v8::internal::Factory::NewFixedDoubleArray(int, v8::internal::PretenureFlag) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 6: v8::internal::(anonymous namespace)::ElementsAccessorBase<v8::internal::(anonymous namespace)::FastPackedDoubleElementsAccessor, v8::internal::(anonymous namespace)::ElementsKindTraits<(v8::internal::ElementsKind)4> >::GrowCapacityAndConvertImpl(v8::internal::Handle<v8::internal::JSObject>, unsigned int) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 7: v8::internal::(anonymous namespace)::ElementsAccessorBase<v8::internal::(anonymous namespace)::FastPackedDoubleElementsAccessor, v8::internal::(anonymous namespace)::ElementsKindTraits<(v8::internal::ElementsKind)4> >::Add(v8::internal::Handle<v8::internal::JSObject>, unsigned int, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, unsigned int) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 8: v8::internal::JSObject::AddDataElement(v8::internal::Handle<v8::internal::JSObject>, unsigned int, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::internal::Object::ShouldThrow) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 9: v8::internal::Runtime_SetProperty(int, v8::internal::Object**, v8::internal::Isolate*) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
10: 0x27bbf988463d
[1]    29385 abort      node index.js

rochars commented 6 years ago

Version 7 should be able to handle files of around 400 MB. You can try to load them online: https://rochars.github.io/wavefile/example/

chrisspiegl commented 6 years ago

Sounds great, I'd love to test that, but drag and drop does not work in the latest Chrome, which I am using, and there is no alternative button to load a file. For me, dropping the file just opens it in the Chrome tab without the wave player.

chrisspiegl commented 6 years ago

I updated my test script and tested it with version 7.0.1 of WaveFile. It seems the problem persists, at least with files larger than 400 MB. I'd hope to see a version that can go beyond that.

Maybe there could be a different loading path for metadata editing versus full file encoding. I have read that the metadata can be edited with C libraries without touching most of the file. I might be wrong on this.
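
For what it's worth, the RIFF layout supports this idea for reading: cue points live in their own small 'cue ' chunk, separate from the audio in the data chunk. The body of that chunk is a 4-byte count followed by 24-byte records, so it can be decoded from a few bytes. The sketch below is not WaveFile code; parseCueChunk and the property names in the returned objects are made up for illustration. Writing changed cue points back may still require rewriting later parts of the file if the chunk size changes.

```javascript
// Decode the body of a 'cue ' chunk (the bytes after its 8-byte header).
// Layout: uint32 count, then per cue point six 4-byte fields.
function parseCueChunk(body) {
  const count = body.readUInt32LE(0);
  const points = [];
  for (let i = 0; i < count; i++) {
    const base = 4 + i * 24;
    if (base + 24 > body.length) break; // truncated chunk: stop early
    points.push({
      id: body.readUInt32LE(base),
      position: body.readUInt32LE(base + 4),
      dataChunkId: body.toString('ascii', base + 8, base + 12),
      chunkStart: body.readUInt32LE(base + 12),
      blockStart: body.readUInt32LE(base + 16),
      sampleOffset: body.readUInt32LE(base + 20),
    });
  }
  return points;
}

// Small self-check with a hand-built chunk holding one cue point.
const demo = Buffer.alloc(28);
demo.writeUInt32LE(1, 0);        // one cue point
demo.writeUInt32LE(1, 4);        // id
demo.writeUInt32LE(44100, 8);    // position
demo.write('data', 12, 'ascii'); // chunk the point refers to
demo.writeUInt32LE(44100, 24);   // sample offset
console.log(parseCueChunk(demo));
```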

Thanks for your work, it's great to see progress being made!

This is my output:

 % node index.js                                                         !1357
WaveFile processing started 2018-06-28T00:48:37.802Z
Loading WaveFile:  ./test-cat-org.wav
Reading WaveFile

<--- Last few GCs --->

[18536:0x104000600]     2563 ms: Scavenge 391.6 (398.9) -> 387.6 (398.9) MB, 0.1 / 0.0 ms  allocation failure
[18536:0x104000600]     2568 ms: Scavenge 391.6 (398.9) -> 387.6 (398.9) MB, 0.1 / 0.0 ms  allocation failure
[18536:0x104000600]     2574 ms: Scavenge 391.6 (398.9) -> 387.6 (398.9) MB, 0.1 / 0.0 ms  allocation failure
[18536:0x104000600]     2580 ms: Scavenge 391.6 (398.9) -> 387.6 (398.9) MB, 0.0 / 0.0 ms  allocation failure

<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x3269f7aa5ee1 <JSObject>
    1: push(this=0x3269c515a4c9 <JSArray[50139473]>)
    2: unpackArrayFrom [/Users/spieglio/Desktop/cue/node_modules/byte-data/dist/byte-data.cjs.js:~546] [pc=0x102718f068cd](this=0x3269032edaa1 <Object map = 0x326935b4eb21>,/* anonymous */=0x3269032eda09 <Uint8Array map = 0x326935b41d91>,/* anonymous */=0x3269c515a499 <Object map = 0x326935b52dd1>,/* anonymous */=46,/* anonymous */=490420746)
...

FATAL ERROR: invalid array length Allocation failed - JavaScript heap out of memory
 1: node::Abort() [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 2: node::FatalException(v8::Isolate*, v8::Local<v8::Value>, v8::Local<v8::Message>) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 3: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 4: v8::internal::Heap::AllocateUninitializedFixedDoubleArray(int, v8::internal::PretenureFlag) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 5: v8::internal::Factory::NewFixedDoubleArray(int, v8::internal::PretenureFlag) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 6: v8::internal::(anonymous namespace)::ElementsAccessorBase<v8::internal::(anonymous namespace)::FastPackedDoubleElementsAccessor, v8::internal::(anonymous namespace)::ElementsKindTraits<(v8::internal::ElementsKind)4> >::GrowCapacityAndConvertImpl(v8::internal::Handle<v8::internal::JSObject>, unsigned int) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 7: v8::internal::(anonymous namespace)::ElementsAccessorBase<v8::internal::(anonymous namespace)::FastPackedDoubleElementsAccessor, v8::internal::(anonymous namespace)::ElementsKindTraits<(v8::internal::ElementsKind)4> >::Add(v8::internal::Handle<v8::internal::JSObject>, unsigned int, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, unsigned int) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 8: v8::internal::JSObject::AddDataElement(v8::internal::Handle<v8::internal::JSObject>, unsigned int, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::internal::Object::ShouldThrow) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
 9: v8::internal::Runtime_SetProperty(int, v8::internal::Object**, v8::internal::Isolate*) [/Users/spieglio/.nvm/versions/node/v8.9.4/bin/node]
10: 0x102718d8463d
11: 0x102718dd5d95
[1]    18536 abort      node index.js

rochars commented 6 years ago

There is an alpha of version 8: npm install wavefile@8.0.0-alpha.0

I successfully tested reading and writing a 1.1 GB file. I'm working on the release. Thank you very much for your detailed input 👍

interleave() and deInterleave() were removed from the API, so watch out for that.

Also note that, at least for now, changing the bit depth or applying compression on huge files will throw the same memory error.

chrisspiegl commented 6 years ago

Thanks, I tested reading the file and it worked like a charm.

Not to worry. The only thing I am currently concerned with is changing the cue markers with a script (podcast editing and creating/updating show notes with the help of an editor). The script I am building is supposed to remove some re-encoding steps (currently the workflow is: markers from Audition => text file => edit them there => reimport into Audition => export the file => create chapter MP3 with Forecast by Marco Arment => upload).

Thanks for the support @rochars. Looking forward to spending some time on this over the next week.