Emurgo / cardano-serialization-lib

This is a library, written in Rust, for serialization & deserialization of data structures used in Cardano's Haskell implementation of Alonzo along with useful utility functions.
Other
234 stars 125 forks source link

Memory managment of serialization lib on javascript #542

Closed AngelCastilloB closed 1 year ago

AngelCastilloB commented 1 year ago

Hi, I am currently trying to implement proper memory management and disposal of the CSL objects in the projects I have that use this library, but I would like to understand how far I need to go to properly dispose of all the objects, do I need to call free on every object that I create using one of the constructors, or is it enough to free the parent object?, for example, when creating a transaction we need to provide all sort of objects (redeemers, datums, signatures, input, outputs etc), is it enough to free the transaction where these objects have been added to it or we must free all of the objects individually? The same question goes for objects we get from other objects, do we need to manually free all these objects returned by the getters, or is it enough to free the parent object?.

vsubhuman commented 1 year ago

@AngelCastilloB , you do need to free every single object ever produced. Nearly always any data passed into a Rust call or received from Rust is cloned, to stop consuming of values from JS and preserve the usual JS code expectations. This means that there's no direct link between the value you have created in your JS code, e.g. a datum, and the value that is contained within a transaction struct, once you have passed that datum in - that information is cloned at the moment of passing it into a setter.

AngelCastilloB commented 1 year ago

thanks @vsubhuman for the clarification. I just ran into a memory leak that it seems I cant control. from_bech32 from the address object leaks if an invalid address is passed, and doesn't returns an object or a pointer that I can free, this code reproduces the issue:

while (true) {
 try {

// Throws and doesnt returns the object
CSL.Address.from_bech32('37btjrVyb4KC6N6XtRHwEuLPQW2aa9JA89gbnm67PArSi8E7vGeqgA6W1pFBphc1hhrk1WKGPZpUbnvYRimVLRVnUH6M6d3dsVdxYoAC4m7oNj7Dzp')
 } catch(error) {
   console.log(error);
 }
}

After a few seconds you start getting out of bounds error and after that I cant use the library anymore. This is a problem, because we are using this from_bech32 to precisely detect if an address is valid or not, which means eventually if enough invalid address are passed our service stops working, and I cant seems to find a way around this.

vsubhuman commented 1 year ago

@AngelCastilloB , if I remember things correctly the issue is not in this function specifically but in the bindgen processing of JsError from Rust in general. Any time any JS error is thrown from a Rust/WASM call - it leaks memory, as the Rust produced error object is never returned into JS and instead is converted into a JS native thrown error so it cannot be freed.

The only solution to this is to not rely on errors as business-logic, instead check values separately and then only call the function when you expect for the error to not happen.

We can add is_valid_bech32(str) -> bool function, if needed, which will not produce an error and will only return a boolean.

rooooooooob commented 1 year ago

@vsubhuman @AngelCastilloB I believe this is what you're referring to rustwasm/wasm-bindgen#1963 which has been fixed already.

Just update the dependency, no need to add functionality.

rooooooooob commented 1 year ago

Note: at least when we updated it in CML, it had some changes in how errors were handled (since there are both JsValue and JsError types in wasm_bindgen but we also had our own JsError type and some blanket use wasm_bindgen::*; statements made things use that instead). I am not sure if CSL will run into that exact issue since we did quite a bit of refactoring, but just a heads up.

vsubhuman commented 1 year ago

Nice! Thank you, @rooooooooob! We'll look into updating then and might have this resolved by the next version

vsubhuman commented 1 year ago

while (true) { try {

// Throws and doesnt returns the object CSL.Address.from_bech32('37btjrVyb4KC6N6XtRHwEuLPQW2aa9JA89gbnm67PArSi8E7vGeqgA6W1pFBphc1hhrk1WKGPZpUbnvYRimVLRVnUH6M6d3dsVdxYoAC4m7oNj7Dzp') } catch(error) { console.log(error); } }

@AngelCastilloB , this is actually my mistake to agree this is an issue that needs fixing. In your example the memory leak is not caused by the library at all. In the current latest library version just having a raised error does not cause a leak and if you change your example to something like this, it will not demonstrate a leak:

function parseBech32(str) {
    try {
      // Throws and doesnt returns the object
      return CSL.Address.from_bech32(str);
    } catch(e) {}
    return null;
}

while (true) {
  const a = parseBech32('37btjrVyb4KC6N6XtRHwEuLPQW2aa9JA89gbnm67PArSi8E7vGeqgA6W1pFBphc1hhrk1WKGPZpUbnvYRimVLRVnUH6M6d3dsVdxYoAC4m7oNj7Dzp');
  if (a != null) {
    console.log('address: ', a);
  }
}

There will be no memory leak visible. The problem in your code example is that logging an object in JS stops it from being garbage collected, so you just force all produced error instances to be stored forever. Even if you try logging anything, like an empty string, on every single while (true) iteration you will get a huge memory leak, because console logging is just that expensive. E.g.:

while (true) {
  const a = parseBech32('37btjrVyb4KC6N6XtRHwEuLPQW2aa9JA89gbnm67PArSi8E7vGeqgA6W1pFBphc1hhrk1WKGPZpUbnvYRimVLRVnUH6M6d3dsVdxYoAC4m7oNj7Dzp');
  if (a == null) {
    console.log('failed to parse');
  }
}

Causes a leak. But:

let i = 0;
while (true) {
  const a = parseBech32('37btjrVyb4KC6N6XtRHwEuLPQW2aa9JA89gbnm67PArSi8E7vGeqgA6W1pFBphc1hhrk1WKGPZpUbnvYRimVLRVnUH6M6d3dsVdxYoAC4m7oNj7Dzp');
  if (a == null) {
    if (i++ % 100000 === 0) {
      console.log('failed to parse 100K times');
    }
  }
}

Does not cause a leak (actually does, but just 100K times slower).

AngelCastilloB commented 1 year ago

Hi @vsubhuman thanks for the clarification, this was just an example, our production code does not has that log line, our problem is that the library crashes with an out of bound memory error after a while, we tracked down to that CSL.Address.from_bech32(str);. Because of this we have been cleaning up our code to free every object, but even doing so didn't work and the CSL library still crashes eventually. It was also surprising to us that we were getting this error because even if there is a leak, it shouldn't be this significant (our service crashes aprox 2 hours after we start it).

Btw I repeated the experiment with the cardano multiplatform lib, and the crash doesnt happen, I inspected the glue code generated by the CSL:

static from_bech32(bech_str) {
   var ptr0 = passStringToWasm0(bech_str, wasm.__wbindgen_malloc, wasm.__wbindgen_realloc);
   var len0 = WASM_VECTOR_LEN;
   var ret = wasm.address_from_bech32(ptr0, len0);
   return Address.__wrap(ret);
}

vs the CML:

/**
* @param {string} bech_str
* @returns {Address}
*/
static from_bech32(bech_str) {
   try {
       const retptr = wasm.__wbindgen_add_to_stack_pointer(-16);
       const ptr0 = passStringToWasm0(bech_str, wasm.__wbindgen_malloc, wasm.__wbindgen_realloc);
       const len0 = WASM_VECTOR_LEN;
       wasm.address_from_bech32(retptr, ptr0, len0);
       var r0 = getInt32Memory0()[retptr / 4 + 0];
       var r1 = getInt32Memory0()[retptr / 4 + 1];
       var r2 = getInt32Memory0()[retptr / 4 + 2];
       if (r2) {
           throw takeObject(r1);
       }
       return Address.__wrap(r0);
   } finally {
       wasm.__wbindgen_add_to_stack_pointer(16);
   }
}

And it seems that this extra code added is preventing the crash, which does makes me believes that is related to the bug in wasm_bindgen that @rooooooooob mentioned rather than the memory leak itself

vsubhuman commented 1 year ago

@AngelCastilloB , understood. Try version 11.1.1-alpha.1 please. It includes an update to the latest bindgen version, wonder if it will solve your issues.

AngelCastilloB commented 1 year ago

@vsubhuman I apologizes for the late reply, I did tested it and indeed it seems the problem is solved or at least improved, I could run my sample code for a long time without a crash

vsubhuman commented 1 year ago

@vsubhuman I apologizes for the late reply, I did tested it and indeed it seems the problem is solved or at least improved, I could run my sample code for a long time without a crash

@AngelCastilloB, thank you for the feedback! 🙌

vsubhuman commented 1 year ago

A non-beta version 11.1.1 with this change is now published

klntsky commented 1 year ago

We (MLabs) came up with a hack that ties JS objects to WASM-allocated memory using FinalizationRegistry, the goal is to guarantee that free method is called on all objects that are garbage-collected:

https://www.npmjs.com/package/csl-runtime-gc

Well, FinalizationRegistry does not provide strict guarantees, but it works surprisingly well for us in CTL.

lisicky commented 1 year ago

Thanks @klntsky ! It's a good package :) But have you tried to build CSL with WASM_BINDGEN_WEAKREF=1 env ? It might be better if you wanna to have automatic calling free function. This flag do similar things, see the link

klntsky commented 1 year ago

@lisicky thank you! We didn't know about it. Why isn't CSL compiled with it by default?|

UPD: actually WASM_BINDGEN_WEAKREF throws exceptions attempting to double-free at runtime when we run our tests