0xPolygonID / polygonid-flutter-sdk

Flutter Plugin to use the Polygon ID SDK
Apache License 2.0
40 stars 27 forks source link

free memory allocated by Golang using C.CString() #390

Closed yushihang closed 5 months ago

yushihang commented 6 months ago

close #391

plusema86 commented 6 months ago

@olomix can you take a look a this PR?

olomix commented 6 months ago

Can we simply call the free function directly from Dart and Swift instead of wrapping it in a Go library and calling it from there? It seems a bit unusual to me.

5eeman commented 6 months ago

I'm actually not sure why we're not making something like:

malloc.free(response.value);

Maybe this way we could also get rid of PLGNFreeStatus?

Or I don't know some background of this

plusema86 commented 6 months ago

@yushihang we have already merged the PR 386 with malloc.free() , is it enough?

yushihang commented 6 months ago

@olomix @5eeman @plusema86

Thank you very much for your response and suggestions. For me, having the opportunity to participate in open-source code work is a great learning opportunity.

A concise description of the approach for this PR is: "Following the RAII (Resource Acquisition Is Initialization) pattern, who allocates, who releases".

I will try to express my understanding of this memory management case using the diagram A below:

flutter-release-memory-A

The steps in the diagram are as follows:

  1. Flutter/Dart allocates a block of memory with the size of a pointer to receive return data from Golang

    ffi.Pointer<ffi.Pointer<ffi.Char>> response =
         malloc<ffi.Pointer<ffi.Char>>();

    In other program languages, this variable is possibly allocated on the stack or in a register. There is an issue discussing why Flutter needs to allocate it on the heap using malloc: https://github.com/dart-lang/sdk/issues/42509

  2. Flutter call function provided by Golang, pass the pointer address(response) to Golang:

    int res = PolygonIdCore.nativePolygonIdCoreLib
        .PLGNCreateClaim(response, in1, status);
  3. Golang allocate memory for returning data, write data(string(respB)) into the memory, and point jsonResponse to the memory. (This memory will not be managed by Golang's memory garbage collector)

    *jsonResponse = C.CString(string(respB))
  4. Flutter side return data from Golang allocated memory (jsonResponse)

    ffi.Pointer<ffi.Char> jsonResponse = response.value;
  5. After jsonResponse is no longer needed, it should be released. So we call another function from Flutter/dart to notify Golang to release the memory allocated for jsonResponse in step 3.

    PolygonIdCore.nativePolygonIdCoreLib.PLGNFreeCString(jsonResponse);
  6. Golang release memory for jsonResponse

    C.free(unsafe.Pointer(str))
  7. Flutter /Dart release to memory allocated for the pointer (response) (allcated in step 1)

    malloc.free(response);

After following this procedure, all memory on the heap that was dynamically allocated (the green and red parts in Diagram A) will be properly released, and there will be no memory leak.

PR #386 handles the release logic for the green block of memory in Diagram A, while the current PR handles the release logic for the red block of memory.

Another point that needs to be discussed for current PR is "why we need to wrap a function in Go to release jsonResponse, instead of calling the C free() function directly in flutter/dart". Is that right?

I have attempted to illustrate this implementation method using Diagram B:

flutter-release-memory-B

As can be seen from the diagram, steps 5 and 6 in Diagram A have been combined into a new step 5, which means to call C-style-free() directly from the Flutter/dart side to release the memory of jsonResponse.

In fact, I have considered this method when dealing with the release logic of the red block memory in the Diagram B. For this case, it works. I also found a free() wrapper that is already implemented in the current Polygonid-flutter-sdk code:

```dart
late final _freePtr =
  _lookup<ffi.NativeFunction<ffi.Void Function(ffi.Pointer<ffi.Void>)>>(
      'free');
late final _free =
  _freePtr.asFunction<void Function(ffi.Pointer<ffi.Void>)>();

ffi.Pointer<ffi.Void> realloc(
    ffi.Pointer<ffi.Void> __ptr,
    int __size,
) {
    return _realloc(
    __ptr,
    __size,
    );
}
```

The main difference between Diagram A & B is : Should "the memory allocated by Golang from OS" be released by Golang self, or can it be released by Other module(Flutter)?

The reason for adopting the method in Diagram A in current pull request, i.e. releasing jsonResponse through Go instead of flutter/dart, is mainly due to two considerations:

  1. The release of PLGNStatus and the handling of C-Style String in Rust both follow the pattern in Diagram A, i.e. flutter does not directly call the c free function to release this part of the memory, but instead releases it through the lib that allocates this memory (golang/rust). Therefore, I have also adopted this design.

    //export PLGNFreeStatus
    func PLGNFreeStatus(status *C.PLGNStatus) {
       _, cancel := logAPITime()
       defer cancel()
    
       if status == nil {
           return
       }
    
       if status.error_msg != nil {
           C.free(unsafe.Pointer(status.error_msg))
       }
    
       C.free(unsafe.Pointer(status))
    }

    https://github.com/0xPolygonID/polygonid-flutter-sdk/blame/ad8b622c922c3ac9d86662e7cdf3c22823b592bd/rust/src/lib.rs#L583-L589

    #[no_mangle]
    pub extern fn cstring_free(str: *mut c_char) {
       unsafe {
           if str.is_null() { return }
           CString::from_raw(str)
       };
    }
  2. In my own understanding, although it seems that flutter can directly call c free in this case, but if we consider the following situations, the pattern in Diagram A may be more universal.

    • In case that we add a string variable to PLGNStatus, if we adopt the method in Diagram A, we only need to modify golang, and there is no need to change flutter, which can prevent memory leaks.
    typedef struct _PLGNStatus
    {
           PLGNStatusCode status;
           char *error_msg;
           char *new_string; //added new string variable
    } PLGNStatus;
    
    func maybeCreateStatus(status **C.PLGNStatus, code C.PLGNStatusCode,
       msgFmt string, msgParams ...interface{}) {
    
       if status == nil {
           return
       }
    
       s := (*C.PLGNStatus)(C.malloc(C.sizeof_PLGNStatus))
       if s == nil {
           return
       }
    
       if msgFmt != "" {
           msg := fmt.Sprintf(msgFmt, msgParams...)
           s.error_msg = C.CString(msg)
       } else {
           s.error_msg = nil
       }
    
       s.status = code
    
       s.new_string = C.CString(xxxxxx) //new memory block allocated here
    
       *status = s
    }
    
       //export PLGNFreeStatus
       func PLGNFreeStatus(status *C.PLGNStatus) {
       _, cancel := logAPITime()
       defer cancel()
    
       if status == nil {
           return
       }
    
       if status.error_msg != nil {
           C.free(unsafe.Pointer(status.error_msg))
       }
    
       if status.new_string != nil {
           //free the newly added variable in Golang
           C.free(unsafe.Pointer(status.new_string))
       }
    
       C.free(unsafe.Pointer(status))
    }

    If we use the method in Diagram B, the flutter sdk code also needs to add a free() for new_string synchronously.

    In this way, we must maintain synchronous modifications in two different repositories. If there is no synchronous modification, it will cause memory leaks or other memory management problems.

    In particular, memory leaks, because they may not have an obvious impact on the running results during the development phase, so it is difficult to discover in a timely manner.

    • Regarding the PLGNFreeCString() function that will be added in this PR, there are also the following scenarios:

    PLGNFreeCString() does not only provide a C.free() function for Flutter, but it should be understood as follows:

    1. Since the memory is allocated from the OS by Golang, Flutter does not release it directly.
    2. Flutter/Dart calls PLGNFreeCString() to inform Golang that the memory block that Golang allocated will not be used anymore by Flutter/Dart.
    3. Golang can handle this memory block subsequently. (Golang can then free the memory block by calling free(), or manage the memory block in other ways.)

    Another situation is that memory allocation is not only done through C.CString()/malloc(), but can also be done through a method such as boost::pool, which pre-allocates a large block of memory and manages it through internal marking, without needing to call malloc()/free() provided by OS every time an allocate/release function is called.

    In such cases, Golang may replace the current C.CString()/C.free() calls by linking to a third-party library. If this situation arises, it is not appropriate for Flutter to continue calling C-style-free().

    If we adopt Diagram A's approach, which is the approach taken in this PR, where the responsibility for allocating and releasing memory is borne by Golang, and the actual allocate/release method used is also determined by Golang.

    Then the Flutter code does not need to be modified, even if the native library changes this part of the memory management method. This approach is more in line with the RAII principle of "who allocates, who releases".

A summary in one sentence is: "Memory allocated by Flutter is released by Flutter (green blocks in Diagram A), and memory allocated by Golang is released by Golang (red blocks in Diagram A). "

Actually, the above content is not limited to Flutter and Golang. Any cross-programming language memory management will encounter such situations.

Another issue is: for the module that allocates a block of memory, if this block of memory is released by other modules/languages, but the allocator module has not received a notification, this can also bring potential problems to memory management.

olomix commented 5 months ago

There are two kinds of memory Go can allocate. First, from the internal Go-specific memory allocator that is under Go Garbage Collector control. And second, from the OS that is not under control of the Go garbage collector. In our c-polygonid library we are trying not to return Go-controlled memory. So whatever you do, you can't cause a memory leak inside Go runtime, only in system memory (I hope).

Then other issue is where should we put the code that releases the memory. In C world, if you return a pointer to the complex data structure with embedded pointers, then you should provide a function to release such complex structure. But if you return a pointer to a simple data structure as a string (char ) or an array of ints (int ) and this memory could be released with just a call to a free function, then you do not need to provide a separate function for this.

We design our c-polygonid library to work with strings only: for input and output data. The only "complex" structure we provide is PLGNStatus, that contains embedded pointers to strings with error messages. And we provide a function PLGNFreeStatus to free it. But still this struct does not uses Go-specific memory and you could free it by yourself if you know it's structure and know on what fields you should call free.

There are a lot of functions in standard library that return strings: like asprintf in stdio.h and they do not provide additional function to free those strings. Those allocated strings are supposed to be release with just calling the free function.

yushihang commented 5 months ago

Thank you for the discussion and insights~

So whatever you do, you can't cause a memory leak inside Go runtime, only in system memory (I hope).

Yes, currently, the memory leak only occurs in the memory allocated by C.CString() (red memory in diagram), and not in the memory managed by Go's Garbage Collector.

There are a lot of functions in standard library that return strings: like asprintf in stdio.h and they do not provide additional function to free those strings. Those allocated strings are supposed to be release with just calling the free function.

I agree with your point in c-world.

However, the example is within the same programming language or same module, sometimes this is the chosen implementation because there is no better alternative.

  1. In our case, it's malloc/free across different modules/libraries and even different programming languages, which leads to memory allocation and deallocation operations not being paired in each module.

    A reader of this code might be confused as to why Golang only has C.CString() without C.free(), while Flutter has free() but lacks a corresponding malloc().

  2. Let's take a look at the documentation for asprintf.

    The functions asprintf() and vasprintf() are analogs of
       sprintf and vsprintf, except that they allocate a string
       large enough to hold the output including the terminating null
       byte ('\0'), and return a pointer to it via the first argument.
       This pointer should be passed to free() to release the allocated
       storage when it is no longer needed.

    The Standard C library provides asprintf() and free(), so it require the caller using free() to release the memory allocated in asprintf().

Correspondingly, C-polygonid may provide PLGNCreateClaim() and PLGNFreeCString(), and require the caller using PLGNFreeCString() to release to memory allocated in PLGNCreateClaim(). That sounds reasonable as well.

Considering that c-polygonid provides a C-style API, it may not only be called by Flutter, which means that when calling c-polygonid functions from any other programming language, they will also face this problem (directly calling the language's wrapper for C.free, but without a corresponding malloc call for this free() call ).

  1. Indeed, in cases where we have already constrained Golang to exclusively use C.CString() for memory allocation from the operating system, there won't be any technical issues with calling the corresponding C.free() in Flutter(). It is the same for both string and PLGNStatus. (Assuming that we ensure the allocation and deallocation operations are paired on both language/libs)

    What we're discussing here is more of a design choice~

  2. Additionally, if simple strings can be directly freed by Flutter, but PLGNStatus requires PLGNFreeStatus(), this may lead to inconsistent code styles. However, if we individually free the strings in PLGNStatus in Flutter, it becomes overly coupled. Furthermore, babyjubjub(rust) already provides a cstring_free() functionhttps://github.com/0xPolygonID/polygonid-flutter-sdk/blame/ad8b622c922c3ac9d86662e7cdf3c22823b592bd/rust/src/lib.rs#L584.

    This means that in the Flutter side of the code, there will be some direct calls to C.free (without corresponding malloc), while in others similar codes , PLGNFreeStatus()/cstring_free() is used, will that lead to some inconsistency in coding style? And that's actually why I didn't directly call C.free() on the Flutter side in this PR, haha.

olomix commented 5 months ago

You stress on a "Go language" as something important. The c-polygonid library in the first place is supposed to be just a C-library. It doesn't specifically points that it is written in Go language. So you should work with it as with a C library. Everybody who use it should be familiar only with working with other C libraries and can know nothing about the Go language. That is why, it does not return objects that you can't free by yourself without returning them to the library. You talk about some "exotic" types like C.CString. c-polygonid does not return some CString type. It returns char * type. And you can operate on this data as on any other char pointer in C language of standard library. I prefer to keep it this way as long as possible without complications to return the data where it was created.

As to confusion, from my point of view, it can be confusing why I can't release pointer to char just by calling free. A lot of C programmers may even miss the documentation notion, that you are supposed to call some specific function to release this type of memory.

I can show you another confusing conversation from the other side if function PLGNFreeCString would exist:

[User] What is CString stands for in PLGNFreeCString? [Me] That is a function name that created a char* string inside a c-polygonid library [User] Why should I use PLGNFreeCString instead of free [Me] You shouldn't, they are the same. [User] Then why PLGNFreeCString exists? [Me] Em...

You are right about the usage of the PLGNFreeStatus function. Better to use it always, not to free PLGNStatus by yourself. You can (if you know the structure of the PLGNStatus), but for consistency and convenience, better to use PLGNFreeStatus. If we will add some fields to the PLGNStatus structure in the future, your custom free functionality may be broken at some time. And in this case we provide a new type PLGNStatus and also we provide a function to release this type. In case of char* — we do not provide this type. It is standard.

About babyjubjub cstring_free(). I'm not a Rust expert. But if I understand Rust documentation correctly, Rust returns strings that located inside a Rust's memory. So you can't just release it with a free function. You need to return it back to Rust. That is not a case for the c-polygonid library. It think it possible to return string not from the Rust-owned memory, but original authors of those library chose this way to go. May be for performance, I don't know.

yushihang commented 5 months ago

You talk about some "exotic" types like C.CString

There might be a miscommunication here. I was not talking about C.CString() type, but rather the Golang functions used to allocate memory for C-type variables in the c-polygonid library, such as the following calls:

https://github.com/0xPolygonID/c-polygonid/blob/cc0573dddca20fb0b587b4e415f1b7f9c7a32132/cmd/polygonid/polygonid.go#L422

*jsonResponse = C.CString(string(respB))

The c-polygonid library in the first place is supposed to be just a C-library. It doesn't specifically points that it is written in Go language. So you should work with it as with a C library.

I actually don't think the language used in c-polygonid, be it Go, Rust, or C, would affect this conclusion because my point is who allocates should take the responsibility to release~

[User] What is CString stands for in PLGNFreeCString? [Me] That is a function name that created a char* string inside a c-polygonid library [User] Why should I use PLGNFreeCString instead of free [Me] You shouldn't, they are the same. [User] Then why PLGNFreeCString exists? [Me] Em...

This conversation is a good point to discuss~

Suppose we add documents/comments to PLGNCreateClaim() like asprintf()

//`jsonResponse` should be passed to `PLGNFreeCString()` to release 
//    the allocated storage when it is no longer needed.
extern GoUint8 PLGNCreateClaim(char** jsonResponse, char* in, PLGNStatus** status);

As a programmer calling PLGNCreateClaim(), I would certainly be curious and look into the implementation of PLGNFreeCString(), but I would not choose to directly call free by my side.

Because the caller don't need to maintain the implementation of PLGNFreeCString() and PLGNCreateClaim(), the c-polygonid library will ensure consistent memory management. As long as PLGNFreeCString() and PLGNCreateClaim() are called in pairs according to the document. Everything will be ok.

[User] Why should I use PLGNFreeCString instead of free [Me] You shouldn't, they are the same. As to confusion, from my point of view, it can be confusing why I can't release pointer to char just by calling free

I got your point~

We cannot prevent someone from manually calling free() even if there is a provided function such as PLGNFreeStatus() for memory management. As we discussed earlier, anyone can directly call free() to release the string variable in the PLGNStatus struct instead of calling PLGNFreeStatus().

It is not an issue of "right or wrong", but of "right or better". It's true we can call free() to free the memory in flutter, but an allocate/release pair in a same module will be better to understand and maintain.

In our case, the allocate/release pair in flutter will be: PLGNCreateClaim()/PLGNFreeCString() the allocate/release pair in c-polygonid will be: *jsonResponse = C.CString(string(respB)) / C.free(unsafe.Pointer(str))

so the conversation may be:

[User] What is CString stands for in PLGNFreeCString? [Me] That is a C-standard char* allocated inside c-polygonid library,when you call PLGNCreateClaim() or other similiar functions ealier.

The name PLGNFreeCString() is consistent with the cstring_free() used in the Rust bjj library. This may be helpful for users calling bjj/c-polygonid together, as it suggests that the two functions have similar functionality.

[User] Why should I use PLGNFreeCString instead of free [Me] You can choose to call PLGNFreeCString or free. But if you call PLGNCreateClaim()/ PLGNFreeCString() in pair, it should be better, by avoiding a non-paired free() function exists in your code. Furthermore, when calling the c-polygonid functions from a programming language other than C or Flutter, you do not need to search for a language-specific wrapper for free(). Instead, you can directly call the function PLGNFreeCString provided by us.

[Me] If the implementations of PLGNCreateClaim()/PLGNFreeCString() were to change in the future, you would not need to modify your code, because c-polygonid will always ensure that as long as these two functions are called in pairs, everything will be fine.

[Me] If there is a potential memory leak, for example, we suspect that the memory allocated by the C.CString() function in the golang side has not been completely released. If the callers choose to call PLGNCreateClaim()/PLGNFreeCString() in pairs, we can add logging to print out allocate/release memory addresses or allocate/release function call counts, to locate any missing corresponding calls. However, if memory allocation and release are called in different libraries, debugging will become more difficult.

. . . .

During our discussion, I've gained valuable insights into your design for c-polygon. Once again, thank you for your insights and suggestions! @olomix