commonmark / cmark

CommonMark parsing and rendering library and program in C
Other
1.63k stars 546 forks source link

Creating a PHP wrapper example #470

Closed CViniciusSDias closed 6 months ago

CViniciusSDias commented 1 year ago

Adding a PHP wrapper example using FFI.

Instead of using stdin, I chose to have a default markdown just to be more simple to execute the test. But that can be changed if preferred.

CViniciusSDias commented 6 months ago

Hey there, @jgm I just rebased this old PR since it was failing even though it's just the addition of an example. 😅

I'm just pinging you because since this is an old PR, I know it's easy for you to lose track of it.

nwellnhof commented 5 months ago

This leaks memory just like the Python and Ruby wrappers, see #544.

CViniciusSDias commented 5 months ago

I'll add the free call, but PHP frees it after the variable goes out of scope, so there's no memory leak here. :-D

jgm commented 5 months ago

Ideally these wrappers should define a function that can be imported and used in other code. (Perhaps in a long-running process.) The rb and py wrapper examples do that; your PHP example doesn't even try to do it, but it would be better to. In that case freeing the memory would be important. [Maybe? I don't know anything about PHP, so perhaps even in this case it would be freed automatically?]

CViniciusSDias commented 5 months ago

Yes, it would be freed automatically because the variable would be created inside that function and after the function's execution it leaves out of scope. But I'll still create the PR today adding the free call because it's better to have that explicitly so that people know that it's a resource that can/should be freed when used in a context where the variable will live.

nwellnhof commented 5 months ago

Yes, it would be freed automatically

No, it wouldn't. You receive a char pointer from libcmark and the FFI implementation cannot know whether or how the data should eventually be freed.

CViniciusSDias commented 5 months ago

PHP's implementation of FFI makes that CData pointers free their resources once they go out of scope.

CViniciusSDias commented 5 months ago

https://github.com/php/php-src/blob/ab589e4481f0cf35c8773e0c64dccc35b8870ae1/ext/ffi/ffi.c#L2389

nwellnhof commented 5 months ago

The FFI implementation only frees "owned" data which was allocated by FFI itself. It cannot know about ownership of random pointers returned from C functions. Simply run the wrapper with valgrind to see the leak:

$ LD_PRELOAD=build/src/libcmark.so LD_LIBRARY_PATH=build/src valgrind --leak-check=full php wrappers/wrapper.php
==5016== Memcheck, a memory error detector
==5016== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==5016== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==5016== Command: php wrappers/wrapper.php
==5016== 
==5016== 
==5016== HEAP SUMMARY:
==5016==     in use at exit: 96 bytes in 2 blocks
==5016==   total heap usage: 29,902 allocs, 29,900 frees, 4,821,847 bytes allocated
==5016== 
==5016== 80 bytes in 1 blocks are definitely lost in loss record 2 of 2
==5016==    at 0x484DCD3: realloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==5016==    by 0x4861177: xrealloc (cmark.c:23)
==5016==    by 0x4860AB1: cmark_strbuf_grow (buffer.c:56)
==5016==    by 0x48609D0: S_strbuf_grow_by (buffer.c:34)
==5016==    by 0x4860C89: cmark_strbuf_put (buffer.c:104)
==5016==    by 0x4862D31: houdini_escape_html (houdini_html_e.c:56)
==5016==    by 0x4863433: escape_html (html.c:20)
==5016==    by 0x4863EA1: S_render_node (html.c:224)
==5016==    by 0x48643F4: cmark_render_html (html.c:339)
==5016==    by 0x4861206: cmark_markdown_to_html (cmark.c:44)
==5016==    by 0x8F75E2D: ???
==5016==    by 0x8F72492: ???
==5016== 
==5016== LEAK SUMMARY:
==5016==    definitely lost: 80 bytes in 1 blocks
==5016==    indirectly lost: 0 bytes in 0 blocks
==5016==      possibly lost: 0 bytes in 0 blocks
==5016==    still reachable: 16 bytes in 1 blocks
==5016==         suppressed: 0 bytes in 0 blocks
==5016== Reachable blocks (those to which a pointer was found) are not shown.
==5016== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==5016== 
==5016== For lists of detected and suppressed errors, rerun with: -s
==5016== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
CViniciusSDias commented 5 months ago

Yeah, I just saw the following line, which only cleans owned data: https://github.com/php/php-src/blob/ab589e4481f0cf35c8773e0c64dccc35b8870ae1/ext/ffi/ffi.c#L2392C19-L2392C22