vmg / sundown

Standards compliant, fast, secure markdown processing library in C

Providing own memory functions (malloc, calloc, realloc, free). #107

Closed txdv closed 2 years ago

txdv commented 12 years ago

It would be nice to be able to provide your own memory functions. I am writing a C# wrapper, and with this functionality I could create all the byte arrays (C# byte[] arrays, which are virtual machine objects) and pass pointers to these C# objects to the C functions. This could improve performance dramatically compared to marshalling the buffers from the native to the managed world (currently I have to copy all the data in the buffers).

The memory functions in the markdown parser code are rarely used, while in the buffer code they are used all the time (at least I think so).

The interface would look like this:

struct buf *bufnew(malloc_cb, size_t);
void bufrelease(free_cb, struct buf *);
int bufgrow(realloc_cb, struct buf *, size_t);
void bufreset(free_cb, struct buf *);

Only 4 functions would have to change. Alternatively, we could add overloads for these 4 functions, so it would be up to the users of the library to choose whether they want to supply their own memory management functions.
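To make the overload idea concrete, here is a minimal sketch of how it might look. It is not the code from the library: `struct buf` is a simplified stand-in, the `_cb` suffixes and the callback typedefs are names made up for illustration, and the return values (0 on success, -1 on failure) are assumptions.

```c
#include <stdlib.h>

/* Hypothetical callback types, named after the parameters proposed above. */
typedef void *(*malloc_cb)(size_t);
typedef void *(*realloc_cb)(void *, size_t);
typedef void (*free_cb)(void *);

/* Simplified stand-in for the library's buffer, not its real definition. */
struct buf {
    unsigned char *data;
    size_t size;   /* bytes in use */
    size_t asize;  /* bytes allocated */
    size_t unit;   /* growth granularity */
};

/* Callback-taking variants (hypothetical names). */
struct buf *bufnew_cb(malloc_cb m, size_t unit)
{
    struct buf *b = m(sizeof *b);
    if (b) {
        b->data = NULL;
        b->size = b->asize = 0;
        b->unit = unit;
    }
    return b;
}

int bufgrow_cb(realloc_cb r, struct buf *b, size_t neosz)
{
    size_t neoasz;
    void *p;
    if (!b || !b->unit) return -1;
    if (b->asize >= neosz) return 0;          /* already large enough */
    neoasz = b->asize + b->unit;
    while (neoasz < neosz) neoasz += b->unit; /* round up to a multiple of unit */
    p = r(b->data, neoasz);                   /* assumes r(NULL, n) allocates, like realloc */
    if (!p) return -1;
    b->data = p;
    b->asize = neoasz;
    return 0;
}

void bufreset_cb(free_cb f, struct buf *b)
{
    if (!b) return;
    if (b->data) f(b->data);
    b->data = NULL;
    b->size = b->asize = 0;
}

void bufrelease_cb(free_cb f, struct buf *b)
{
    if (!b) return;
    if (b->data) f(b->data);
    f(b);
}

/* Overloads with the existing signatures, forwarding to the C allocator. */
struct buf *bufnew(size_t unit)          { return bufnew_cb(malloc, unit); }
int bufgrow(struct buf *b, size_t neosz) { return bufgrow_cb(realloc, b, neosz); }
void bufreset(struct buf *b)             { bufreset_cb(free, b); }
void bufrelease(struct buf *b)           { bufrelease_cb(free, b); }
```

Existing callers keep using the plain functions, while a wrapper can call the `_cb` variants with its own allocator.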

This is not very hard to do and I'll come up with a patch on my own, but I want to hear what you guys think.

vmg commented 12 years ago

This is a nice idea. I'm looking into it.

txdv commented 12 years ago

https://github.com/txdv/sundown/commit/db72b07f91f417a95a308a50eb7d5d125fce41f4

This is how I implemented it. Very simple and straightforward. Too bad performance dropped once I supplied my own allocation functions.

vmg commented 12 years ago

That does indeed look straightforward. Would you be so kind as to show me some benchmarks of the so-called performance drop?

Also: note that you're not initializing the malloc callbacks when calling the old bufnew, so the code will segfault.

txdv commented 12 years ago

The performance drop comes from the internals of the Mono runtime: calling delegates of .NET functions as function pointers (consumable from C) is slow. It is purely a Mono runtime performance hit (maybe I will have more luck with the .NET Framework). In other words, providing C# memory functions as C callbacks is slow, especially if they are called frequently, and in this particular case they are.

While it didn't work out in my case, the change could still be useful if you are using custom memory functions. It makes the struct two pointers bigger and adds maybe four more assembler instructions, so there is virtually no impact at the C level, while it provides some nice functionality (though I personally might have no use for it). I will try to implement it directly in the Mono virtual machine, just for the sake of fun.

On your note: I don't understand it; I tested it like that and it works fine. It turns out malloc is used only to create the struct buf, so we don't need to save it; realloc is used only for the data (initialization and resizing), and free is used to free both the data and the buf struct.
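For readers following along, a rough sketch of the layout described here: the struct keeps only the realloc and free callbacks (the two extra pointers), malloc is used once at creation, and the old-style `bufnew` fills in the C allocator so the stored callbacks are never left uninitialized. This is an illustration under assumptions, not the code from the linked commit; `bufnew_custom`, `realloc_fn`, `free_fn`, and the simplified `struct buf` are hypothetical names.

```c
#include <stdlib.h>

typedef void *(*malloc_cb)(size_t);
typedef void *(*realloc_cb)(void *, size_t);
typedef void (*free_cb)(void *);

/* Simplified stand-in for the buffer struct, with the two extra pointers. */
struct buf {
    unsigned char *data;
    size_t size;
    size_t asize;
    size_t unit;
    realloc_cb realloc_fn; /* used for the data: first allocation and resizing */
    free_cb free_fn;       /* used for both the data and the struct itself */
};

/* malloc is needed only here, so it is not stored in the struct. */
struct buf *bufnew_custom(malloc_cb m, realloc_cb r, free_cb f, size_t unit)
{
    struct buf *b;
    if (!m || !r || !f) return NULL; /* refuse missing callbacks up front */
    b = m(sizeof *b);
    if (!b) return NULL;
    b->data = NULL;
    b->size = b->asize = 0;
    b->unit = unit;
    b->realloc_fn = r;
    b->free_fn = f;
    return b;
}

/* The old entry point always initializes the callbacks with the C allocator. */
struct buf *bufnew(size_t unit)
{
    return bufnew_custom(malloc, realloc, free, unit);
}

int bufgrow(struct buf *b, size_t neosz)
{
    size_t neoasz;
    void *p;
    if (!b || !b->unit) return -1;
    if (b->asize >= neosz) return 0;
    neoasz = b->asize + b->unit;
    while (neoasz < neosz) neoasz += b->unit;
    p = b->realloc_fn(b->data, neoasz); /* assumes realloc_fn(NULL, n) allocates */
    if (!p) return -1;
    b->data = p;
    b->asize = neoasz;
    return 0;
}

void bufrelease(struct buf *b)
{
    free_cb f;
    if (!b) return;
    f = b->free_fn;          /* copy the callback before the struct is freed */
    if (b->data) f(b->data);
    f(b);
}
```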

txdv commented 12 years ago

OK, I have figured it out. It turns out that with some intelligent buffer handling I can avoid invoking realloc all the time (just set size = 0 instead of releasing the buffer). This makes using internal .NET objects for the buffers faster than the C malloc functions. Yeah!
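The size = 0 trick can be sketched roughly as below. This is only an illustration with a simplified `struct buf` and the plain C allocator; `bufclear` is a hypothetical helper name, and the real wrapper code may look quite different.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the buffer struct. */
struct buf {
    unsigned char *data;
    size_t size;   /* bytes in use */
    size_t asize;  /* bytes allocated */
    size_t unit;   /* growth granularity */
};

int bufgrow(struct buf *b, size_t neosz)
{
    size_t neoasz;
    void *p;
    if (!b || !b->unit) return -1;
    if (b->asize >= neosz) return 0;   /* capacity is enough: no realloc call */
    neoasz = b->asize + b->unit;
    while (neoasz < neosz) neoasz += b->unit;
    p = realloc(b->data, neoasz);
    if (!p) return -1;
    b->data = p;
    b->asize = neoasz;
    return 0;
}

int bufput(struct buf *b, const void *src, size_t len)
{
    if (!b) return -1;
    if (len == 0) return 0;
    if (bufgrow(b, b->size + len) < 0) return -1;
    memcpy(b->data + b->size, src, len);
    b->size += len;
    return 0;
}

/* Hypothetical helper: forget the contents but keep the allocation,
 * so the next document reuses the same memory. */
void bufclear(struct buf *b)
{
    if (b) b->size = 0;
}
```

As long as the next output fits into the capacity that is already allocated, `bufgrow` returns early and the allocator callback is never invoked.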

txdv commented 11 years ago

Could you look into this again? The performance hit in my first comment was caused by my own stupidity; after a few adjustments, this resulted in superior performance.