commonmark / cmark

CommonMark parsing and rendering library and program in C
Other
1.6k stars 527 forks source link

Allow parsing into an existing node #524

Closed Ericson2314 closed 5 months ago

Ericson2314 commented 5 months ago

Variation on #522 for same purpose. A good deal smaller. First commit is implementation, second commit is test.

Ericson2314 commented 5 months ago

@jgm @nwellnhof does this look better?

Ericson2314 commented 5 months ago

I noticed that cmark.3 was stale. I fixed it a a new preparatory commit, but I think it would be nice to either not commit the generated file, or have a CI action that ensures it is up to date.

jgm commented 5 months ago

cmark.3 is updated as part of make all. Since I use the Makefile for releases, it will always be updated before a release.

nwellnhof commented 5 months ago

I would prefer an API like

int cmark_parser_set_target(cmark_parser *parser, cmark_node *target);
cmark_node *cmark_parser_get_target(cmark_parser *parser);

mainly because it makes it easier to support this feature in language bindings. For memory management, the reference from the parser object to the node object must be tracked and the get_target function provides easy access to this reference.

Besides, it would get more and more clumsy if we wanted to support additional settings in the parser constructor.

I'm aware that it's cleaner to avoid mutation of parser state and to set all options when constructing the parser object. Functions like cmark_parser_set_target should return an error if they're called after cmark_parser_feed.

Ericson2314 commented 5 months ago

What if we had a struct cmark_parser_args where everything was nullable // optional? Flexible (don't deal with args you don't care about) and also no new stateful-ness for the parser itself?

nwellnhof commented 5 months ago

Exposing structs in public APIs should be avoided because modifying the struct in future versions would break the ABI. There are some tricks like adding a version field to the struct but that's rather ugly.

The cleanest solution is to have a separate object for parser settings, leading to something like

cmark_parser_settings *settings = cmark_parser_settings_new();
cmark_parser_settings_set_options(settings, CMARK_OPT_SMART);
cmark_parser_settings_set_target(settings, target);
cmark_parser *parser = cmark_parser_new_with_settings(settings);
// ...
cmark_parser_free(parser);
cmark_parser_settings_free(settings);

For a real-world example look at pthread_mutex_init and pthread_mutexattr_t in pthreads. But making such a change to the libcmark API seems a bit intrusive to me.

Ericson2314 commented 5 months ago

Ooo I like that. Making another opaque type to keep some state out of the parser feels good to me. Fewer edge cases, etc., and not too hard to to clients still.