commonmark / cmark

CommonMark parsing and rendering library and program in C
Other
1.6k stars 534 forks source link

Keep an incremental count of open block types #471

Closed kevinbackhouse closed 1 year ago

kevinbackhouse commented 1 year ago

This is a rebase of the solution that we implemented in cmark-gfm to fix https://github.com/github/cmark-gfm/security/advisories/GHSA-66g8-4hjf-77xh.

The idea is to incrementally keep an up-to-date count of the number of "open" blocks, categorized by node type, so that the loop in check_open_blocks can bail out early when it can see from the counts that the rest of the list doesn't contain any nodes that need to be checked.

I've fuzzed this solution extensively on cmark-gfm, but not on cmark.

Reproduction steps for the bug that this PR fixes:

python3 -c 'n = 10000; print(" -" * n + "x" + "\n" * n)' | cmark

Increasing the number 10000 in the above command causes the running time to increase quadratically.

kevinbackhouse commented 1 year ago

Out of interest, I'm surprised that the checks ran automatically. I'm a first-time contributor to cmark, so normally would have to wait for approval to run the checks. Have you modified something in the settings to allow the checks to run for first-time contributors?

nwellnhof commented 1 year ago

This fix is incomplete. Please don't merge yet.

@kevinbackhouse See my report at HackerOne.

jgm commented 1 year ago

Yes, I've modified the settings so that no approval is needed except for new GH-ers.

kevinbackhouse commented 1 year ago

I haven't seen the HackerOne report yet (I'm not on the team that has access), but I've independently figured out what the problem is. I think it means that this strategy of counting the number of open blocks won't work, and we have to go back to the original solution of putting an arbitrary limit on the nesting depth.