Open stevengj opened 10 years ago
https://github.com/jgm/CommonMark#running-tests-against-the-spec
To run the tests using an executable $PROG:
python3 test/spec_tests.py --program $PROG
I skimmed through the first 25% or so of failing tests with:
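# Note: the spec test script requires python3
using JSON, Markdown

# Load every spec example; the fields used below are "markdown", "html", "example", "section"
tests = JSON.parse(read(`python3 test/spec_tests.py --dump-tests`, String))

# true if Markdown renders the example to exactly the expected HTML
function correct(test)
    try
        return Markdown.html(Markdown.parse(test["markdown"])) == test["html"]
    catch
        println("error in $(test["example"]): $(test["section"])")
        return false
    end
end

failing = filter(x -> !correct(x), tests)

# and to quickly look at results from a failing test:
# prints the input, the expected HTML, and the actual output
function check(n)
    println(repr(failing[n]["markdown"]))
    println(repr(failing[n]["html"]))
    println(repr(Markdown.html(Markdown.parse(failing[n]["markdown"]))))
end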
So far I've found the following: cc @one-more-minute
"--\n**\n__\n" becomes "<p>--\n**\n__</p>\n"
*** should parse to ***, not <em>*</em>
#
s through"#5 bolt\n"
becomes "<p>#5 bolt</p>\n"
)"# foo#\n"
) (not quite true, but have followed spec)
~ should also work as fenced code blocks!
Thanks for taking a look at that, that's a good list to have. What's quite nice is that (perhaps surprisingly) there are very few particularly major things missing.
The main exception is named links – I do have a way to implement them, but just haven't gotten round to it yet. I need to do a tiny bit of refactoring as well, I think.
It would be cool to have some kind of benchmark for performance as well, but I'm much less worried about Markdown.jl being crazy fast as long as it's not too slow to get the job done.
@one-more-minute a word of warning: I only got 25% of the way through the list, so this could double or triple! I will append anything else major that pops out. I agree it shouldn't be too bad, and it's great to have a thorough test/perf suite. There were a couple of things that raised errors; IIRC they were from named links.
Tab expansion may also be tricky; I'm not sure how to do that. The example I gave above didn't render as I expected (FIXED)! You seemingly need to count chars as you render (or as you parse??)... the game is "render tabs as spaces as if tab stops were length 4".
I don't yet get the subtleties of escaping HTML characters while still allowing some HTML...
I have fixed a couple of minor things in html rendering, will PR when I go through the entire list.
Ah, maybe the tab expansion should happen prior to main parsing, then it's a bit easier...
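A rough sketch of what such a pre-pass could look like, written in Python only because the spec test script is Python. `expand_tabs` is a hypothetical helper, not something from Markdown.jl or the CommonMark tooling. It tracks the column within each line and replaces every tab with enough spaces to reach the next multiple of 4:

```python
def expand_tabs(text: str, tab_stop: int = 4) -> str:
    """Replace each tab with spaces up to the next tab stop (hypothetical helper)."""
    out = []
    column = 0  # current column within the current line
    for ch in text:
        if ch == "\t":
            # Pad to the next multiple of tab_stop (always at least one space).
            pad = tab_stop - (column % tab_stop)
            out.append(" " * pad)
            column += pad
        elif ch == "\n":
            out.append(ch)
            column = 0  # a newline resets the column count
        else:
            out.append(ch)
            column += 1
    return "".join(out)


if __name__ == "__main__":
    # Illustrative input, not taken from the failing-test list above.
    print(repr(expand_tabs("\tfoo\tbaz\t\tbim\n")))
```

For inputs like this, Python's built-in `str.expandtabs(4)` gives the same result; the loop is spelled out only to make the column counting explicit, since that is the part that would need porting to Julia.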
I have checked off some of these (the easy ones), which are fixed in a local Julia branch.
See this article. It would be interesting to see how the Markdown.jl parser etc. compares (in both performance and behavior) to the C99 CommonMark reference implementation.