mity / md4c

C Markdown parser. Fast. SAX-like interface. Compliant to CommonMark specification.
MIT License

Are the sizes of these two arrays correct? #26

Closed paladin-t closed 6 years ago

paladin-t commented 6 years ago

I'm using VC++ 2015, and the compiler emits a number of warnings. Most of them seem OK to ignore, but I'm worried about the size of two arrays:

1210 static const CHAR open_str[9] = _T("<![CDATA[");

4287 static const CHAR indent_str[16] = _T("                ");

It says the arrays are not big enough to contain the terminating '\0'.

Is it on purpose?

Thanks!

mity commented 6 years ago

The C standard allows such initialization (the arrays are then not terminated with '\0').

In general, MD4C internally does not depend on the '\0' string terminator almost anywhere, because Markdown documents are allowed to contain '\0' characters. In the two cases you pointed out, the string length is determined by the SIZEOF_ARRAY() macro, not by strlen(), so it is IMO correct.

However, I might consider rewriting it to suppress the warning.
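To illustrate the point about arrays without a terminator, here is a minimal sketch. The definition of SIZEOF_ARRAY() is an assumption (the thread names the macro but does not show it), plain char is used instead of CHAR/_T(), and starts_with_cdata() is a hypothetical helper, not a function from MD4C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed definition; the thread only mentions the macro by name. */
#define SIZEOF_ARRAY(a) (sizeof(a) / sizeof((a)[0]))

/* Exactly 9 characters, so there is no room for a terminating '\0'.
 * This is valid C (it is not valid C++). */
static const char open_str[9] = "<![CDATA[";

/* Hypothetical helper: does the buffer start with the CDATA opener?
 * The buffer is described by pointer and size, so it may contain '\0'. */
static int starts_with_cdata(const char *text, size_t size)
{
    assert(text != NULL || size == 0);

    if (size < SIZEOF_ARRAY(open_str))
        return 0;

    /* memcmp() takes an explicit length. strlen(open_str) would read
     * past the end of the array because there is no terminator. */
    return memcmp(text, open_str, SIZEOF_ARRAY(open_str)) == 0;
}
```

Because every length comes from the array size or from the caller, nothing in this sketch ever looks for a '\0'.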

paladin-t commented 6 years ago

Thanks for the quick reply and clarification.

It's interesting that Markdown documents are allowed to contain '\0'.

mity commented 6 years ago

The CommonMark specification says:

Any sequence of characters is a valid CommonMark document.

(spec 0.28, section 2.1)

It also explicitly explains how to deal with null characters:

For security reasons, the Unicode character U+0000 must be replaced with the REPLACEMENT CHARACTER (U+FFFD).

(spec 0.28, section 2.3)
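The replacement rule quoted above could be sketched as follows for UTF-8 input. This is only an illustration: replace_nul() is a hypothetical helper, not part of MD4C's API, and it assumes the caller supplies an output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not MD4C's API. Copies `size` bytes from `in` to
 * `out`, replacing every 0x00 byte with U+FFFD encoded as UTF-8
 * (EF BF BD). Returns the number of bytes written, or (size_t)-1 if
 * `out_cap` is too small. */
static size_t replace_nul(const char *in, size_t size,
                          char *out, size_t out_cap)
{
    static const unsigned char replacement[3] = { 0xEF, 0xBF, 0xBD };
    size_t n = 0;
    size_t i;

    assert(in != NULL && out != NULL);

    for (i = 0; i < size; i++) {
        if (in[i] == '\0') {
            /* n never exceeds out_cap, so the subtraction cannot wrap. */
            if (out_cap - n < sizeof(replacement))
                return (size_t)-1;
            memcpy(out + n, replacement, sizeof(replacement));
            n += sizeof(replacement);
        } else {
            if (n == out_cap)
                return (size_t)-1;
            out[n++] = in[i];
        }
    }
    return n;
}
```

The input length is passed explicitly, which is what lets a '\0' in the middle of the document be seen and replaced rather than treated as the end of the text.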

mity commented 6 years ago

Strange, on my machine MSVC 2015 does not warn about those things. Maybe you are using a stricter compiler setting than the default one?

Nevertheless, I changed those two declarations, since the new form is safer if the code changes later: one no longer has to count the characters manually to make sure the size is okay. :-)
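The thread does not show the actual change. One common way to avoid counting characters by hand, sketched here purely as an assumption, is to let the compiler size the array and subtract the terminator when computing the length. The names cdata_open, cdata_open_len and is_cdata_open() are hypothetical, and the SIZEOF_ARRAY() definition is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed definition; the thread only mentions the macro by name. */
#define SIZEOF_ARRAY(a) (sizeof(a) / sizeof((a)[0]))

/* No explicit size: the compiler counts the characters and also
 * reserves one element for the terminating '\0'. */
static const char cdata_open[] = "<![CDATA[";

/* Length without the terminator, still computed at compile time. */
static const size_t cdata_open_len = SIZEOF_ARRAY(cdata_open) - 1;

/* Hypothetical helper: does the buffer start with the CDATA opener? */
static int is_cdata_open(const char *text, size_t size)
{
    assert(text != NULL || size == 0);

    if (size < cdata_open_len)
        return 0;
    return memcmp(text, cdata_open, cdata_open_len) == 0;
}
```

With this form, editing the string literal updates the length automatically, and the array is large enough for its terminator, so a warning about a missing '\0' should no longer apply.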

paladin-t commented 6 years ago

Yep, I was using warning level /W4. Thanks for addressing my concern.

Keep up the nice work!