MichaelChirico / r-bugs

A ⚠️read-only⚠️ mirror of https://bugs.r-project.org/

[BUGZILLA #16046] Unicode nuls are allowed in strings #5503

Closed MichaelChirico closed 4 years ago

MichaelChirico commented 4 years ago

For a long time, embedding a nul character in a string using \0 has thrown an error.

"\0" ## Error: embedded nul in string: '\0'

However, it is still possible to enter a nul character using Unicode syntax.

"abc\u0000def" ## [1] "abc"

R's behaviour should be consistent between the two specifications of nul. That is, attempting to create strings containing "\u0000" should throw an error.
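
The inconsistency can be checked from R code without typing the literals at the prompt. The snippet below is only a sketch: `parse_fails` is a hypothetical helper name, not part of R, and it assumes that a rejected literal surfaces as an ordinary error from `parse()`. On a version of R where the two escapes are treated alike, both entries should come back `TRUE`; on a version showing the behaviour reported here, the Unicode entry would come back `FALSE`.

```r
# Hypothetical helper: TRUE if parsing `src` as R source code signals an error.
parse_fails <- function(src) {
  tryCatch({
    parse(text = src)
    FALSE
  }, error = function(e) TRUE)
}

# Each element is R source text for a string literal containing an escaped nul.
# The doubled backslashes mean the parser sees \0 and \u0000, as if typed by hand.
literals <- c(octal = '"\\0"', unicode = '"abc\\u0000def"')

# Named logical vector: does each literal fail to parse?
vapply(literals, parse_fails, logical(1))
```

Going through `parse(text = ...)` keeps the script itself free of nul escapes, so it can be sourced on any version of R regardless of how that version treats them.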



MichaelChirico commented 4 years ago

I agree that these two cases should be handled the same way. They differ because both are currently handled by the string-building code, which takes different paths for byte-sized chars and wide chars; the detection should probably happen earlier.



MichaelChirico commented 4 years ago

Fixed in R-devel; will port to R-patched after 3.1.2 is released.

