seanmdick / cld2

Automatically exported from code.google.com/p/cld2
0 stars 0 forks source link

Eliminate redundancy and/or simplify default case for compiling unittest_data.h #22

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
internal/unittest_data.h seems to use a mixture of escape sequences and raw 
non-ASCII text. For maximum portability and safety, it would be best for the 
source code to use all ASCII characters and escape the non-ASCII characters. 
This should help compiler compatibility, though there are no reports of 
breakage since this Chromium.org issue back in 2009:

https://code.google.com/p/chromium/issues/detail?id=20033

The change should be simple enough, and a script can be written to perform the 
transformation.

Original issue reported on code.google.com by andrewha...@google.com on 26 Aug 2014 at 3:27

GoogleCodeExporter commented 9 years ago
unittest_data.h has two copies of all the data, one in UTF-8 encoding and one 
in standard C-string hex escape sequences.

The UTF-8 encoded data is often easier for humans to read. For compilers that 
do not support UTF-8 strings, compile with
  -Davoid_utf8_string_constants

Did I miss something here, or did you fail to read the comments at the front of 
unittest_data.h ? /dick

Original comment by dsi...@google.com on 23 Oct 2014 at 8:13

GoogleCodeExporter commented 9 years ago
I missed the comment at the top of the file. Thanks.

Devil's advocate question:
Given that -Davoid_utf8_string_constants should work on all compilers while the 
default (without the define) might break, and that the stated reason for the 
UTF-8 is for human readability, why not just turn all the UTF-8 strings into 
single-line comments and be done with it? Alternatively, why not invert the 
#ifdef logic? There doesn't seem to be an obvious reason to have both copies of 
the data at all.

OTOH, we could use the redundant copy to verify that indeed your compiler did 
compile the UTF-8 text correctly, i.e. that there is a perfect binary match 
between both copies of the string.

My vote would be to invert the ifdef so that the default case works everywhere.

I'll set this back to assigned for now, but change defect to enhancement since 
and priority to low since it is working as intended :)

Original comment by andrewha...@google.com on 24 Oct 2014 at 7:08

GoogleCodeExporter commented 9 years ago

Original comment by andrewha...@google.com on 24 Oct 2014 at 7:08

GoogleCodeExporter commented 9 years ago

Original comment by andrewha...@google.com on 24 Oct 2014 at 7:09