sg22-c-cpp-standard-compatibility / sg-compatibility

A joint Study Group between the C (WG14) and C++ (WG21) Committees to ensure the long-term synchronization and cooperation of the C and C++ programming languages where any mutual interests lie.

CWG #350: signed char underlying representation for objects #31

Open NinaRanns opened 10 months ago

NinaRanns commented 10 months ago

From a conversation with Jens Maurer: "comments from two decades ago point to inconsistencies in the C standard; it would be good to know whether those inconsistencies still exist, and if so, what WG14 wants to do about them."

From Issue #350, sent in by David Abrahams:

Yes, and to add to this tangent, 6.8.2 [basic.fundamental] paragraph 1 states "Plain char, signed char, and unsigned char are three distinct types." Strangely, 6.8 [basic.types] paragraph 2 talks about how "... the underlying bytes making up the object can be copied into an array of char or unsigned char. If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value." I guess there's no requirement that this copying work properly with signed chars!
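For illustration, the round trip that [basic.types] paragraph 2 describes can be sketched as below. This is only a minimal sketch: the helper name `round_trip_through_unsigned_char` is hypothetical, and the buffer is deliberately an array of unsigned char, one of the two element types the quoted wording names. Nothing here shows what would happen with a signed char buffer on an implementation where that matters.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copy the underlying bytes of an object into an array
// of unsigned char, copy them back into a fresh object, and return it.
template <typename T>
T round_trip_through_unsigned_char(const T& original) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "the byte-copy guarantee applies only to trivially copyable types");
    unsigned char bytes[sizeof(T)];            // buffer named by the quoted wording
    std::memcpy(bytes, &original, sizeof(T));  // object -> bytes
    T restored;
    std::memcpy(&restored, bytes, sizeof(T));  // bytes -> object
    return restored;
}

// The restored object is expected to hold the original value.
void check_round_trip() {
    assert(round_trip_through_unsigned_char(12345) == 12345);
    assert(round_trip_through_unsigned_char(-2.5) == -2.5);
}
```

Calling `check_round_trip()` from `main` exercises the guarantee for an `int` and a `double`.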

Notes from October 2002 meeting:

We should do whatever C99 does. 6.5p6 of the C99 standard says "array of character type", and "character type" includes signed char (6.2.5p15), and 6.5p7 says "character type". But see also 6.2.6.1p4, which mentions (only) an array of unsigned char.

Notes from October 2003 meeting:

It appears that in C99 signed char may have padding bits but no trap representation, whereas in C++ signed char has no padding bits but may have -0. A memcpy in C++ would have to copy the array preserving the actual representation and not just the value.
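The difference between preserving a value and preserving a representation is hard to show with signed char on current hardware, so the sketch below uses an analogy instead: on an IEEE 754 implementation (an assumption, checked with `is_iec559`), `0.0` and `-0.0` compare equal as values but have different object representations, and a byte-wise `memcpy` keeps them apart. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstring>
#include <limits>

// True if the two doubles have byte-for-byte identical object representations.
bool same_representation(double a, double b) {
    return std::memcmp(&a, &b, sizeof(double)) == 0;
}

// Copy a double through an unsigned char buffer, byte for byte.
double copy_bytes(double source) {
    unsigned char bytes[sizeof(double)];
    std::memcpy(bytes, &source, sizeof(double));
    double result;
    std::memcpy(&result, bytes, sizeof(double));
    return result;
}

void check_negative_zero_survives() {
    if (!std::numeric_limits<double>::is_iec559) {
        return;  // the analogy assumes IEEE 754 doubles
    }
    assert(0.0 == -0.0);                                    // equal values
    assert(!same_representation(0.0, -0.0));                // different bytes
    assert(std::signbit(copy_bytes(-0.0)));                 // sign bit survives the copy
    assert(same_representation(copy_bytes(-0.0), -0.0));    // exact representation kept
}
```

A copy that only had to preserve the value would be free to turn `-0.0` into `0.0`; a copy that preserves the representation is not.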

March 2004: The liaisons to the C committee have been asked to tell us whether this change would introduce any unnecessary incompatibilities with C.

Notes from October 2004 meeting:

The C99 Standard appears to be inconsistent in its requirements. For example, 6.2.6.1 paragraph 4 says:

"The value may be copied into an object of type unsigned char [n] (e.g., by memcpy); the resulting set of bytes is called the object representation of the value."

On the other hand, 6.5 paragraph 6 says,

"If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one."

Mike Miller will investigate further.

Notes from the March, 2010 meeting:

The CWG was not convinced that there was a need to change the existing specification at this time. Some were concerned that there might be implementation difficulties with giving signed char the requisite semantics; implementations for which that is true can currently make char equivalent to unsigned char and avoid those problems, but the suggested change would undermine that strategy.
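As a small sketch of the latitude mentioned here: whether plain char behaves like signed char or unsigned char is implementation-defined, yet it remains a distinct type either way. The function names below are hypothetical; the standard facilities used are `std::numeric_limits`, `std::is_same`, and `CHAR_MIN`.

```cpp
#include <cassert>
#include <climits>
#include <limits>
#include <type_traits>

// The three narrow character types are always distinct types,
// whichever signedness the implementation picks for plain char.
static_assert(!std::is_same<char, signed char>::value, "char is not signed char");
static_assert(!std::is_same<char, unsigned char>::value, "char is not unsigned char");
static_assert(!std::is_same<signed char, unsigned char>::value, "distinct types");

// Reports the implementation's choice for plain char.
constexpr bool plain_char_is_signed() {
    return std::numeric_limits<char>::is_signed;
}

void check_plain_char() {
    // CHAR_MIN is negative exactly when plain char is signed.
    assert(plain_char_is_signed() == (CHAR_MIN < 0));
    assert(sizeof(char) == 1 && sizeof(signed char) == 1 && sizeof(unsigned char) == 1);
}
```

An implementation that finds the signed char semantics difficult can report `plain_char_is_signed() == false`, which is the strategy the CWG note refers to.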

Additional note, November, 2014:

There is now the term “narrow character type” that should be used instead of “byte-character type”.