Open nsajko opened 4 years ago
0c9c620f39e56c42802504003fd05664aba670a4 fixed some aliasing issues, but left a "quick-fix" on https://github.com/9fans/plan9port/blob/master/src/libmemdraw/draw.c#L2376
Another class of issues is creating out-of-bounds pointers, using them, or using out-of-bounds array indices; e.g., in various implementations of getcallerpc. Yes, you heard right: just creating an out-of-bounds pointer is undefined behavior.
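To make the pointer rule concrete, here is a minimal sketch (not code from p9p): C permits forming a pointer one past the end of an array, but forming a pointer any further out is already undefined, even if it is never dereferenced.

```c
#include <assert.h>

/* Summing a 3-element array: the one-past-the-end pointer is the only
 * out-of-range pointer C allows you to form (but not dereference). */
int sum3(const int *a)
{
	const int *end = a + 3;	/* one past the end: well defined */
	int s = 0;

	while (a != end)
		s += *a++;
	/* const int *bad = end + 1;  -- merely forming this pointer is UB */
	return s;
}
```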
Related: I remember there was some code that accessed struct fields as if they were elements of an array. This is incorrect on multiple fronts: it is an out-of-bounds access, and it assumes a particular (zero) padding between structure fields.
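A sketch of that anti-pattern and a conforming alternative (the `struct rgb` type here is hypothetical, chosen only for illustration):

```c
#include <assert.h>

struct rgb { unsigned char r, g, b; };

/* Indexing past one field into the next -- (&c->r)[i] -- is an
 * out-of-bounds access for i > 0 and assumes the fields are laid out
 * with no padding. Naming each field is the conforming way. */
unsigned char channel(const struct rgb *c, int i)
{
	switch (i) {
	case 0:  return c->r;
	case 1:  return c->g;
	default: return c->b;
	}
}
```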
> `-fno-strict-aliasing` should for some time be passed by 9c to GCC and Clang to disable strict enforcing of standard C aliasing rules.
It may be worth noting that a lot of the code in question was authored by the people who created C, and that the standard goes off the rails on occasion. The authors of the Linux kernel have no plans to eliminate the kernel's use of `-fno-strict-aliasing`, and it uses `-fno-strict-overflow` liberally. It's a pain, and the aliasing rules are kind of absurd.
https://github.com/9fans/plan9port/blob/master/src/lib9/_p9dialparse.c#L39
This looks like the code is being used exactly as intended for BSD sockets. A `sockaddr_storage` pointer is designed to be dereferenced after casting to either a `sockaddr_in` or a `sockaddr_in6`. This does not violate any aliasing rules; either it's fine, or you have a bug in your libc.
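For readers unfamiliar with the pattern under discussion, here is a minimal sketch of it. Whether this is formally a strict-aliasing violation is exactly the disagreement in this thread; in practice libcs are expected to make it work.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Read the port out of a sockaddr_storage by casting to the
 * family-specific type, the standard BSD-sockets idiom. */
in_port_t stored_port(const struct sockaddr_storage *ss)
{
	if (ss->ss_family == AF_INET)
		return ((const struct sockaddr_in *)ss)->sin_port;
	return ((const struct sockaddr_in6 *)ss)->sin6_port;
}
```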
> most programmers do not know that signed integers do not wrap around on overflow; rather, signed integer overflow is a (programmer) error.
I don't think many ones'-complement machines exist any more. gcc doesn't support ones' complement, though I am not sure whether clang does.
> Formally it is a class of "undefined behavior", meaning that the compiler (by default) assumes signed integer variables WILL NEVER overflow,
The compiler is free to do so, per the standard. A sane compiler targeting a two's-complement machine should assume two's-complement representation. In the case of gcc, only two's complement is supported, though for some reason you have to pass a flag to make the compiler assume it.
> and in a situation where a misguided programmer might expect his program to overflow a certain variable, literally anything can happen instead.
It is unfortunate that this "literally anything can happen" line gets repeated so often. UB is specified in the standard to allow compiler writers to make assumptions, and to warn programmers not to make certain assumptions, as they may not hold. In this case, you can tell the compiler to change its assumptions. "Literally anything can happen" is a bogeyman: UB means that the compiler may assume that either `P` or `!P` holds, or it may decline to make an assumption. If it does anything besides that, you should probably uninstall the compiler.
It's probably a better idea to use `-fwrapv` (which instructs the compiler to assume wrapping) than `-fno-strict-overflow` (which instructs the compiler to make no assumptions about wrapping). One of the two should probably be used until someone builds a machine that isn't two's complement, gcc starts targeting that architecture, an OS that runs P9P gets ported to it, and someone wants to run P9P on it.
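The practical difference matters for idioms like `if (x + 1 < x)`: without `-fwrapv` the compiler may delete such a check, since it assumes signed overflow cannot happen. A sketch of a check that stays correct under either flag, because it never performs the overflowing operation:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Report whether a + b would overflow int, without relying on wrapping
 * behavior (so it is valid with or without -fwrapv). */
bool add_would_overflow(int a, int b)
{
	if (b > 0)
		return a > INT_MAX - b;
	return a < INT_MIN - b;
}
```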
> The flag makes `char` be signed.

The code assumes a signed `char`; the flag allows that assumption.
@pete If you want to discuss C, the standards, or the compilers, I would be glad to do that, but over another channel (preferably email). The reason is that those discussions could get long, yet are not quite pertinent to this issue (unless the p9p maintainers think otherwise).
Some things I will say in response to your apparent urging to disregard the standard and use compiler flags for selecting nonstandard dialects:
Right now the compiler flags are not even used by 9c. EDIT: I meant `-fno-strict-aliasing`.
Being more compliant with the standards (actually, with GCC and Clang) as opposed to using some ad-hoc dialect of C means being friendlier to potential contributors AND it enables the compilers to generate more efficient code (in the cases of strict aliasing and strict signed overflow).
Keep in mind that the compilers' authors are in the best position to reason about the semantics of C and about the way some C code needs to be written to get certain desired machine code. They know what is more and what is less hard to optimize.
As an example, signed overflow being an error enables certain important optimizations. And writing code with that in mind is easy.
I think the biggest problem with C today is that (probably because of its convoluted history) it is being taught wrong, leading to it being used wrong (and then the user gets pissed at the standards committees and compilers' writers).
Consider a type `T` which is either less wide than `int`, or signed, or both less wide than `int` and signed. Usually it is `unsigned char`.

There is a class of bugs where a `T`-typed value is left-shifted. The cause of the bug is implicit promotion to `int` (if `T` is less wide than `int`) together with the fact that signed integer overflow is an error. In other words, the compiler may assume that the shift expression will never overflow, which was not the programmer's intention.
I think that most of those bugs will be fixed by https://github.com/9fans/plan9port/pull/319 , but some will surely remain; so here is a list of some lines of code with suspicious left shifts:
```
src/libdraw/getsubfont.c: v = ((src[x/pack] << ((x%pack)*f->bits->depth)) >> (8 - f->bits->depth)) & mask;
src/libmp/port/mpvecdigmuladd.c: x = p1<<(Dbits/2);
src/libmp/port/mpvecdigmuladd.c: x = p2<<(Dbits/2);
src/libmemdraw/fillpoly.c: onehalf = 1 << (fixshift-1);
src/libmemdraw/fillpoly.c: onehalf = 1 << (fixshift-1);
src/cmd/venti/srv/bloom.c: z = 1<<(x&31);
src/cmd/venti/srv/bloom.c: if(!(tab[(x&b->bitmask)>>5] & (1<<(x&31))))
src/cmd/bzip2/lib/blocksort.c:#define SET_BH(zz) bhtab[(zz) >> 5] |= (1 << ((zz) & 31))
src/cmd/bzip2/lib/blocksort.c:#define CLEAR_BH(zz) bhtab[(zz) >> 5] &= ~(1 << ((zz) & 31))
src/cmd/bzip2/lib/blocksort.c:#define ISSET_BH(zz) (bhtab[(zz) >> 5] & (1 << ((zz) & 31)))
src/cmd/bzip2/lib/compress.c: s->bsBuff |= (v << (32 - s->bsLive - n));
src/cmd/upas/bayes/dfa.c: u[(r%4096)/32] |= 1<<(r%32);
src/libmach/dwarfget.c: if(v&(1<<(nb-1)))
src/libauthsrv/passtokey.c: key[i] = (t[i] >> i) + (t[i+1] << (8 - (i+1)));
src/libauthsrv/opasstokey.c: key[n] = (t[n] >> n) + (t[n+1] << (8 - (n+1)));
```
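The shifts above share the promotion hazard just described. A minimal illustration of the bug pattern and its fix (the function name here is made up for the example):

```c
#include <assert.h>
#include <stdint.h>

/* Place a byte in the top 8 bits of a 32-bit word.
 * BAD:  return b << 24;  -- b promotes to (signed) int, and shifting a
 * set high bit into bit 31 overflows int: undefined behavior.
 * FIX:  cast to an unsigned type first, so the shift is well defined. */
uint32_t pack_top_byte(unsigned char b)
{
	return (uint32_t)b << 24;
}
```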
Plan9port has a whole lot of bugs that stem from the convoluted history of C before and during standardization. Those should all be fixed (I hope nobody disagrees), which will take time.
GCC and Clang provide a few compiler flags for choosing nonstandard variations/dialects of C, so in the meantime we can apply the workaround of adding the needed compiler flags.
`-fno-strict-aliasing` should for some time be passed by 9c to GCC and Clang to disable strict enforcing of standard C aliasing rules. Maybe most programmers do not know about the "strict aliasing" rules mandated by the C standard and the compilers. Quoting the GCC manual (GCC is more lax than the standard):

So filling variables of one type using pointers to another type, a quite common pattern, is usually incorrect. See the following links for elaboration:
https://blog.regehr.org/archives/959
https://blog.regehr.org/archives/1307
https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
https://en.cppreference.com/w/c/language/object
https://www.cl.cam.ac.uk/~pes20/cerberus/pldi16.pdf
https://stackoverflow.com/questions/42178179/will-casting-around-sockaddr-storage-and-sockaddr-in-break-strict-aliasing
I think this is an example from src/lib9/_p9dialparse.c https://github.com/9fans/plan9port/blob/master/src/lib9/_p9dialparse.c#L39
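A minimal sketch of the kind of type punning strict aliasing forbids, and the `memcpy` form that is always allowed (illustrative only, not code from p9p):

```c
#include <stdint.h>
#include <string.h>

/* Extract the bit pattern of a float.
 * WRONG: return *(uint32_t *)&f;  -- reads a float object through an
 * incompatible lvalue type, a strict-aliasing violation.
 * memcpy expresses the same intent legally, and modern compilers
 * compile it to the same single move instruction. */
uint32_t float_bits(float f)
{
	uint32_t u;
	memcpy(&u, &f, sizeof u);
	return u;
}
```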
Another compiler option that should probably be turned on by 9c for some time is `-fno-strict-overflow`. This is similar to the "strict aliasing" situation in that most programmers do not know that signed integers do not wrap around on overflow; rather, signed integer overflow is a (programmer) error. Formally it is a class of "undefined behavior", meaning that the compiler (by default) assumes signed integer variables WILL NEVER overflow, and in a situation where a misguided programmer might expect his program to overflow a certain variable, literally anything can happen instead.

This seems to be an example: the hash function in src/cmd/mk/symtab.c, https://github.com/9fans/plan9port/blob/master/src/cmd/mk/symtab.c#L29 . The fix is simple: use an unsigned type for wraparound overflowing.
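A sketch of the suggested fix: do the hash arithmetic in an unsigned type, where wraparound is well defined. The actual hash in src/cmd/mk/symtab.c may differ; this is only illustrative.

```c
/* A multiply-accumulate string hash using unsigned arithmetic.
 * Unsigned overflow wraps modulo 2^N by definition, so no flag is
 * needed and the compiler may not assume the loop never overflows. */
unsigned symhash(const char *name)
{
	unsigned h = 0;
	const char *p;

	for (p = name; *p; p++)
		h = h * 31 + (unsigned char)*p;
	return h;
}
```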
EDIT: actually, I think we do not need `-fno-strict-overflow`, as undefined overflow situations are easy to catch at runtime using UBSan.

The following compiler flag is currently passed by 9c to GCC and Clang: `-fsigned-char`. In standard C it is implementation-defined whether the `char` type is signed or unsigned, so the programmer must use either `unsigned char` or `signed char` if the signedness matters. The flag makes `char` be signed. I am mentioning the flag here just so people would know (or would not forget) about the places in the code that depend on it, and thus need work. Or, alternatively, if there is no such code left, the flag could maybe be removed?
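A sketch of the kind of code that silently depends on `char` signedness, and a portable rewrite (the function is hypothetical, for illustration only):

```c
/* Is the byte's high bit set? With -fsigned-char, a test like
 * (c >= 0x80) on a plain char is always false, because (char)0xFF
 * compares as -1. Converting through unsigned char is correct
 * regardless of which signedness the implementation picked. */
int is_high_byte(char c)
{
	return (unsigned char)c >= 0x80;
}
```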