Open ampli opened 9 years ago
I'm torn on seeing such stuff. OTOH, this provides useful features, and good way to bootstrap learning of all that ASAN stuff. OTOH, it's complete kludge for somebody who don't need it and just needs clean makelfile. I'm certainly would be OK with this being Makefile.asan.
It could be indeed a good idea to provide it as Makefile.asan. I can send another pull request if desired.
However, I cannot imagine why someone who develops a program (and re1.5 is a program for development, not for the purpose of actually use as a command) would like to avoid compiling with ASan/UBan (for development purpose), now that it is commonly available in gcc and clang.
For example, I tried to run the MicroPython tests, after compiling it (in the unix directory) with ASan/UBan, and it indicated many apparent problems:
A problem like
../py/parse.c:169:46: runtime error: left shift of negative value -1
(arg<<1 when arg is negative, which result is undefined by the C standard.)
reported on 24 source locations. (You may want to cast the variable to size_t).
A problem like:
../py/objint.c:102:16: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
reported on 4 different locations.
A problem like:
../py/emitbc.c:289:9: runtime error: null pointer passed as argument 1, which is declared to never be null
Is reported in 9 locations (on 3 different functions). In these particular cases It usually cannot cause harm but it is still not a good idea to allow that.
There is also one buffer overflow report (gc.c:284), but it may be bogus.
I can open an issue on micropython with the summary of these errors, and the Makefile modifications in order to produce them.
However, I cannot imagine why someone who develops a program (and re1.5 is a program for development, not for the purpose of actually use as a command)
Well, any software is first of all intended for usage, not development ;-). And re1.5 is library, so it's "usage" is integration into other software. The latest "fashion" in that direction is not even using lib*, bu dropping code into your codebase (I don't like that, but then for uPy, we have a good case why it's useful). Entailing from that, there should be very clean and concise instructions how to integrate re1.5 into your codebase, and clean, concise Makefile is such instructions in "executable form". Seeing "llvm-symbolizer" (far less "/usr/bin/llvm-symbolizer", like you propose), about which nobody knows, doesn't help the cause.
would like to avoid compiling with ASan/UBan (for development purpose), now that it is commonly available in gcc and clang.
Cool, we now just need to wait some 20 years till asan/uban will be the same this as "ls" or "ld", and such questions won't even make sense. Until then - they're nothing but novelty-du-jour, about which "nobody" knows and which will die (will be killed by their own creators) before they have chance to become popular.
For example, I tried to run the MicroPython tests, after compiling it (in the unix directory) with ASan/UBan, and it indicated many apparent problems:
"Apparent" is a keyword here. If such a tool (and there're lot of them!) finds a problem, please verify whether it's true problem or false positive, and submit patch if the former. Keep tally of issues vs false positives, if issues are frequent, and false positives are rare, maybe it's indeed worth enabling it by default for everyone, instead of having someone run occasionally and do end-to-end analysis of each case.
I can open an issue on micropython with the summary of these errors, and the Makefile modifications in order to produce them.
Any contribution is appreciated, but per above, the most useful approach is that someone does end-to-end work on this stuff. Otherwise, such reports may just stay long in queue unattended.
It could be indeed a good idea to provide it as Makefile.asan. I can send another pull request if desired.
Per above, that would be preferred solution, at least for the time being.
Seeing "llvm-symbolizer" (far less "/usr/bin/llvm-symbolizer", like you propose), about which nobody knows, doesn't help the cause.
Usually it is already set up in the environment of the developer. However, I could not assume that , so in order I added it to the Makefile. It is far from ideal, of course, and it is indeed not appropriate for a main Makefile.
If such a tool (and there're lot of them!)
Just please note that this is not "a tool" among "a lot of them" other tools. It is an official integral feature of the GCC and CLANG compilers for some time now (2+ years). However, UBsan and Lsan are newer in GCC.
When GCC issues a warning on an undefined behavior, unless it is in dead code (a thing that may be rare enough now, I didn't encounter such a thing recently), it is not a good idea to let it remain in the code, because compiler optimizations assume you don't do undefined things, and optimize code according to this assumption (and e.g. remove code you didn't think it would remove).
Interesting reading: What Every C Programmer Should Know About Undefined Behavior GCC Undefined Behavior Sanitizer
ASan (address sanitizer) is a great feature of GCC that usually don't have false positives on -O2 code (on -O3 I encountered false positives). For example, it doesn't let you referring an array out of bounds, using indexing or even pointers. This way you can easily find bugs without tedious debugging sessions.
In the uPy tests, ASan found a repeated out-of-bound access in one particular place (gc). If uPy unwind the stack itself (not using longjmp), then this may be a false positive. Otherwise, most probably it indicates a problem. I'm not sure I will be able to enter deeply into that.
To sum up:
In any case, my propose Makefile has a test target that runs run-tests. Maybe it can be a good idea to add it to the project's Makefile (I can send a pull request).
To sum up:
- My propose Makefile can be installed as Makefile.asan (by modifying the existing pull request).
- I can send another pull request of Makefile.asan.
- We can forget it (I have no problem with that...).
In any case, my propose Makefile has a test target that runs run-tests. Maybe it can be a good idea to add it to the project's Makefile (I can send a pull request).
Is there anything to do more regarding that? If not, this can be closed.
Can we please choose option 2 above? Thanks.
In any case, my propose Makefile has a test target that runs run-tests. Maybe it can be a good idea to add it to the project's Makefile (I can send a pull request).
Yes, that's good idea, I'm missing that too.
ping
Other possible improvements to the Makefile (implemented in my devel branch). I can send a pull request (each in a different commit) if they look fine. In any case, I propose to include all of them in Makefile.asan (to be sent next).
(gdb) p NON_ANCHORED_PREFIX
$1 = 5
(gdb) info macro NON_ANCHORED_PREFIX
Defined at /usr/local/src/re1.5/makefile1/re1.5.h:135
included at /usr/local/src/re1.5/makefile1/main.c:5
#define NON_ANCHORED_PREFIX 5
#CFLAGS=-DDEBUG -ggdb3 -Wall -O0
I propose -O0 for debug sessions because -Os enables -O2 which may optimized-out variables and change the order of exaction, a thing that makes debugging using gdb inconvenient.The current Makefile uses TARG=re, which is not in use (besides the just sent patch for "make test"). Proposal: use $(TARG) instead of hard-coding "re".
ok
The current CFLAGS, uses -ggdb. I propose to change it to -ggdb3. This adds the ability to print and show macros:
Fairly speaking, I don't know why it's not just -g, and using that would be my preference (any other option may not work with a particular toolchain).
Explicitly specify the debug CFLAGS (commented out): #CFLAGS=-DDEBUG -ggdb3 -Wall -O0
I.e. you want add line like the above to Makefile? Sounds good.
Thanks!
This is my proposal for a new Makefile.
The added ASAN definitions are for finding memory related bugs (e.g. out of bounds, use after free(), memory leaks, etc.) and various undefined behavior. See the commit message for more info on what has been added.
For -fsanitize=undefined, a recent enough gcc is needed (found on all recent Linux distributions).