Closed lnicola closed 7 years ago
Please submit a pull request if you think you may improve things. I would not want buffer security checks myself.
Note that as of Visual Studio 2017, the correct approach would be to provide a CMake file.
I haven't checked the generated code, but I'll send a pull request if /GS-
was intended.
It's nice that VS 2017 has CMake support, although I haven't checked it.
Why wouldn't you have buffer security check? It's a good way to know, whether your stack is corrupted, and if so, it will shut down the program.
@PeterBocan It doesn't look like simdcomp
uses too may stack buffers. As such, it's probably the caller's job to check the stack cookies.
I submitted this because I was looking into integrating simdcomp
with another build system and I wanted to replicate the Makefile
settings as a starting point. /GS
caught my eye, as the rest of the compiler options seem geared towards aggressive optimizations.
Ah, I see. It doesn't have to be a only stack buffer overflow, but any stack frame manipulation.
The purpose of /GS
is pushing a magic value onto stack and once you try to return from that function, that given value is checked, and if the mismatch occurs, then it is shut down by the processor.
It's a (another) good layer of security measure against executing malicious code on the stack. I think it's worth to have enabled it.
Sure, but look at the code: https://github.com/lemire/simdcomp/blob/master/src/simdfor.c
I don't think there's much potential for stack overflows there. For comparison, here are the things /GS
applies to: https://msdn.microsoft.com/en-us/library/8dbf701c.aspx#Anchor_3
@lnicola probably a more convenient way were to introduce a config flag, like passing DISABLE_SECURITY_FLAGS=1 to makefile. Then you'd be easy to integrate the build elsewhere. Passing security flags is a common good practice, especially on Windows, you know. Indeed, some security cflags are even missing, that are available in newer VS versions.
@lemire maybe a step further could be to even move to cmake completely? Probably no big diff for autotools, as there are no deps to check. But we could fe automate dso name creation, etc. Otherwise, vs2017 is fine as editor by simply moving the folder into the project pane. Otherwise, as a vim user, i'd probably not care much :)
Thanks.
Okay, so I checked. As I suspected, /GS
doesn't matter much here:
File | /GS size |
/GS- size |
Difference |
---|---|---|---|
avxbitpacking.o | 3 758 | 3 754 | -4 |
simdbitpacking.o | 669 183 | 669 175 | -8 |
simdcomputil.o | 15 685 | 15 679 | -6 |
simdfor.o | 618 379 | 618 371 | -8 |
simdintegratedbitpacking.o | 1 511 188 | 1 511 184 | -4 |
simdpackedsearch.o | 3 764 | 3 760 | -4 |
simdpackedselect.o | 3 764 | 3 760 | -4 |
4 bytes of difference can be explained by the -GS<NUL>
string showing up in the object files. I don't know where the other 2-4 bytes come from, but I think this is a moot point.
@weltling
maybe a step further could be to even move to cmake completely?
Yes. But given that we already have something that works well, it is not urgent.
@lnicola
Any impact on the performance?
@lemire I haven't tried, but I suspect it makes no difference to the generated code.
@lnicola So you recommend closing this issue, then?
I am for closing that. The performance hit is negligible. As I said, it's only a few assembly instructions.
Please disregard my comment about the object size, it's not valid since I was using /GL
. I'll post as soon as I'm able to build the benchmark.
To be honest, the run to run variation seemed far too large for these numbers to mean anything. In any case, I'm fine with closing this.
/GS
benchmarking search
bit width = 0, fast search function time = 21, naive time = 39 , fast with length time = 7
bit width = 1, fast search function time = 67, naive time = 127 , fast with length time = 73
bit width = 2, fast search function time = 70, naive time = 85 , fast with length time = 106
bit width = 3, fast search function time = 54, naive time = 174 , fast with length time = 74
bit width = 4, fast search function time = 88, naive time = 160 , fast with length time = 74
bit width = 5, fast search function time = 74, naive time = 207 , fast with length time = 62
bit width = 6, fast search function time = 57, naive time = 101 , fast with length time = 62
bit width = 7, fast search function time = 100, naive time = 100 , fast with length time = 87
bit width = 8, fast search function time = 57, naive time = 95 , fast with length time = 82
bit width = 9, fast search function time = 80, naive time = 121 , fast with length time = 61
bit width = 10, fast search function time = 79, naive time = 140 , fast with length time = 81
bit width = 11, fast search function time = 95, naive time = 119 , fast with length time = 81
bit width = 12, fast search function time = 55, naive time = 123 , fast with length time = 61
bit width = 13, fast search function time = 55, naive time = 124 , fast with length time = 80
bit width = 14, fast search function time = 97, naive time = 134 , fast with length time = 60
bit width = 15, fast search function time = 75, naive time = 119 , fast with length time = 78
bit width = 16, fast search function time = 88, naive time = 113 , fast with length time = 59
bit width = 17, fast search function time = 121, naive time = 141 , fast with length time = 81
bit width = 18, fast search function time = 56, naive time = 102 , fast with length time = 101
bit width = 19, fast search function time = 99, naive time = 124 , fast with length time = 78
bit width = 20, fast search function time = 118, naive time = 143 , fast with length time = 105
bit width = 21, fast search function time = 92, naive time = 103 , fast with length time = 93
bit width = 22, fast search function time = 82, naive time = 103 , fast with length time = 99
bit width = 23, fast search function time = 58, naive time = 106 , fast with length time = 63
bit width = 24, fast search function time = 78, naive time = 173 , fast with length time = 85
bit width = 25, fast search function time = 75, naive time = 127 , fast with length time = 61
bit width = 26, fast search function time = 75, naive time = 114 , fast with length time = 64
bit width = 27, fast search function time = 58, naive time = 127 , fast with length time = 101
bit width = 28, fast search function time = 58, naive time = 123 , fast with length time = 94
bit width = 29, fast search function time = 80, naive time = 127 , fast with length time = 116
bit width = 30, fast search function time = 81, naive time = 175 , fast with length time = 92
bit width = 31, fast search function time = 58, naive time = 124 , fast with length time = 86
bit width = 32, fast search function time = 73, naive time = 108 , fast with length time = 50
benchmarking select
bit width = 0, fast select function time = 32, naive time = 15
bit width = 1, fast select function time = 62, naive time = 44
bit width = 2, fast select function time = 83, naive time = 63
bit width = 3, fast select function time = 38, naive time = 43
bit width = 4, fast select function time = 76, naive time = 59
bit width = 5, fast select function time = 63, naive time = 62
bit width = 6, fast select function time = 92, naive time = 61
bit width = 7, fast select function time = 62, naive time = 62
bit width = 8, fast select function time = 56, naive time = 58
bit width = 9, fast select function time = 59, naive time = 79
bit width = 10, fast select function time = 57, naive time = 79
bit width = 11, fast select function time = 97, naive time = 79
bit width = 12, fast select function time = 72, naive time = 45
bit width = 13, fast select function time = 32, naive time = 49
bit width = 14, fast select function time = 94, naive time = 49
bit width = 15, fast select function time = 51, naive time = 50
bit width = 16, fast select function time = 71, naive time = 57
bit width = 17, fast select function time = 56, naive time = 49
bit width = 18, fast select function time = 60, naive time = 51
bit width = 19, fast select function time = 60, naive time = 67
bit width = 20, fast select function time = 78, naive time = 81
bit width = 21, fast select function time = 80, naive time = 82
bit width = 22, fast select function time = 79, naive time = 82
bit width = 23, fast select function time = 78, naive time = 84
bit width = 24, fast select function time = 76, naive time = 63
bit width = 25, fast select function time = 79, naive time = 68
bit width = 26, fast select function time = 80, naive time = 67
bit width = 27, fast select function time = 81, naive time = 85
bit width = 28, fast select function time = 118, naive time = 71
bit width = 29, fast select function time = 46, naive time = 54
bit width = 30, fast select function time = 64, naive time = 52
bit width = 31, fast select function time = 58, naive time = 53
bit width = 32, fast select function time = 3, naive time = 60
/GS-
benchmarking search
bit width = 0, fast search function time = 5, naive time = 46 , fast with length time = 6
bit width = 1, fast search function time = 80, naive time = 95 , fast with length time = 111
bit width = 2, fast search function time = 54, naive time = 149 , fast with length time = 141
bit width = 3, fast search function time = 81, naive time = 164 , fast with length time = 75
bit width = 4, fast search function time = 83, naive time = 107 , fast with length time = 113
bit width = 5, fast search function time = 94, naive time = 174 , fast with length time = 96
bit width = 6, fast search function time = 71, naive time = 131 , fast with length time = 77
bit width = 7, fast search function time = 99, naive time = 106 , fast with length time = 81
bit width = 8, fast search function time = 73, naive time = 143 , fast with length time = 62
bit width = 9, fast search function time = 81, naive time = 140 , fast with length time = 93
bit width = 10, fast search function time = 71, naive time = 139 , fast with length time = 100
bit width = 11, fast search function time = 109, naive time = 189 , fast with length time = 115
bit width = 12, fast search function time = 74, naive time = 138 , fast with length time = 75
bit width = 13, fast search function time = 54, naive time = 137 , fast with length time = 94
bit width = 14, fast search function time = 56, naive time = 141 , fast with length time = 99
bit width = 15, fast search function time = 81, naive time = 143 , fast with length time = 103
bit width = 16, fast search function time = 76, naive time = 117 , fast with length time = 61
bit width = 17, fast search function time = 76, naive time = 107 , fast with length time = 62
bit width = 18, fast search function time = 58, naive time = 106 , fast with length time = 62
bit width = 19, fast search function time = 58, naive time = 105 , fast with length time = 91
bit width = 20, fast search function time = 57, naive time = 124 , fast with length time = 61
bit width = 21, fast search function time = 57, naive time = 121 , fast with length time = 62
bit width = 22, fast search function time = 59, naive time = 120 , fast with length time = 60
bit width = 23, fast search function time = 58, naive time = 111 , fast with length time = 61
bit width = 24, fast search function time = 58, naive time = 105 , fast with length time = 62
bit width = 25, fast search function time = 57, naive time = 107 , fast with length time = 62
bit width = 26, fast search function time = 58, naive time = 112 , fast with length time = 61
bit width = 27, fast search function time = 58, naive time = 143 , fast with length time = 61
bit width = 28, fast search function time = 57, naive time = 110 , fast with length time = 60
bit width = 29, fast search function time = 59, naive time = 113 , fast with length time = 62
bit width = 30, fast search function time = 58, naive time = 113 , fast with length time = 60
bit width = 31, fast search function time = 58, naive time = 114 , fast with length time = 61
bit width = 32, fast search function time = 47, naive time = 92 , fast with length time = 53
benchmarking select
bit width = 0, fast select function time = 5, naive time = 16
bit width = 1, fast select function time = 41, naive time = 44
bit width = 2, fast select function time = 63, naive time = 44
bit width = 3, fast select function time = 78, naive time = 50
bit width = 4, fast select function time = 85, naive time = 46
bit width = 5, fast select function time = 59, naive time = 46
bit width = 6, fast select function time = 89, naive time = 46
bit width = 7, fast select function time = 38, naive time = 46
bit width = 8, fast select function time = 70, naive time = 43
bit width = 9, fast select function time = 57, naive time = 47
bit width = 10, fast select function time = 58, naive time = 47
bit width = 11, fast select function time = 39, naive time = 48
bit width = 12, fast select function time = 81, naive time = 46
bit width = 13, fast select function time = 78, naive time = 47
bit width = 14, fast select function time = 81, naive time = 48
bit width = 15, fast select function time = 54, naive time = 49
bit width = 16, fast select function time = 52, naive time = 42
bit width = 17, fast select function time = 35, naive time = 49
bit width = 18, fast select function time = 79, naive time = 49
bit width = 19, fast select function time = 84, naive time = 50
bit width = 20, fast select function time = 40, naive time = 49
bit width = 21, fast select function time = 87, naive time = 51
bit width = 22, fast select function time = 90, naive time = 51
bit width = 23, fast select function time = 89, naive time = 52
bit width = 24, fast select function time = 57, naive time = 48
bit width = 25, fast select function time = 69, naive time = 52
bit width = 26, fast select function time = 64, naive time = 53
bit width = 27, fast select function time = 64, naive time = 53
bit width = 28, fast select function time = 61, naive time = 91
bit width = 29, fast select function time = 94, naive time = 53
bit width = 30, fast select function time = 42, naive time = 76
bit width = 31, fast select function time = 42, naive time = 54
bit width = 32, fast select function time = 4, naive time = 37
Closing per request of the OP.
The Windows
makefile
enables/GS
(buffer security checks) for release builds. I won't argue in favor of security over performance or the other way around, but is this intentional or should it be/GS-
?