lemire / simdcomp

A simple C library for compressing lists of integers using binary packing
BSD 3-Clause "New" or "Revised" License
488 stars 53 forks source link

/GS for release Windows builds #18

Closed lnicola closed 7 years ago

lnicola commented 7 years ago

The Windows makefile enables /GS (buffer security checks) for release builds. I won't argue in favor of security over performance or the other way around, but is this intentional or should it be /GS-?

lemire commented 7 years ago

Please submit a pull request if you think you may improve things. I would not want buffer security checks myself.

Note that as of Visual Studio 2017, the correct approach would be to provide a CMake file.

lnicola commented 7 years ago

I haven't checked the generated code, but I'll send a pull request if /GS- was intended.

It's nice that VS 2017 has CMake support, although I haven't checked it.

PeterBocan commented 7 years ago

Why wouldn't you have buffer security check? It's a good way to know, whether your stack is corrupted, and if so, it will shut down the program.

lnicola commented 7 years ago

@PeterBocan It doesn't look like simdcomp uses too may stack buffers. As such, it's probably the caller's job to check the stack cookies.

I submitted this because I was looking into integrating simdcomp with another build system and I wanted to replicate the Makefile settings as a starting point. /GS caught my eye, as the rest of the compiler options seem geared towards aggressive optimizations.

PeterBocan commented 7 years ago

Ah, I see. It doesn't have to be a only stack buffer overflow, but any stack frame manipulation.

The purpose of /GS is pushing a magic value onto stack and once you try to return from that function, that given value is checked, and if the mismatch occurs, then it is shut down by the processor.

It's a (another) good layer of security measure against executing malicious code on the stack. I think it's worth to have enabled it.

lnicola commented 7 years ago

Sure, but look at the code: https://github.com/lemire/simdcomp/blob/master/src/simdfor.c

I don't think there's much potential for stack overflows there. For comparison, here are the things /GS applies to: https://msdn.microsoft.com/en-us/library/8dbf701c.aspx#Anchor_3

weltling commented 7 years ago

@lnicola probably a more convenient way were to introduce a config flag, like passing DISABLE_SECURITY_FLAGS=1 to makefile. Then you'd be easy to integrate the build elsewhere. Passing security flags is a common good practice, especially on Windows, you know. Indeed, some security cflags are even missing, that are available in newer VS versions.

@lemire maybe a step further could be to even move to cmake completely? Probably no big diff for autotools, as there are no deps to check. But we could fe automate dso name creation, etc. Otherwise, vs2017 is fine as editor by simply moving the folder into the project pane. Otherwise, as a vim user, i'd probably not care much :)

Thanks.

lnicola commented 7 years ago

Okay, so I checked. As I suspected, /GS doesn't matter much here:

File /GS size /GS- size Difference
avxbitpacking.o 3 758 3 754 -4
simdbitpacking.o 669 183 669 175 -8
simdcomputil.o 15 685 15 679 -6
simdfor.o 618 379 618 371 -8
simdintegratedbitpacking.o 1 511 188 1 511 184 -4
simdpackedsearch.o 3 764 3 760 -4
simdpackedselect.o 3 764 3 760 -4

4 bytes of difference can be explained by the -GS<NUL> string showing up in the object files. I don't know where the other 2-4 bytes come from, but I think this is a moot point.

lemire commented 7 years ago

@weltling

maybe a step further could be to even move to cmake completely?

Yes. But given that we already have something that works well, it is not urgent.

lemire commented 7 years ago

@lnicola

Any impact on the performance?

lnicola commented 7 years ago

@lemire I haven't tried, but I suspect it makes no difference to the generated code.

lemire commented 7 years ago

@lnicola So you recommend closing this issue, then?

PeterBocan commented 7 years ago

I am for closing that. The performance hit is negligible. As I said, it's only a few assembly instructions.

lnicola commented 7 years ago

Please disregard my comment about the object size, it's not valid since I was using /GL. I'll post as soon as I'm able to build the benchmark.

lnicola commented 7 years ago

To be honest, the run to run variation seemed far too large for these numbers to mean anything. In any case, I'm fine with closing this.

/GS
benchmarking search
bit width = 0, fast search function time = 21, naive time = 39 , fast with length time = 7
bit width = 1, fast search function time = 67, naive time = 127 , fast with length time = 73
bit width = 2, fast search function time = 70, naive time = 85 , fast with length time = 106
bit width = 3, fast search function time = 54, naive time = 174 , fast with length time = 74
bit width = 4, fast search function time = 88, naive time = 160 , fast with length time = 74
bit width = 5, fast search function time = 74, naive time = 207 , fast with length time = 62
bit width = 6, fast search function time = 57, naive time = 101 , fast with length time = 62
bit width = 7, fast search function time = 100, naive time = 100 , fast with length time = 87
bit width = 8, fast search function time = 57, naive time = 95 , fast with length time = 82
bit width = 9, fast search function time = 80, naive time = 121 , fast with length time = 61
bit width = 10, fast search function time = 79, naive time = 140 , fast with length time = 81
bit width = 11, fast search function time = 95, naive time = 119 , fast with length time = 81
bit width = 12, fast search function time = 55, naive time = 123 , fast with length time = 61
bit width = 13, fast search function time = 55, naive time = 124 , fast with length time = 80
bit width = 14, fast search function time = 97, naive time = 134 , fast with length time = 60
bit width = 15, fast search function time = 75, naive time = 119 , fast with length time = 78
bit width = 16, fast search function time = 88, naive time = 113 , fast with length time = 59
bit width = 17, fast search function time = 121, naive time = 141 , fast with length time = 81
bit width = 18, fast search function time = 56, naive time = 102 , fast with length time = 101
bit width = 19, fast search function time = 99, naive time = 124 , fast with length time = 78
bit width = 20, fast search function time = 118, naive time = 143 , fast with length time = 105
bit width = 21, fast search function time = 92, naive time = 103 , fast with length time = 93
bit width = 22, fast search function time = 82, naive time = 103 , fast with length time = 99
bit width = 23, fast search function time = 58, naive time = 106 , fast with length time = 63
bit width = 24, fast search function time = 78, naive time = 173 , fast with length time = 85
bit width = 25, fast search function time = 75, naive time = 127 , fast with length time = 61
bit width = 26, fast search function time = 75, naive time = 114 , fast with length time = 64
bit width = 27, fast search function time = 58, naive time = 127 , fast with length time = 101
bit width = 28, fast search function time = 58, naive time = 123 , fast with length time = 94
bit width = 29, fast search function time = 80, naive time = 127 , fast with length time = 116
bit width = 30, fast search function time = 81, naive time = 175 , fast with length time = 92
bit width = 31, fast search function time = 58, naive time = 124 , fast with length time = 86
bit width = 32, fast search function time = 73, naive time = 108 , fast with length time = 50
benchmarking select
bit width = 0, fast select function time = 32, naive time = 15
bit width = 1, fast select function time = 62, naive time = 44
bit width = 2, fast select function time = 83, naive time = 63
bit width = 3, fast select function time = 38, naive time = 43
bit width = 4, fast select function time = 76, naive time = 59
bit width = 5, fast select function time = 63, naive time = 62
bit width = 6, fast select function time = 92, naive time = 61
bit width = 7, fast select function time = 62, naive time = 62
bit width = 8, fast select function time = 56, naive time = 58
bit width = 9, fast select function time = 59, naive time = 79
bit width = 10, fast select function time = 57, naive time = 79
bit width = 11, fast select function time = 97, naive time = 79
bit width = 12, fast select function time = 72, naive time = 45
bit width = 13, fast select function time = 32, naive time = 49
bit width = 14, fast select function time = 94, naive time = 49
bit width = 15, fast select function time = 51, naive time = 50
bit width = 16, fast select function time = 71, naive time = 57
bit width = 17, fast select function time = 56, naive time = 49
bit width = 18, fast select function time = 60, naive time = 51
bit width = 19, fast select function time = 60, naive time = 67
bit width = 20, fast select function time = 78, naive time = 81
bit width = 21, fast select function time = 80, naive time = 82
bit width = 22, fast select function time = 79, naive time = 82
bit width = 23, fast select function time = 78, naive time = 84
bit width = 24, fast select function time = 76, naive time = 63
bit width = 25, fast select function time = 79, naive time = 68
bit width = 26, fast select function time = 80, naive time = 67
bit width = 27, fast select function time = 81, naive time = 85
bit width = 28, fast select function time = 118, naive time = 71
bit width = 29, fast select function time = 46, naive time = 54
bit width = 30, fast select function time = 64, naive time = 52
bit width = 31, fast select function time = 58, naive time = 53
bit width = 32, fast select function time = 3, naive time = 60

/GS-
benchmarking search
bit width = 0, fast search function time = 5, naive time = 46 , fast with length time = 6
bit width = 1, fast search function time = 80, naive time = 95 , fast with length time = 111
bit width = 2, fast search function time = 54, naive time = 149 , fast with length time = 141
bit width = 3, fast search function time = 81, naive time = 164 , fast with length time = 75
bit width = 4, fast search function time = 83, naive time = 107 , fast with length time = 113
bit width = 5, fast search function time = 94, naive time = 174 , fast with length time = 96
bit width = 6, fast search function time = 71, naive time = 131 , fast with length time = 77
bit width = 7, fast search function time = 99, naive time = 106 , fast with length time = 81
bit width = 8, fast search function time = 73, naive time = 143 , fast with length time = 62
bit width = 9, fast search function time = 81, naive time = 140 , fast with length time = 93
bit width = 10, fast search function time = 71, naive time = 139 , fast with length time = 100
bit width = 11, fast search function time = 109, naive time = 189 , fast with length time = 115
bit width = 12, fast search function time = 74, naive time = 138 , fast with length time = 75
bit width = 13, fast search function time = 54, naive time = 137 , fast with length time = 94
bit width = 14, fast search function time = 56, naive time = 141 , fast with length time = 99
bit width = 15, fast search function time = 81, naive time = 143 , fast with length time = 103
bit width = 16, fast search function time = 76, naive time = 117 , fast with length time = 61
bit width = 17, fast search function time = 76, naive time = 107 , fast with length time = 62
bit width = 18, fast search function time = 58, naive time = 106 , fast with length time = 62
bit width = 19, fast search function time = 58, naive time = 105 , fast with length time = 91
bit width = 20, fast search function time = 57, naive time = 124 , fast with length time = 61
bit width = 21, fast search function time = 57, naive time = 121 , fast with length time = 62
bit width = 22, fast search function time = 59, naive time = 120 , fast with length time = 60
bit width = 23, fast search function time = 58, naive time = 111 , fast with length time = 61
bit width = 24, fast search function time = 58, naive time = 105 , fast with length time = 62
bit width = 25, fast search function time = 57, naive time = 107 , fast with length time = 62
bit width = 26, fast search function time = 58, naive time = 112 , fast with length time = 61
bit width = 27, fast search function time = 58, naive time = 143 , fast with length time = 61
bit width = 28, fast search function time = 57, naive time = 110 , fast with length time = 60
bit width = 29, fast search function time = 59, naive time = 113 , fast with length time = 62
bit width = 30, fast search function time = 58, naive time = 113 , fast with length time = 60
bit width = 31, fast search function time = 58, naive time = 114 , fast with length time = 61
bit width = 32, fast search function time = 47, naive time = 92 , fast with length time = 53
benchmarking select
bit width = 0, fast select function time = 5, naive time = 16
bit width = 1, fast select function time = 41, naive time = 44
bit width = 2, fast select function time = 63, naive time = 44
bit width = 3, fast select function time = 78, naive time = 50
bit width = 4, fast select function time = 85, naive time = 46
bit width = 5, fast select function time = 59, naive time = 46
bit width = 6, fast select function time = 89, naive time = 46
bit width = 7, fast select function time = 38, naive time = 46
bit width = 8, fast select function time = 70, naive time = 43
bit width = 9, fast select function time = 57, naive time = 47
bit width = 10, fast select function time = 58, naive time = 47
bit width = 11, fast select function time = 39, naive time = 48
bit width = 12, fast select function time = 81, naive time = 46
bit width = 13, fast select function time = 78, naive time = 47
bit width = 14, fast select function time = 81, naive time = 48
bit width = 15, fast select function time = 54, naive time = 49
bit width = 16, fast select function time = 52, naive time = 42
bit width = 17, fast select function time = 35, naive time = 49
bit width = 18, fast select function time = 79, naive time = 49
bit width = 19, fast select function time = 84, naive time = 50
bit width = 20, fast select function time = 40, naive time = 49
bit width = 21, fast select function time = 87, naive time = 51
bit width = 22, fast select function time = 90, naive time = 51
bit width = 23, fast select function time = 89, naive time = 52
bit width = 24, fast select function time = 57, naive time = 48
bit width = 25, fast select function time = 69, naive time = 52
bit width = 26, fast select function time = 64, naive time = 53
bit width = 27, fast select function time = 64, naive time = 53
bit width = 28, fast select function time = 61, naive time = 91
bit width = 29, fast select function time = 94, naive time = 53
bit width = 30, fast select function time = 42, naive time = 76
bit width = 31, fast select function time = 42, naive time = 54
bit width = 32, fast select function time = 4, naive time = 37
lemire commented 7 years ago

Closing per request of the OP.