ScintillaOrg / lexilla

A library of language lexers for use with Scintilla
https://www.scintilla.org/Lexilla.html
Other
174 stars 62 forks source link

Compile with Control Flow Guard #88

Closed moshekaplan closed 2 years ago

moshekaplan commented 2 years ago

Please consider compiling Windows Lexilla with Control Flow Guard, to improve the security of other Windows projects that rely on Lexilla, like Notepad++.

For more information about Control Flow Guard, see: https://docs.microsoft.com/en-us/cpp/build/reference/guard-enable-control-flow-guard?view=msvc-170

For the Notepad++ ticket, see https://github.com/notepad-plus-plus/notepad-plus-plus/issues/8108

For the Scintilla feature request, see https://sourceforge.net/p/scintilla/feature-requests/1437

Attached is a diff with the changed needed, based on Lexilla 5.1.6: lexilla_cfg.diff.txt

nyamatongwe commented 2 years ago

As indicated on the Scintilla request 1437, this really should be checked for performance.

Lexing large files through a set of commonly used lexers (cpp, python, hypertext, sql, json, errorlist) and measuring execution time before/after making the change should reveal any major problems. This should be done with separate DLLs for Scintilla and Lexilla as that is the common scenario and does not allow optimizing away many of the checks.

moshekaplan commented 2 years ago

I have not done any performance testing. Do you already have a set of files or programs that you use for benchmarking?

nyamatongwe commented 2 years ago

Performance tests are commonly done on huge files (10MB to 4GB) that are impractical to include in the repository and commonly contain repeated instances of Scintilla files like scintilla/src/Editor.cxx. There are functional test example files in test/examples.

moshekaplan commented 2 years ago

I built lexilla and scintilla for Win x64 with and without CFG and ran a customized version of the performance test suite (performanceTests.py, only difference is larger loops) three times for each on my Windows 10 i5-2500k with AV disabled:

non-CFG Run 1:

C:\Users\moshe\Desktop\cfg_testing\scintilla\test>python3 performanceTests.py
..\..\lexilla\bin\Lexilla.DLL
Found Lexilla
..\bin\Scintilla.DLL
performanceTests
 3.899 testAddLine
. 3.925 testAddLineMiddle
. 0.020 testHuge
. 3.620 testHugeInserts
. 7.211 testHugeReplace
. 0.125 testUTF8CaseSearches
. 6.236 testUTF8Searches
.
----------------------------------------------------------------------
Ran 7 tests in 25.264s

non-CFG Run 2:

C:\Users\moshe\Desktop\cfg_testing\scintilla\test>python3 performanceTests.py
..\..\lexilla\bin\Lexilla.DLL
Found Lexilla
..\bin\Scintilla.DLL
performanceTests
 4.235 testAddLine
. 4.004 testAddLineMiddle
. 0.019 testHuge
. 3.804 testHugeInserts
. 7.121 testHugeReplace
. 0.128 testUTF8CaseSearches
. 6.526 testUTF8Searches
.
----------------------------------------------------------------------
Ran 7 tests in 26.086s

non-CFG Run 3:

C:\Users\moshe\Desktop\cfg_testing\scintilla\test>python3 performanceTests.py
..\..\lexilla\bin\Lexilla.DLL
Found Lexilla
..\bin\Scintilla.DLL
performanceTests
 3.953 testAddLine
. 3.829 testAddLineMiddle
. 0.019 testHuge
. 3.592 testHugeInserts
. 7.166 testHugeReplace
. 0.124 testUTF8CaseSearches
. 6.333 testUTF8Searches
.
----------------------------------------------------------------------
Ran 7 tests in 25.246s

CFG Run 1:

C:\Users\moshe\Desktop\cfg_testing\scintilla\test>python3 performanceTests.py
..\..\lexilla\bin\Lexilla.DLL
Found Lexilla
..\bin\Scintilla.DLL
performanceTests
 3.941 testAddLine
. 3.854 testAddLineMiddle
. 0.016 testHuge
. 3.683 testHugeInserts
. 8.035 testHugeReplace
. 0.136 testUTF8CaseSearches
. 6.567 testUTF8Searches
.
----------------------------------------------------------------------
Ran 7 tests in 26.473s

CFG Run 2:

C:\Users\moshe\Desktop\cfg_testing\scintilla\test>python3 performanceTests.py
..\..\lexilla\bin\Lexilla.DLL
Found Lexilla
..\bin\Scintilla.DLL
performanceTests
 4.361 testAddLine
. 3.967 testAddLineMiddle
. 0.020 testHuge
. 3.817 testHugeInserts
. 7.807 testHugeReplace
. 0.139 testUTF8CaseSearches
. 6.419 testUTF8Searches
.
----------------------------------------------------------------------
Ran 7 tests in 26.782s

CFG Run 3:

C:\Users\moshe\Desktop\cfg_testing\scintilla\test>python3 performanceTests.py
..\..\lexilla\bin\Lexilla.DLL
Found Lexilla
..\bin\Scintilla.DLL
performanceTests
 3.878 testAddLine
. 3.828 testAddLineMiddle
. 0.017 testHuge
. 3.641 testHugeInserts
. 7.147 testHugeReplace
. 0.123 testUTF8CaseSearches
. 6.242 testUTF8Searches
.
----------------------------------------------------------------------
Ran 7 tests in 25.094s

The fastest non-CFG run was 25.246s and the fastest CFG run was 25.094s (why use min over average). I don't think CFG improved performance, rather that the two were similar enough that any differences were more based on noise than anything else.

I will try to take a look at TestLexers in the next few days.

nyamatongwe commented 2 years ago

The patches are based on customized versions of Lexilla and Scintilla, not on the release versions. The patches include context lines <RuntimeLibrary>MultiThreadedDebug</RuntimeLibrary> that aren't used in the repositories or any release. The <ControlFlowGuard>Guard</ControlFlowGuard> clause is only added to Win32 and x64 items and not to ARM64.

CFG is better behaved these days but here is an article about past problems that made it a bit scary.

moshekaplan commented 2 years ago

As you recommended, I created a file with scintilla/src/Editor.cxx copied 100 times (27.5MB) and 1,000 times (~275MB).

I then tested the timing by doing the following:

  1. Copying the compiled lexilla\src\x64\Release\Lexilla.dll to lexilla\bin
  2. Wrapping lexilla\test\TestLexers.cxx's main function with calls to high_resolution_clock::now()
  3. Running TestLexers.exe (C:\Users\moshe\Desktop\cfg_testing\lexilla\test\x64\Release\TestLexers.exe)

Note for testing, either both Lexilla and TestLexers were compiled with CFG, or neither were. Targets were tested on Win 10 version 21H2 (build 19044.1766). Antivirus was disabled for the duration of the testing.

To summarize: Win x86 CFG had ~15% overhead on the 27.5 MB test (16.398 vs 14.179 seconds) Win x86 CFG failed to complete the 275 MB test due to memory allocation errors. Win x64 CFG had ~25% overhead on the 27.5 MB test (13.044 vs 10.510 seconds) Win x64 CFG had ~15% overhead on the 275 MB test (124.365 vs 107.952 seconds)

These results seem consistent with what Microsoft wrote, that about 12% performance overhead for CFG is normal.

While this overhead is more than I'd like, this is far better than the previous problems and according to the author, was at least partially fixed on Windows 10 17134.

I'm still in favor of compiling with CFG, but it's ultimately your call.

Test notes below:

Test files were made with the following Python script:

orig_path = r"C:\Users\moshe\Desktop\cfg_testing\lexilla\test\examples\cpp\Editor.cxx"
test_path = r"C:\Users\moshe\Desktop\cfg_testing\lexilla\test\examples\cpp\Editor_1000.cxx"

editor_data = open(orig_path, 'rb').read()
with open(test_path, 'wb') as fh:
    for i in range(1000):
        fh.write(editor_data)
Win x86 tests:

I tested with a 27.5 MB file (`Editor.cxx` copied 100 times):
Max memory used during testing was about 200 MB

Five runs without CFG (measured in seconds):
14.179
14.472
14.493
14.712
14.789

Five runs with CFG:
16.398
16.665
16.694
16.805
16.870

This was about a 15% overhead.
Win x64 tests:

For the 275MB file:
Max memory used during testing was about 2 GB

Five runs without CFG (measured in seconds):
107.952
108.339
110.020
112.264
112.948

Five runs with CFG:
124.365
124.996
125.210
126.305
127.501

I then rested with a 27.5 MB file (`Editor.cxx` copied 100 times):
Max memory used during testing was about 200 MB

Five runs without CFG (measured in seconds):
10.510
10.732
10.938
10.949
11.100

Five runs with CFG:
13.044
13.205
13.244
13.291
13.435
nyamatongwe commented 2 years ago

Ouch! That is worse than I was hoping for. As well as taking longer, it also means that more energy is used which is an area of concern for the Notepad++ project.

Adding a cost to indirect calls affects C++ virtual methods which are key to making flexible software which can easily be extended. I'm unsure just how much benefit there is to hardening the code of Notepad++ when there are so many extension points (macros, plugins, UDL, configuration) that are open to direct exploitation if operating on untrusted input.

If CFG is added to build files then downstream projects that want better performance can disable it but to do that they have to know that CFG is on and has a cost.

nyamatongwe commented 2 years ago

The Notepad++ project has now rejected this change, so I am closing the issue here.

Its possible that hardening the Lexilla DLL may be worthwhile for other projects and this issue could be re-examined.

moshekaplan commented 2 years ago

Thanks for taking the time to seriously consider enabling CFG, I appreciate it. Hopefully CFG will be made more performant in a future update.

matbech commented 1 year ago

@moshekaplan It might be interesting to see new benchmarks with XFG enabled. The .vcxproj config to enable XFG:

<AdditionalOptions>/guard:xfg %(AdditionalOptions)</AdditionalOptions>

XFG is enabled for Windows 11 binaries and Microsoft recommends to use XFG for all internal projects.

References

  1. https://www.offensive-security.com/offsec/extended-flow-guard/
  2. https://blog.quarkslab.com/how-the-msvc-compiler-generates-xfg-function-prototype-hashes.html
  3. https://github.com/nccgroup/exploit_mitigations/blob/master/windows_mitigations.md