skvadrik / re2c

Lexer generator for C, C++, Go and Rust.
https://re2c.org

Some skeleton tests are extremely slow on Windows #331

Open sergeyklay opened 3 years ago

sergeyklay commented 3 years ago

Hello,

I'm currently developing a native PowerShell wrapper for running re2c tests on Windows without resorting to Mingw, Cygwin and so on. I managed to write a multi-threaded test runner using PowerShell only (right now for skeleton tests only).

I run the tests in parallel on my 20-core Xeon, and everything goes well except for some tests, which are super slow. After digging into it, I realized that these tests are extremely slow because of the generation step (the re2c invocation itself). Below are measurements for some of these invocations:

6.8190486 seconds

& C:\src\re2c\cmake-build-debug-visual-studio-x64\Debug\re2c.exe  `
    "debug\closure_stats_gtop.re" `
    -o "debug\closure_stats_gtop.c" `
    -i `
    --posix-captures `
    --posix-closure gor1 `
    --dump-closure-stats `
    --fixed-tags toplevel `
    -W `
    --no-version `
    --no-generation-date `
    --skeleton `
    -Werror-undefined-control-flow

9.3340122 seconds

C:\src\re2c\cmake-build-debug-visual-studio-x64\Debug\re2c.exe  `
    "encodings\class3.re" `
    -o "encodings\class3.c" `
    -i8 `
    -W `
    --no-version `
    --no-generation-date `
    --skeleton `
    -Werror-undefined-control-flow

7.642483 seconds

C:\src\re2c\cmake-build-debug-visual-studio-x64\Debug\re2c.exe  `
    "debug\closure_stats_gor1.re" `
    -o "debug\closure_stats_gor1.c" `
    -i `
    --posix-captures `
    --posix-closure gtop `
    --dump-closure-stats `
    --fixed-tags toplevel `
    -W `
    --no-version `
    --no-generation-date `
    --skeleton `
    -Werror-undefined-control-flow

13.0525341 seconds:

& C:\src\re2c\cmake-build-debug-visual-studio-x64\Debug\re2c.exe  `
    "bug1708378.re" `
    -o "bug1708378.c" `
    -ib `
    -W `
    --no-version `
    --no-generation-date `
    --skeleton `
    -Werror-undefined-control-flow

265.9861263 seconds: 🎉

& C:\src\re2c\cmake-build-debug-visual-studio-x64\Debug\re2c.exe  `
    "bug128.re" `
    -o "bug128.c"  `
    -W `
    --no-version `
    --no-generation-date `
    --skeleton `
    -Werror-undefined-control-flow

This is not a complete list, and some tests never finished. From a quick look, about 9% of the tests are slow. The rest are as fast as they are on, for example, macOS or Linux. re2c was built with the following configuration:

Manual run shows the same degradation:

PS C:\src\re2c\test_201023160030> Measure-Command { & C:\src\re2c\cmake-build-debug-visual-studio-x64\Debug\re2c.exe `
>> bug1708378.re `
>> -o bug1708378.c `
>> -ib `
>> -W `
>> --no-version `
>> --no-generation-date `
>> --skeleton `
>> -Werror-undefined-control-flow }

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 12
Milliseconds      : 923
Ticks             : 129239968
TotalDays         : 0.000149583296296296
TotalHours        : 0.00358999911111111
TotalMinutes      : 0.215399946666667
TotalSeconds      : 12.9239968
TotalMilliseconds : 12923.9968
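
For reference, the same kind of measurement can be sketched in Python, with a timeout so that a run that never finishes is reported instead of blocking. This is only an illustration: `time_command` and its parameters are hypothetical names, and the command timed below is a stand-in (the Python interpreter itself), not an actual re2c invocation.

```python
import subprocess
import sys
import time


def time_command(argv, timeout_s=None):
    """Run argv once and return (elapsed_seconds, exit_code).

    exit_code is None if the command was killed after timeout_s seconds.
    """
    start = time.perf_counter()
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        code = proc.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        code = None
    return time.perf_counter() - start, code


if __name__ == "__main__":
    # Stand-in command; replace with the re2c command line under test.
    elapsed, code = time_command([sys.executable, "-c", "pass"], timeout_s=60)
    print(f"{elapsed:.3f} s, exit code {code}")
```

Passing the arguments as a list avoids shell quoting differences between platforms.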

I will publish additional comments during my research.

sergeyklay commented 3 years ago

> Total time 240.175295 is 4 minutes, and it's the time required to run skeleton tests in Release Visual Studio build, right? It seems fast enough for CI. What is the total time for Visual Studio Debug build and for Mingw build?

I'm still working on this. Not all things are ready yet. But I'll provide results ASAP.

> It might be worth reporting the Visual Studio problem with the infinite loop in Debug build.

Yeah. But TBH, I don't fully understand the nature of the issue and can't provide a minimal PoC.

skvadrik commented 3 years ago

> can't provide a minimal PoC

A minimal PoC will require effort, and I'm reluctant to spend it because I'm not sure the bug hasn't already been fixed in newer or non-free versions. But I can provide exact instructions for reproducing this in a Cygwin environment (from checking out re2c from git through to running the hanging test). I'll have a look whether there is an easy way to send such a bug report.

sergeyklay commented 3 years ago

Please keep us informed on this issue.

sergeyklay commented 3 years ago

@skvadrik I'm sorry for the silence and lack of any activity in this direction. Increased activity in my main job due to the end of the year. In any case I'm always here :)

skvadrik commented 3 years ago

@sergeyklay It's perfectly fine, I have the same problem with my day job. ;)

I haven't merged anything into master for weeks. In part this is because I have little time, and in part because I'm working on a local experiment. And I assume that you also have good reasons. Thanks for letting me know anyway, much appreciated!

sergeyklay commented 3 years ago

@skvadrik The work has not been completed yet and some things have yet to be implemented, but general PoC looks like: https://github.com/sergeyklay/re2c/blob/feature/powershell-test-runner/run_tests.ps1

I'm not a Windows user, although I have a PC running this system. Looking back at the whole PowerShell test runner journey, I realize that I would not want to become the main maintainer of this solution or support it in the long run. Therefore, I would like to propose a universal solution that works equally well on all major systems: Python. What do you think about a test runner written in Python for Linux/UNIX as well as Windows systems? Would this be acceptable for the re2c project?

skvadrik commented 3 years ago

@sergeyklay That's a lot of work! Of course it's understandable if you wouldn't want to maintain it in the long run. I myself run Windows in a VM, and it is extremely slow and inconvenient. But the main disadvantage is the necessity to maintain different scripts on different platforms: they would diverge over time.

> What do you think about a test runner written in Python for Linux/UNIX as well as Windows systems? Would this be acceptable for the re2c project?

Certainly, I think it is the best option. run_tests.sh is too complex for a bash script, and most of the time is spent on the test harness, not on running the tests themselves. And it's hard to maintain portability.
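
To make the idea concrete, here is a minimal sketch of what a portable parallel runner could look like in Python. It is not the runner discussed in this thread: `run_case`, `run_all`, and the `cases` mapping are hypothetical names, and the commands in the usage example are stand-ins rather than real re2c test invocations.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_case(name, argv, timeout_s):
    """Run one test command and classify it as OK, FAIL or TIMEOUT."""
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return name, "TIMEOUT"
    status = "OK" if proc.returncode == 0 else "FAIL"
    return name, status


def run_all(cases, jobs=4, timeout_s=60):
    """Run all cases (a dict of name -> argv list) on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        futures = [
            pool.submit(run_case, name, argv, timeout_s)
            for name, argv in cases.items()
        ]
        return dict(future.result() for future in futures)


if __name__ == "__main__":
    # Stand-in commands; a real runner would build re2c command lines here.
    demo = {
        "passes": [sys.executable, "-c", "pass"],
        "fails": [sys.executable, "-c", "raise SystemExit(3)"],
    }
    for name, status in run_all(demo, jobs=2).items():
        print(f"{status:8} {name}")
```

Threads are sufficient in this sketch because each worker spends its time waiting on a child process, and the same script runs unchanged on Linux, macOS and Windows.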

sergeyklay commented 3 years ago

Fine! I'll start work on a unified solution then!

skvadrik commented 3 years ago

Great! \o/

sergeyklay commented 3 years ago

Hello,

A small update about the re2c test runner in Python. The work has not been completed yet, but regular tests are already passing:


[asciicast recording]

sergeyklay commented 3 years ago

Skeleton tests are a little slow, but they already pass too:


[asciicast recording]


Run as time -v nice python run_tests.py --skeleton

Command being timed: "nice python run_tests.py --skeleton"
User time (seconds): 62.33
System time (seconds): 27.75
Percent of CPU this job got: 44%
Elapsed (wall clock) time (h:mm:ss or m:ss): 3:24.52
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 378976
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 1674
Minor (reclaiming a frame) page faults: 7844314
Voluntary context switches: 19387
Involuntary context switches: 97620
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 9
Page size (bytes): 16384
Exit status: 0
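
As a quick cross-check of the figures above, the "Percent of CPU" value is the ratio of CPU time (user plus system) to elapsed wall-clock time. The small sketch below redoes that arithmetic; `parse_elapsed` is a hypothetical helper written for this illustration.

```python
def parse_elapsed(text):
    """Convert 'h:mm:ss' or 'm:ss' (seconds may be fractional) to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


user_s = 62.33                        # "User time (seconds)"
system_s = 27.75                      # "System time (seconds)"
elapsed_s = parse_elapsed("3:24.52")  # "Elapsed (wall clock) time"

cpu_share = (user_s + system_s) / elapsed_s
print(f"elapsed {elapsed_s:.2f} s, CPU share {cpu_share:.0%}")
# prints "elapsed 204.52 s, CPU share 44%"
```

A share below 100% means that the run used less than one core's worth of CPU time over its roughly 3.4 minutes of wall-clock time.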

skvadrik commented 3 years ago

Awesome, thanks for all your work! \o/ (And I like this asciinema tool.)

skvadrik commented 1 year ago

A few notes from my latest unsuccessful attempt to enable skeleton tests on Windows (and check if they are still too slow):