dlang-community / libdparse

Library for lexing and parsing D source code
https://libdparse.dlang.io
Boost Software License 1.0
115 stars 57 forks source link

Revise x86_64 inline asm: Fix for LDC v1.29+, and enable on Windows and with GDC too #458

Closed kinke closed 2 years ago

kinke commented 2 years ago

Resolves #457.

kinke commented 2 years ago

The GDC/GCC-style inline asm syntax is also inline-able with LDC, unlike naked DMD-style one. IIRC, LDC v1.22 added support for that syntax.

WebFreak001 commented 2 years ago

@dd86k can you review this?

codecov[bot] commented 2 years ago

Codecov Report

Merging #458 (5ed52e3) into master (b772900) will increase coverage by 0.01%. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #458      +/-   ##
==========================================
+ Coverage   82.85%   82.86%   +0.01%     
==========================================
  Files          11       11              
  Lines        8315     8321       +6     
==========================================
+ Hits         6889     6895       +6     
  Misses       1426     1426              
Impacted Files Coverage Δ
src/dparse/lexer.d 85.67% <100.00%> (+0.08%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update b772900...5ed52e3. Read the comment docs.

kinke commented 2 years ago

FWIW, here's the almost-identical GCC-style one for C++, supported by gcc and clang: godbolt

dd86k commented 2 years ago

@dd86k can you review this?

@WebFreak001: I don't mind testing this, but what would I be able to test, or should I expect from this PR, precisely?

kinke commented 2 years ago

The unittest runner (dub test) would fail for me without this change with LDC v1.29 on Linux (#457, obviously with a CPU with SSE 4.2), now works. I haven't tested it on Windows, nor with GDC.

dd86k commented 2 years ago

To resume: Run tests with this PR on a Linux system with SSE4.2 using LDC 1.29.0 and GDC? Optionally on Windows?

kinke commented 2 years ago

Yep, that should cover everything I think. - Note that LDC versions < 1.22 couldn't be used anymore to compile libdparse; that would require some more static if + enum business to use the DMD-style syntax for those.

dd86k commented 2 years ago

Cloned https://github.com/kinke/libdparse asm onto my machine.

Commit: 0.19.1+commit.1.g378e41e

Tests using dub test --compiler=ldc2 and dub test --compiler=gdc respectively:

Didn't test GDC on Windows because last time I tried using GDC on Windows last year, it couldn't compile D code at all.

Hope this answers it for you! I'll be here if you need anything else.

Processor info

Linux (host):

Name:        AuthenticAMD AMD Ryzen 9 5950X 16-Core Processor            
Identifier:  Family 0x19 Model 0x21 Stepping 0x0
Cores:       16 cores, 32 threads
Max. Memory: 256 TiB physical, 256 TiB virtual
Baseline:    x86-64-v3
Techs:       core-performance-boost htt
Extensions:  x87/fpu +f16c mmx extmmx amd64/x86-64 +lahf64 amd-v/vmx +svm=v1 aes-ni adx sha bmi1 bmi2
SSE:         sse sse2 sse3 ssse3 sse4.2 sse4a
AVX:         avx avx2
AMX:         None
Mitigations: ibpb ibrs ibrs_pref stibp stibp_on ssbd
Cache L1-D:   16x   32 KiB,  512 KiB total, si
Cache L1-I:   16x   32 KiB,  512 KiB total, si
Cache L2-U:   16x  512 KiB,    8 MiB total, si ci
Cache L3-U:    2x   32 MiB,   64 MiB total, si nwbv

Windows 10 (guest, should be same for 11):

Name:        AuthenticAMD AMD Ryzen 9 5950X 16-Core Processor
Identifier:  Family 0x19 Model 0x21 Stepping 0x0
Cores:       2 cores, 4 threads
Max. Memory: 256 TiB physical, 256 TiB virtual
Baseline:    x86-64
Techs:       htt
Extensions:  x87/fpu mmx extmmx amd64/x86-64 +lahf64 amd-v/vmx +svm=v1 aes-ni
SSE:         sse sse2 sse3 ssse3 sse4.2 sse4a
AVX:         avx avx2
AMX:         None
Mitigations:
ParaVirt.:   Hyper-V
Cache L1-D:    2x   32 KiB,   64 KiB total, si
Cache L1-I:    2x   32 KiB,   64 KiB total, si
Cache L2-U:    2x  512 KiB,    1 MiB total, si ci
Cache L3-U:    1x   32 MiB,   32 MiB total, si nwbv
kinke commented 2 years ago

[pushed a little extension to make LDC < v1.22 work again]

maxhaton commented 2 years ago

In some quick tests I did, this whitespace skipping approach doesn't seem to be particularly quick, I'm not convinced there's all that much point in keeping this at all.

RazvanN7 commented 2 years ago

Merging this to unblock the D-Scanner pipeline

edi33416 commented 2 years ago

In some quick tests I did, this whitespace skipping approach doesn't seem to be particularly quick, I'm not convinced there's all that much point in keeping this at all

I agree. While I like reading some assembly as much as the next person, I don't think we should use inline asm anymore. I think this code will hinder better optimizations, such as using the ymm or zmm registers for the newer cpus. I'm pretty sure that @dd86k 's 16 core would do better without the inline asm.

I believe that the inline asm should be removed in a future PR. This will prevent future issues like this one, and will enable the compiler to perform optimizations as the compiler and architectures evolve without requiring code patches.

RazvanN7 commented 2 years ago

This PR: https://github.com/dlang-community/D-Scanner/pull/866 confirms that this fix is correct, however, we still need a new tag so that the dub test also passes. cc @WebFreak001

WebFreak001 commented 2 years ago

ok will tag release

RazvanN7 commented 2 years ago

@WebFreak001 Is there anything else that needs to be done for the dub registry to pick up the new tag or does it simply take a while before it gets updated? I see that the dub registry [1] still thinks that 0.19.1 is the latest tag.

[1] https://code.dlang.org/packages/libdparse

WebFreak001 commented 2 years ago

ping @Hackerpilot

otherwise wait for 0~8 hours

RazvanN7 commented 2 years ago

ok, no problem, I can wait. I just wanted to make sure that we are on track. Thanks for your assistance!