apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0
2.45k stars 973 forks source link

Add a separate option to allow running Panama Vectorization for all tests with suitable C2 defaults #13351

Closed ChrisHegarty closed 1 month ago

ChrisHegarty commented 1 month ago

This commit adds a separate option, tests.defaultvectorization, to allow running Panama Vectorization for all tests with suitable C2 defaults.

For example:

./gradlew :lucene:core:test -Ptests.defaultvectorization=true

A default value has also been added to the value in the random picks of vector bit size, which effectively enables the Panama Vector similarities implementation for all tests except TestVectorUtilSupport. This will cover more cases in the general workflow.

When modify or testing the Panama Vector similarities implementation directly, namely TestVectorUtilSupport, the following pattern continues to work:

for bits in default 128 256 512; do ./gradlew -p lucene/core test --tests TestVectorUtilSupport -Dtests.vectorsize=$bits; done

relates #13344

uschindler commented 1 month ago

I have a little bit of problem with that. Our CI infrastructure passes CI=true and basically we only want there that the "make tests fast" JVM option is removed (it removes the setting to disable C2 compiler). The disabled C2 compiler is actually the reason why we disable the vectorization tests for normal developers, because they would run insanely slow without C2.

But actually the CI is there to test all variants and it does not matter how slow it is. So actually the change here is contradictory to what CI should do.

So my idea would be to just add another test property -Ptests.useProductiveMode=true (or similar) which then implies:

So that means: Any developer who wants to test the productive features of Lucene, needs to pass the property. This would slowdown tests a bit and vectorization is always used in the way how it would be in production.

Other ideas how to name this parameter:

or any else. The changes are quite simple. We just need to make some extra code to override the other ones if one of the above is setup.

uschindler commented 1 month ago

I am trying to implement this, but actually there is some horrible code duplication in test-defaults.gradle and randomization.gradle. It resolves all test options 2 times and then it gets inconsistent... @dweiss

Working on it.

Uwe

uschindler commented 1 month ago

My problem is: randomization.gradle resolves the test options lazily in afterEvaluate and assigns them to ext.testOptionsResolved.

In contrast defaults-tests.gradle evaluates them non-lazily and assigns them to resolvedTestOption, very similar but different.

The jvmArges are then assigned early, but later any changes get lost (because the randomization one) decides to add everthing later.

Why is this?

uschindler commented 1 month ago

Hrm, I am hopeless at moment. Need a bit of freetime to walk around. The issue is that jvmArgs are resolved early, while the test properties are done in randomization.gradle. This causes the issue that I cant undo the jvmArgs because they are added early to the config.

Need to rewrite that and add the JVM options late.

P.S.: Let's please merge the default-tests.gradle and randomization.gradle! This code is unreadable! All in one file would be much better.

dweiss commented 1 month ago

P.S.: Let's please merge the default-tests.gradle and randomization.gradle! This code is unreadable! All in one file would be much better.

I'm pretty sure there was some convoluted reason to keep them this way... But I'm not sure - it's been quite a while. I thinkmerging these files is a good idea - they do seem very closely related. I'm also fairly sure the after-evaluation trick was intentional but can't remember why.

uschindler commented 1 month ago

I found a solution. It was a bit tricky, but works. Will commit to this branch.

uschindler commented 1 month ago

ok, I pushed my changes:

The code is now quite clean, I did not yet cleanup the two source files. I only had to move the jvmArgs gradle test task setting to the afterEvaluate. I added a separate method to project.ext to evaluate the tests.defaultvectorization.

We should really merge those 2 files as they are really depending on each other. And maybe decide to resolve all test options late (in afterEvaluate).

uschindler commented 1 month ago

Now let's go to vacation and before that have beers on German "Vatertag". Please backport those changes here, too as this bug also affects 9.x

uschindler commented 1 month ago

I added a bit of code to make the integer vector enforcement always false if the default vectorization settings are used.

uschindler commented 1 month ago

I tested the combination by repeatedly executing: gradlew :lucene:core:testOpts, appending several parameters. All combinations seems right:

ChrisHegarty commented 1 month ago

There is still one small issue. C2 will still be disabled if default is randomly selected, e.g.

$ ./gradlew :lucene:core:testOpts | egrep ".*defaultvectorization.*|.*vectorsize.*|.*forceintegervectors.*|.*jvmargs.*"
  tests.defaultvectorization = false    # Uses defaults for running tests with correct JVM settings to test Panama vectorization (tests.jvmargs, tests.vectorsize, tests.forceintegervectors).
C tests.forceintegervectors = false    # (!= default: computed) Forces use of integer vectors even when slow.
C tests.jvmargs            = -XX:TieredStopAtLevel=1 -XX:+UseParallelGC -XX:ActiveProcessorCount=1 # (!= default: computed) Arguments passed to each forked JVM.
C tests.vectorsize         = default  # (!= default: computed) Sets preferred vector size in bits.

we want to enable C2 if the Panama Vector is enabled.

ChrisHegarty commented 1 month ago

I updated the default setting of tests.defaultvectorization to include when default is randomly selected. E.g.

$ ./gradlew :lucene:core:testOpts | egrep ".*defaultvectorization.*|.*vectorsize.*|.*forceintegervectors.*|.*jvmargs.*"
C tests.defaultvectorization = true     # (!= default: computed) Uses defaults for running tests with correct JVM settings to test Panama vectorization (tests.jvmargs, tests.vectorsize, tests.forceintegervectors).
C tests.forceintegervectors = false    # (!= default: computed) Forces use of integer vectors even when slow.
C tests.jvmargs            =          # (!= default: computed) Arguments passed to each forked JVM.
C tests.vectorsize         = default  # (!= default: computed) Sets preferred vector size in bits.
uschindler commented 1 month ago

There is still one small issue. C2 will still be disabled if default is randomly selected, e.g.

$ ./gradlew :lucene:core:testOpts | egrep ".*defaultvectorization.*|.*vectorsize.*|.*forceintegervectors.*|.*jvmargs.*"
  tests.defaultvectorization = false    # Uses defaults for running tests with correct JVM settings to test Panama vectorization (tests.jvmargs, tests.vectorsize, tests.forceintegervectors).
C tests.forceintegervectors = false    # (!= default: computed) Forces use of integer vectors even when slow.
C tests.jvmargs            = -XX:TieredStopAtLevel=1 -XX:+UseParallelGC -XX:ActiveProcessorCount=1 # (!= default: computed) Arguments passed to each forked JVM.
C tests.vectorsize         = default  # (!= default: computed) Sets preferred vector size in bits.

we want to enable C2 if the Panama Vector is enabled.

Actually that's wanted: Because we want tests by default run fast. For non-developers the CI env var sets empty jvmargs because on CI speed does not matter. With your latest change we randomly slowdown the tests.

So please revert your commit.

ChrisHegarty commented 1 month ago

Actually that's wanted: Because we want tests by default run fast. For that the CI env var sets empty jvmargs. With your latest change we randomly slowdown the tests.

It's not clear whether tests will run faster when Panama Vector is enabled without C2 or not. I'll check this.

So please revert your commit.

Happy to revert, if it turns out to be faster without it.

uschindler commented 1 month ago

Actually that's wanted: Because we want tests by default run fast. For that the CI env var sets empty jvmargs. With your latest change we randomly slowdown the tests.

It's not clear whether tests will run faster when Panama Vector is enabled without C2 or not. I'll check this.

So please revert your commit.

Happy to revert, if it turns out to be faster without it.

The reason for the jvmargs is to not enable C2 for our short running tests with lots of randomization. This causes dramatic overhead (Robert tested this) on total test runtime. So we use the default C2 JVM settings only on CI, where 50% longer total runtime does not matter.

Without that hack we could revert a lot of code.

uschindler commented 1 month ago

See: https://github.com/apache/lucene/issues/10200

ChrisHegarty commented 1 month ago

The reason for the jvmargs is to not enable C2 for our short running tests with lots of randomization. This causes dramatic overhead (Robert tested this) on total test runtime. So we use the default C2 JVM settings only on CI, where 50% longer total runtime does not matter.

Yes, I get this. I was thinking that running vector tests with Panama Vector enabled and C2 disabled will be slow, but it seems to make little difference - I guess we don't have very stressful tests in this area.

I'll revert my latest change, so that C2 will be enabled/disabled regardless of Panama Vector. Then merge and backport.

ChrisHegarty commented 1 month ago

There is a problem with default - it does not do what you might expect it to do - it does not enable Panama Vector (unless you prevent C2 from being disabled). For example:

$ ./gradlew :lucene:core:test --tests TestKnnByteVectorQuery.testVectorEncodingMismatch \
    -Ptests.vectorsize="default" -Ptests.forceintegervectors=false --info
...
org.apache.lucene.search.TestKnnByteVectorQuery > testVectorEncodingMismatch STANDARD_ERROR
    M5 09, 2024 3:26:42 KANG’AMA org.apache.lucene.internal.vectorization.VectorizationProvider lookup
    WARNING: C2 compiler is disabled; Java vector incubator API can't be enabled
...

The addition of default is opportunistic while we are here, but the fundamental value add of this PR is to add a way to more easily run with the "real world" vector similarities, which we can do with tests.defaultvectorization. Let's revert the addition of default. It's not adding any real value here.

uschindler commented 1 month ago

Hi, good point.

This simplifies logic more. I would also like to move the jvmargs over to randomization.gradle.

uschindler commented 1 month ago

Hm, actually when we keep "default" in the random pick, the CI builds sometimes use real conditions, so we should keep it!

Let's keep the current state.

ChrisHegarty commented 1 month ago

Hm, actually when we keep "default" in the random pick, the CI builds sometimes use real conditions, so we should keep it!

Agreed. I'm declaring that this PR is done. I will merge and backport. @uschindler now go on vacation and stop replying! ;-)

uschindler commented 1 month ago

My last comment here: It would be good to somehow document this in the file so it is clear why all those combinations discussed here are "correct", although they seem to be wrong. After my vacation I will merge the randomization.gradle and defaults-tests.gradle files and look again at the lazy evaluation and document this more.

P.S.: Backport?

uschindler commented 1 month ago

Thanks!