ben-manes / caffeine

A high performance caching library for Java
Apache License 2.0

Help for running the test suite #1280

Closed chayan-blip closed 1 year ago

chayan-blip commented 1 year ago

Hi Ben,

This may be a beginner question, but I have tried running the test suite through IntelliJ and also from the command line. I have a modern Linux machine, but running millions of tests takes a long time and hangs repeatedly. Sometimes there is a GC OutOfMemoryError. I have configured the heap sizes to the max, but to no avail. Any help would be greatly appreciated.

Thanks, Chayan

ben-manes commented 1 year ago

The command line should work as we configure the memory requirements for the suite. I usually let the CI do it all instead due to the size.

I’ll run specific test methods in the IDE (Eclipse). The ReferenceTests document the JVM args required for soft references.

The only way that I could cope with the features and complexity was to brute force testing to discover regressions.

Sorry, I’m unsure how to help unless you can offer more details.

ben-manes commented 1 year ago

For the IDE, you might be running into OutOfMemoryError due to the test report? Unfortunately the test reporters for JUnit and TestNG tend to be in-memory only and write to disk only after the test phase completes. They also hold onto a lot of metadata per test invocation, like the parameters, rather than stringifying the results early. Ideally they would stream their contents out to an intermediate event format, as the I/O is cheap, the fs buffer cache would make the reread inexpensive, and it would avoid memory bloat. For the Gradle report our test listener performs a clean-up to bypass this problem, but the IDE's reporter probably doesn't benefit enough from it, and that's likely causing your failure. Unfortunately there is nothing we can do; the test frameworks or IDEs need to be more memory/performance conscious, but that is not really something they put much consideration towards.
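
To illustrate the streaming idea, here is a minimal sketch. It is not the TestNG, JUnit, or Gradle API; the class `StreamingTestReport` and its methods are hypothetical names invented for this example. Each result is stringified and appended to an event log as soon as the test finishes, so the parameter objects are not retained, and the summary is built by rereading the file afterwards.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

/**
 * Hypothetical sketch: writes one line per test invocation to disk instead of
 * keeping every result (and its parameters) in memory until the run ends.
 */
final class StreamingTestReport implements AutoCloseable {
  private final BufferedWriter writer;

  StreamingTestReport(Path eventLog) throws IOException {
    this.writer = Files.newBufferedWriter(eventLog, StandardCharsets.UTF_8);
  }

  /** Stringifies the parameters immediately so the objects can be collected. */
  public synchronized void onTestFinished(String method, Object[] params, boolean passed) {
    try {
      // Assumes the stringified parameters contain no tabs or newlines.
      writer.write(method + "\t" + Arrays.toString(params) + "\t" + (passed ? "PASS" : "FAIL"));
      writer.newLine();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Flushes and closes the event log. */
  @Override
  public synchronized void close() throws IOException {
    writer.close();
  }

  /** Rereads the event log after the test phase to count the failures. */
  public static long countFailures(Path eventLog) throws IOException {
    try (Stream<String> lines = Files.lines(eventLog, StandardCharsets.UTF_8)) {
      return lines.filter(line -> line.endsWith("\tFAIL")).count();
    }
  }
}
```

A real reporter would need an escaped or structured event format, but the memory profile is the point: the heap holds one buffered line at a time rather than millions of result objects.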

At the command line it works fine for me on a 2016 MacBook Pro w/ 16 GB of memory. To ensure it was a fresh build and not reusing the remote cache, I ran it as:

./gradlew clean build --no-build-cache --console plain

You can skip the test phase, either as build -x test or compileJava. You only need a compile phase for code generation and can optionally run tests or defer them to the CI. The GitHub build action uses a matrix to parallelize the tests as multiple jobs, and TestNG is set to parallelize the methods. If running locally and Gradle's task parallelization is causing excessive load by forking JVMs, you could use --no-parallel. The Gradle daemon and tasks all set their memory requirements, which means you don't need to adjust your local settings and it should work fine. In the worst case, you might see timeout failures if the system is under too much load from other test tasks.

The build runs using Java 11 by default and will install that toolchain automatically. If you want to run using a later JDK then set JAVA_VERSION prior to executing, e.g. JAVA_VERSION=21 gradlew .... The CI builds and benchmarks using multiple JDKs, but only runs the test suite against 11. If switching to a newer JDK then beware that implementation changes could cause test failures to investigate.

In general you shouldn't need to run the test suite to work with this repository. Just compile, import, and run specific test methods if making changes. Then push to a branch and let the CI crunch. If you make a mistake then the tests will fail quickly with enough information to debug it, such as the cache configuration used. Given this project's complexity, an exhaustive test suite was necessary to avoid regressions.

chayan-blip commented 1 year ago

Thanks for the detailed reply. Actually, I am trying to learn the codebase by stepping through it with a debugger. However, when I run a test in IntelliJ using the green debug button with a breakpoint set, the UI asks me to select a task. When I choose any task other than "test", it reports that no tests were found. When I choose "test", it runs the whole suite. I have been trying for the last few weeks to debug a single test. Once I am able to run and debug any test, that would help me learn the whole test suite and hopefully contribute. Thanks!

ben-manes commented 1 year ago

Oh, hmm. I use Eclipse, which does not require running tests through Gradle. Instead it runs them directly, which lets me run the specific method with a breakpoint. That may run about a thousand tests if left unblocked, rather than forcing the millions that cause memory issues. Is there a way to make IntelliJ less naive?

In gradle you can run specific tests, e.g.

gradlew caffeine:strongKeysAndStrongValuesSyncCaffeineTest --tests 'CacheTest.estimatedSize'

ben-manes commented 1 year ago

It looks like you can configure it to run in the IDE as below. Then when the test runs it won't ask for a task and you can debug it normally. For me it runs some tests and then fails with a generated class not found, even though it is indexed in the IDE, so I am unsure why it is excluding it from the classpath (it seems to exclude anything in the build/generated-sources dir). I think someone who knows IntelliJ better might be able to improve our build configuration, as I had originally only done a sanity test of build / compile / static analysis.

[Screenshots: Screen Shot 2023-10-30 at 4 12 30 PM, Screen Shot 2023-10-30 at 4 13 08 PM]

ben-manes commented 1 year ago

I tried adding the generated code as source folders, but IDEA doesn’t allow them onto the classpath. I tried a few variations and it seems very opinionated, forcing broken behavior. Unfortunately I’m not the right person to debug and fix this, so you might need to try an alternative IDE or do some sleuthing.

ben-manes commented 1 year ago

I got a little farther by manually adding the codeGen module as a compile dependency in the test module. It then ran a single test correctly, but I do not know how to configure this in Gradle yet.

[Screenshots: Screen Shot 2023-11-01 at 6 06 08 PM, Screen Shot 2023-11-01 at 6 28 32 PM]

It would need to add this line into .idea/modules/caffeine/caffeine.caffeine.test.iml

<orderEntry type="module" module-name="caffeine.caffeine.codeGen" />

ben-manes commented 1 year ago

@chayan-blip this seemed to fix IntelliJ,

dependencies {
  testImplementation(sourceSets["codeGen"].output)
}

And then setting the Gradle Tool Settings to run tests using IntelliJ works out of the box. However, I don't know how to set that automatically. This generates the following line in .idea/gradle.xml

<option name="testRunner" value="PLATFORM" />

I think we can do that using an extension Idea plugin, like this sample.

The dependency change should be fine (exists for jmh) but I have not tried a fresh Eclipse import, etc. to verify. However that should unblock you and maybe we can get this cleaned up further for a good first experience.

chayan-blip commented 1 year ago

Thanks a lot for the effort! I am a hardware engineer working with Verilog who is interested in software too. I was wondering whether it would be possible to create an onboarding document to bring people up to speed, with links to resources that people like me, who want to contribute quality code, can use to learn. I know there is a plethora of online resources, but a list of recommended ones would be very helpful. Something like this:

  1. Java - ~25 hours
  2. Build tools - ~2 hours
  3. Basics of caching - ~2 hours
  4. Testing and debugging - ~5 hours
  5. Multithreading concepts + Caffeine detailed docs - ~10 hours

Something along these lines. I can also begin by contributing some documentation, which is generally much needed in open source. Although the code is beautifully commented, I was hoping to contribute some diagrams and summaries of the documents you have linked, as well as an overview of the general architecture and organization of the codebase.

Looking forward to collaborating. Chayan

ben-manes commented 1 year ago

That's great. I earned a hardware degree (B.S./M.S.), but went directly into software.

I think that I have an initial import configured correctly now, but would like to spend some more time polishing it in case there are more settings that I can import (like code styles). You can try out the v3.dev branch (commit). I was able to take a fresh import and run a single test.

Documentation is always hard, especially finding a balance, as users don't want to wade through too much. For a new dev it is sparse, with the Contribute and Design wiki pages, and then articles / papers / javadoc on more algorithmic details. Any help here to make it friendlier is always appreciated.

ben-manes commented 12 months ago

The IntelliJ import works pretty well now so you shouldn’t have a problem anymore. There is a very odd bug when running applications or subproject tasks like the simulator, but running via the root project is somehow fine. Since you probably won’t need to do that, you should be all set.

chayan-blip commented 12 months ago

Hi Ben, I will surely go through the setup. Here is a one-page PDF (Cache basics.pdf) covering the very basics of how a cache works. We can modify and append to it as required. I am sharing a PDF for now as it's a safe and universal format. I will start committing to the wiki once I become a little more familiar with the software side.

Thanks