opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

[BUG] Reproducible test failure .RemoteFSTranslogTests.testConcurrentWriteViewsAndSnapshot #5823

Closed: mch2 closed this issue 9 months ago

mch2 commented 1 year ago

Caught this seed while running local checks against the 2.5 branch. The seed fails 100% of the time for me.

./gradlew ':server:test' --tests "org.opensearch.index.translog.RemoteFSTranslogTests.testConcurrentWriteViewsAndSnapshot" -Dtests.seed=7ED21C571F7C7EBE 
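
For readers skimming the trace below: the "Too many open files" errors are raised from `HandleLimitFS.onOpen` in the Lucene test framework (see the stack frames), so they come from the test's mock file system and not from the OS limit. The following is a minimal sketch of that kind of check, assuming a plain counter with a fixed cap. `HandleLimitSketch`, its methods, and the limit of 8 are hypothetical names and values chosen for illustration; this is not Lucene's implementation and it does not identify which code path in the test holds the handles.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileSystemException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a handle-limiting mock file system (not Lucene's HandleLimitFS). */
final class HandleLimitSketch {
    private final int limit;
    private final AtomicInteger open = new AtomicInteger();

    HandleLimitSketch(int limit) {
        this.limit = limit;
    }

    /** Records one open handle; fails when more than {@code limit} handles are open at once. */
    Closeable open(String path) throws FileSystemException {
        if (open.incrementAndGet() > limit) {
            open.decrementAndGet(); // the failed open does not count as a live handle
            throw new FileSystemException(path, null, "Too many open files");
        }
        return open::decrementAndGet; // closing the handle releases its slot
    }

    int openCount() {
        return open.get();
    }

    /** Opens handles and never closes them; returns how many opens succeeded before failure. */
    static int leakUntilFailure(int limit) {
        HandleLimitSketch fs = new HandleLimitSketch(limit);
        int opened = 0;
        try {
            while (true) {
                fs.open("translog-" + opened + ".tlog"); // returned handle is dropped, never closed
                opened++;
            }
        } catch (FileSystemException e) {
            return opened;
        }
    }

    /** Opens and closes each handle in turn; returns the number of handles still open at the end. */
    static int openAndClose(int limit, int files) {
        HandleLimitSketch fs = new HandleLimitSketch(limit);
        try {
            for (int i = 0; i < files; i++) {
                try (Closeable handle = fs.open("translog-" + i + ".tlog")) {
                    // a real caller would read or upload the file here
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return fs.openCount();
    }

    public static void main(String[] args) {
        System.out.println(leakUntilFailure(8));   // prints 8
        System.out.println(openAndClose(8, 1000)); // prints 0
    }
}
```

With a cap like this, the total number of files touched does not matter; only the number of handles open at the same moment does, which is why the log can reach generation 682 before the first failure appears.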

Trace:

~/workspace/OpenSearch (2.5)$ ./gradlew ':server:test' --tests "org.opensearch.index.translog.RemoteFSTranslogTests.testConcurrentWriteViewsAndSnapshot" -Dtests.seed=7ED21C571F7C7EBE 
Picked up JAVA_TOOL_OPTIONS: -Dlog4j2.formatMsgNoLookups=true
Starting a Gradle Daemon (subsequent builds will be faster)

> Configure project :qa:os
Cannot add task 'destructiveDistroTest.docker' as a task with that name already exists.
=======================================
OpenSearch Build Hamster says Hello!
  Gradle Version        : 7.6
  OS Info               : Mac OS X 12.6.1 (x86_64)
  JDK Version           : 17 (OpenJDK)
  JAVA_HOME             : /Users/handalm/.sdkman/candidates/java/17.0.2-open
  Random Testing Seed   : 7ED21C571F7C7EBE
  In FIPS 140 mode      : false
=======================================

> Task :server:compileJava
Picked up JAVA_TOOL_OPTIONS: -Dlog4j2.formatMsgNoLookups=true
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

> Task :test:framework:compileJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

> Task :server:compileTestJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

> Task :server:test
Picked up JAVA_TOOL_OPTIONS: -Dlog4j2.formatMsgNoLookups=true
OpenJDK 64-Bit Server VM warning: Ignoring option --illegal-access=warn; support was removed in 17.0
WARNING: A terminally deprecated method in java.lang.System has been called
WARNING: System::setSecurityManager has been called by org.opensearch.bootstrap.BootstrapForTesting (file:/Users/handalm/workspace/OpenSearch/test/framework/build/distributions/framework-2.5.0-SNAPSHOT.jar)
WARNING: Please consider reporting this to the maintainers of org.opensearch.bootstrap.BootstrapForTesting
WARNING: System::setSecurityManager will be removed in a future release
WARNING: A terminally deprecated method in java.lang.System has been called
WARNING: System::setSecurityManager has been called by org.gradle.api.internal.tasks.testing.worker.TestWorker (file:/Users/handalm/.gradle/wrapper/dists/gradle-7.6-all/9f832ih6bniajn45pbmqhk2cw/gradle-7.6/lib/plugins/gradle-testing-base-7.6.jar)
WARNING: Please consider reporting this to the maintainers of org.gradle.api.internal.tasks.testing.worker.TestWorker
WARNING: System::setSecurityManager will be removed in a future release

REPRODUCE WITH: ./gradlew ':server:test' --tests "org.opensearch.index.translog.RemoteFSTranslogTests.testConcurrentWriteViewsAndSnapshot" -Dtests.seed=7ED21C571F7C7EBE -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=zh-Hant-HK -Dtests.timezone=Asia/Manila -Druntime.java=17

org.opensearch.index.translog.RemoteFSTranslogTests > testConcurrentWriteViewsAndSnapshot FAILED
    java.io.IOException: Failed to upload 2 files during transfer
        at __randomizedtesting.SeedInfo.seed([7ED21C571F7C7EBE:5A0BE57AC4EB8D3D]:0)
        at org.opensearch.index.translog.transfer.TranslogTransferManager.transferSnapshot(TranslogTransferManager.java:121)
        at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:212)
        at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:195)
        at org.opensearch.index.translog.RemoteFsTranslog.ensureSynced(RemoteFsTranslog.java:145)
        at org.opensearch.index.translog.RemoteFSTranslogTests$2.doRun(RemoteFSTranslogTests.java:708)

Suite: Test class org.opensearch.index.translog.RemoteFSTranslogTests
  1> [2023-01-11T11:59:59,200][INFO ][o.o.i.t.RemoteFSTranslogTests] [testConcurrentWriteViewsAndSnapshot] before test
  1> [2023-01-11T12:00:01,277][INFO ][o.o.i.t.RemoteFSTranslogTests] [testConcurrentWriteViewsAndSnapshot] using [3] readers. [1] writers. flushing every ~[98] ops.
  1> [2023-01-11T12:00:01,285][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:01,285][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:01,285][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:03,415][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:03,415][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:03,415][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:07,612][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:07,769][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:07,844][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:14,273][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:14,503][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:14,726][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:23,690][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:23,690][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:23,690][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:35,113][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:35,113][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:35,516][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:49,118][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:49,118][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:49,560][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:05,716][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:05,716][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:06,227][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:24,912][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:26,053][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:26,053][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:47,606][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:48,884][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:49,606][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:02:00,603][ERROR][o.o.i.t.t.BlobStoreTransferService] [org.opensearch.index.translog.RemoteFSTranslogTests] Failed to upload blob translog-682.ckp
  1> java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.ckp-z1G2WGflQlKpQFIIIILBew: Too many open files
  1>    at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at java.nio.file.Files.newOutputStream(Files.java:228) ~[?:?]
  1>    at org.opensearch.common.blobstore.fs.FsBlobContainer.writeToPath(FsBlobContainer.java:228) ~[main/:?]
  1>    at org.opensearch.common.blobstore.fs.FsBlobContainer.writeBlobAtomic(FsBlobContainer.java:213) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$ThrowingBlobContainer.writeBlobAtomic(RemoteFSTranslogTests.java:1253) ~[test/:?]
  1>    at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:54) [main/:?]
  1>    at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:88) [main/:?]
  1>    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) [main/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1> [2023-01-11T12:02:00,603][ERROR][o.o.i.t.t.BlobStoreTransferService] [org.opensearch.index.translog.RemoteFSTranslogTests] Failed to upload blob translog-682.tlog
  1> java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.tlog-8rOYLjgzRcaU_rJcefXo0Q: Too many open files
  1>    at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at java.nio.file.Files.newOutputStream(Files.java:228) ~[?:?]
  1>    at org.opensearch.common.blobstore.fs.FsBlobContainer.writeToPath(FsBlobContainer.java:228) ~[main/:?]
  1>    at org.opensearch.common.blobstore.fs.FsBlobContainer.writeBlobAtomic(FsBlobContainer.java:213) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$ThrowingBlobContainer.writeBlobAtomic(RemoteFSTranslogTests.java:1253) ~[test/:?]
  1>    at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:54) [main/:?]
  1>    at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:88) [main/:?]
  1>    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) [main/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1> [2023-01-11T12:02:00,620][ERROR][o.o.i.t.t.TranslogTransferManager] [org.opensearch.index.translog.RemoteFSTranslogTests] Exception during transfer for file translog-682.tlog
  1> org.opensearch.index.translog.transfer.FileTransferException: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.tlog-8rOYLjgzRcaU_rJcefXo0Q: Too many open files
  1>    at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:58) [main/:?]
  1>    at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:88) [main/:?]
  1>    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) [main/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1> Caused by: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.tlog-8rOYLjgzRcaU_rJcefXo0Q: Too many open files
  1>    at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at java.nio.file.Files.newOutputStream(Files.java:228) ~[?:?]
  1>    at org.opensearch.common.blobstore.fs.FsBlobContainer.writeToPath(FsBlobContainer.java:228) ~[main/:?]
  1>    at org.opensearch.common.blobstore.fs.FsBlobContainer.writeBlobAtomic(FsBlobContainer.java:213) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$ThrowingBlobContainer.writeBlobAtomic(RemoteFSTranslogTests.java:1253) ~[test/:?]
  1>    at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:54) ~[main/:?]
  1>    ... 6 more
  1> [2023-01-11T12:02:00,620][ERROR][o.o.i.t.t.TranslogTransferManager] [org.opensearch.index.translog.RemoteFSTranslogTests] Exception during transfer for file translog-682.ckp
  1> org.opensearch.index.translog.transfer.FileTransferException: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.ckp-z1G2WGflQlKpQFIIIILBew: Too many open files
  1>    at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:58) [main/:?]
  1>    at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:88) [main/:?]
  1>    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) [main/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1> Caused by: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.ckp-z1G2WGflQlKpQFIIIILBew: Too many open files
  1>    at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at java.nio.file.Files.newOutputStream(Files.java:228) ~[?:?]
  1>    at org.opensearch.common.blobstore.fs.FsBlobContainer.writeToPath(FsBlobContainer.java:228) ~[main/:?]
  1>    at org.opensearch.common.blobstore.fs.FsBlobContainer.writeBlobAtomic(FsBlobContainer.java:213) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$ThrowingBlobContainer.writeBlobAtomic(RemoteFSTranslogTests.java:1253) ~[test/:?]
  1>    at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:54) ~[main/:?]
  1>    ... 6 more
  1> [2023-01-11T12:02:00,629][ERROR][o.o.i.t.t.TranslogTransferManager] [[writer_0]] Transfer failed for snapshot TranslogTransferSnapshot [ primary term = 211153147, generation = 682 ]
  1> java.io.IOException: Failed to upload 2 files during transfer
  1>    at org.opensearch.index.translog.transfer.TranslogTransferManager.transferSnapshot(TranslogTransferManager.java:121) [main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:212) [main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:195) [main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.ensureSynced(RemoteFsTranslog.java:145) [main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$2.doRun(RemoteFSTranslogTests.java:708) [test/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1>    Suppressed: org.opensearch.index.translog.transfer.FileTransferException: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.tlog-8rOYLjgzRcaU_rJcefXo0Q: Too many open files
  1>            at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:58) ~[main/:?]
  1>            at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:88) ~[main/:?]
  1>            at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) ~[main/:?]
  1>            at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
  1>            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
  1>            at java.lang.Thread.run(Thread.java:833) [?:?]
  1>    Caused by: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.tlog-8rOYLjgzRcaU_rJcefXo0Q: Too many open files
  1>            at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at java.nio.file.Files.newOutputStream(Files.java:228) ~[?:?]
  1>            at org.opensearch.common.blobstore.fs.FsBlobContainer.writeToPath(FsBlobContainer.java:228) ~[main/:?]
  1>            at org.opensearch.common.blobstore.fs.FsBlobContainer.writeBlobAtomic(FsBlobContainer.java:213) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFSTranslogTests$ThrowingBlobContainer.writeBlobAtomic(RemoteFSTranslogTests.java:1253) ~[test/:?]
  1>            at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:54) ~[main/:?]
  1>            ... 6 more
  1>    Suppressed: org.opensearch.index.translog.transfer.FileTransferException: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.ckp-z1G2WGflQlKpQFIIIILBew: Too many open files
  1>            at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:58) ~[main/:?]
  1>            at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:88) ~[main/:?]
  1>            at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) ~[main/:?]
  1>            at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
  1>            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
  1>            at java.lang.Thread.run(Thread.java:833) [?:?]
  1>    Caused by: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.ckp-z1G2WGflQlKpQFIIIILBew: Too many open files
  1>            at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at java.nio.file.Files.newOutputStream(Files.java:228) ~[?:?]
  1>            at org.opensearch.common.blobstore.fs.FsBlobContainer.writeToPath(FsBlobContainer.java:228) ~[main/:?]
  1>            at org.opensearch.common.blobstore.fs.FsBlobContainer.writeBlobAtomic(FsBlobContainer.java:213) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFSTranslogTests$ThrowingBlobContainer.writeBlobAtomic(RemoteFSTranslogTests.java:1253) ~[test/:?]
  1>            at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:54) ~[main/:?]
  1>            ... 6 more
  1> [2023-01-11T12:02:00,646][ERROR][o.o.i.t.RemoteFSTranslogTests] [[writer_0]] --> writer [writer_0] had an error
  1> java.io.IOException: Failed to upload 2 files during transfer
  1>    at org.opensearch.index.translog.transfer.TranslogTransferManager.transferSnapshot(TranslogTransferManager.java:121) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:212) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:195) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.ensureSynced(RemoteFsTranslog.java:145) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$2.doRun(RemoteFSTranslogTests.java:708) [test/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1>    Suppressed: org.opensearch.index.translog.transfer.FileTransferException: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.tlog-8rOYLjgzRcaU_rJcefXo0Q: Too many open files
  1>            at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:58) ~[main/:?]
  1>            at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:88) ~[main/:?]
  1>            at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) ~[main/:?]
  1>            at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
  1>            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
  1>            at java.lang.Thread.run(Thread.java:833) [?:?]
  1>    Caused by: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.tlog-8rOYLjgzRcaU_rJcefXo0Q: Too many open files
  1>            at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at java.nio.file.Files.newOutputStream(Files.java:228) ~[?:?]
  1>            at org.opensearch.common.blobstore.fs.FsBlobContainer.writeToPath(FsBlobContainer.java:228) ~[main/:?]
  1>            at org.opensearch.common.blobstore.fs.FsBlobContainer.writeBlobAtomic(FsBlobContainer.java:213) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFSTranslogTests$ThrowingBlobContainer.writeBlobAtomic(RemoteFSTranslogTests.java:1253) ~[test/:?]
  1>            at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:54) ~[main/:?]
  1>            ... 6 more
  1>    Suppressed: org.opensearch.index.translog.transfer.FileTransferException: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.ckp-z1G2WGflQlKpQFIIIILBew: Too many open files
  1>            at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:58) ~[main/:?]
  1>            at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:88) ~[main/:?]
  1>            at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) ~[main/:?]
  1>            at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
  1>            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
  1>            at java.lang.Thread.run(Thread.java:833) [?:?]
  1>    Caused by: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.ckp-z1G2WGflQlKpQFIIIILBew: Too many open files
  1>            at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at java.nio.file.Files.newOutputStream(Files.java:228) ~[?:?]
  1>            at org.opensearch.common.blobstore.fs.FsBlobContainer.writeToPath(FsBlobContainer.java:228) ~[main/:?]
  1>            at org.opensearch.common.blobstore.fs.FsBlobContainer.writeBlobAtomic(FsBlobContainer.java:213) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFSTranslogTests$ThrowingBlobContainer.writeBlobAtomic(RemoteFSTranslogTests.java:1253) ~[test/:?]
  1>            at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:54) ~[main/:?]
  1>            ... 6 more
  1> [2023-01-11T12:02:01,025][ERROR][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> reader [reader_2] had an error
  1> java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-002/translog-2.tlog: Too many open files
  1>    at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.newFileChannel(HandleTrackingFS.java:202) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at java.nio.channels.FileChannel.open(FileChannel.java:298) ~[?:?]
  1>    at java.nio.channels.FileChannel.open(FileChannel.java:357) ~[?:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot.<init>(FileSnapshot.java:45) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot$TransferFileSnapshot.<init>(FileSnapshot.java:116) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot$TranslogFileSnapshot.<init>(FileSnapshot.java:156) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.TranslogCheckpointTransferSnapshot$Builder.build(TranslogCheckpointTransferSnapshot.java:147) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:209) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:192) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.sync(RemoteFsTranslog.java:255) ~[main/:?]
  1>    at org.opensearch.index.translog.Translog.trimUnreferencedReaders(Translog.java:1748) ~[main/:?]
  1>    at org.opensearch.index.translog.Translog.lambda$acquireTranslogGenFromDeletionPolicy$12(Translog.java:735) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$3.closeRetentionLock(RemoteFSTranslogTests.java:762) ~[test/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$3.doRun(RemoteFSTranslogTests.java:816) [test/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1> [2023-01-11T12:02:01,025][ERROR][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> reader [reader_1] had an error
  1> org.apache.lucene.store.AlreadyClosedException: translog [684] is already closed (path [/Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-002/translog-684.tlog]
  1>    at org.opensearch.index.translog.TranslogWriter.closeIntoReader(TranslogWriter.java:439) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:167) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.sync(RemoteFsTranslog.java:255) ~[main/:?]
  1>    at org.opensearch.index.translog.Translog.trimUnreferencedReaders(Translog.java:1748) ~[main/:?]
  1>    at org.opensearch.index.translog.Translog.lambda$acquireTranslogGenFromDeletionPolicy$12(Translog.java:735) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$3.closeRetentionLock(RemoteFSTranslogTests.java:762) ~[test/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$3.doRun(RemoteFSTranslogTests.java:816) [test/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1> Caused by: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-002/translog-683.ckp: Too many open files
  1>    at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.newFileChannel(HandleTrackingFS.java:202) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at java.nio.channels.FileChannel.open(FileChannel.java:298) ~[?:?]
  1>    at java.nio.channels.FileChannel.open(FileChannel.java:357) ~[?:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot.<init>(FileSnapshot.java:45) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot$TransferFileSnapshot.<init>(FileSnapshot.java:116) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot$CheckpointFileSnapshot.<init>(FileSnapshot.java:193) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.TranslogCheckpointTransferSnapshot$Builder.build(TranslogCheckpointTransferSnapshot.java:147) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:209) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:190) ~[main/:?]
  1>    ... 7 more
  1>    Suppressed: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-002/translog-2.tlog: Too many open files
  1>            at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.newFileChannel(HandleTrackingFS.java:202) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at java.nio.channels.FileChannel.open(FileChannel.java:298) ~[?:?]
  1>            at java.nio.channels.FileChannel.open(FileChannel.java:357) ~[?:?]
  1>            at org.opensearch.index.translog.transfer.FileSnapshot.<init>(FileSnapshot.java:45) ~[main/:?]
  1>            at org.opensearch.index.translog.transfer.FileSnapshot$TransferFileSnapshot.<init>(FileSnapshot.java:116) ~[main/:?]
  1>            at org.opensearch.index.translog.transfer.FileSnapshot$TranslogFileSnapshot.<init>(FileSnapshot.java:156) ~[main/:?]
  1>            at org.opensearch.index.translog.transfer.TranslogCheckpointTransferSnapshot$Builder.build(TranslogCheckpointTransferSnapshot.java:147) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:209) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:192) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFsTranslog.sync(RemoteFsTranslog.java:255) ~[main/:?]
  1>            at org.opensearch.index.translog.Translog.trimUnreferencedReaders(Translog.java:1748) ~[main/:?]
  1>            at org.opensearch.index.translog.Translog.lambda$acquireTranslogGenFromDeletionPolicy$12(Translog.java:735) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFSTranslogTests$3.closeRetentionLock(RemoteFSTranslogTests.java:762) ~[test/:?]
  1>            at org.opensearch.index.translog.RemoteFSTranslogTests$3.doRun(RemoteFSTranslogTests.java:816) [test/:?]
  1>            at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>            at java.lang.Thread.run(Thread.java:833) [?:?]
  1> [2023-01-11T12:02:01,036][ERROR][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> reader [reader_0] had an error
  1> java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-002/translog-683.ckp: Too many open files
  1>    at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.newFileChannel(HandleTrackingFS.java:202) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at java.nio.channels.FileChannel.open(FileChannel.java:298) ~[?:?]
  1>    at java.nio.channels.FileChannel.open(FileChannel.java:357) ~[?:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot.<init>(FileSnapshot.java:45) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot$TransferFileSnapshot.<init>(FileSnapshot.java:116) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot$CheckpointFileSnapshot.<init>(FileSnapshot.java:193) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.TranslogCheckpointTransferSnapshot$Builder.build(TranslogCheckpointTransferSnapshot.java:147) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:209) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:190) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.sync(RemoteFsTranslog.java:255) ~[main/:?]
  1>    at org.opensearch.index.translog.Translog.trimUnreferencedReaders(Translog.java:1748) ~[main/:?]
  1>    at org.opensearch.index.translog.Translog.lambda$acquireTranslogGenFromDeletionPolicy$12(Translog.java:735) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$3.closeRetentionLock(RemoteFSTranslogTests.java:762) ~[test/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$3.doRun(RemoteFSTranslogTests.java:816) [test/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1>    Suppressed: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-002/translog-2.tlog: Too many open files
  1>            at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.newFileChannel(HandleTrackingFS.java:202) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at java.nio.channels.FileChannel.open(FileChannel.java:298) ~[?:?]
  1>            at java.nio.channels.FileChannel.open(FileChannel.java:357) ~[?:?]
  1>            at org.opensearch.index.translog.transfer.FileSnapshot.<init>(FileSnapshot.java:45) ~[main/:?]
  1>            at org.opensearch.index.translog.transfer.FileSnapshot$TransferFileSnapshot.<init>(FileSnapshot.java:116) ~[main/:?]
  1>            at org.opensearch.index.translog.transfer.FileSnapshot$TranslogFileSnapshot.<init>(FileSnapshot.java:156) ~[main/:?]
  1>            at org.opensearch.index.translog.transfer.TranslogCheckpointTransferSnapshot$Builder.build(TranslogCheckpointTransferSnapshot.java:147) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:209) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:192) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFsTranslog.sync(RemoteFsTranslog.java:255) ~[main/:?]
  1>            at org.opensearch.index.translog.Translog.trimUnreferencedReaders(Translog.java:1748) ~[main/:?]
  1>            at org.opensearch.index.translog.Translog.lambda$acquireTranslogGenFromDeletionPolicy$12(Translog.java:735) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFSTranslogTests$3.closeRetentionLock(RemoteFSTranslogTests.java:762) ~[test/:?]
  1>            at org.opensearch.index.translog.RemoteFSTranslogTests$3.doRun(RemoteFSTranslogTests.java:816) [test/:?]
  1>            at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>            at java.lang.Thread.run(Thread.java:833) [?:?]
  1> [2023-01-11T12:02:01,066][INFO ][o.o.i.t.RemoteFSTranslogTests] [testConcurrentWriteViewsAndSnapshot] after test
  2> REPRODUCE WITH: ./gradlew ':server:test' --tests "org.opensearch.index.translog.RemoteFSTranslogTests.testConcurrentWriteViewsAndSnapshot" -Dtests.seed=7ED21C571F7C7EBE -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=zh-Hant-HK -Dtests.timezone=Asia/Manila -Druntime.java=17
  2> java.io.IOException: Failed to upload 2 files during transfer
        at __randomizedtesting.SeedInfo.seed([7ED21C571F7C7EBE:5A0BE57AC4EB8D3D]:0)
        at org.opensearch.index.translog.transfer.TranslogTransferManager.transferSnapshot(TranslogTransferManager.java:121)
        at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:212)
        at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:195)
        at org.opensearch.index.translog.RemoteFsTranslog.ensureSynced(RemoteFsTranslog.java:145)
        at org.opensearch.index.translog.RemoteFSTranslogTests$2.doRun(RemoteFSTranslogTests.java:708)
  2> NOTE: leaving temporary files on disk at: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001
  2> NOTE: test params are: codec=Lucene94, sim=Asserting(RandomSimilarity(queryNorm=true): {}), locale=zh-Hant-HK, timezone=Asia/Manila
  2> NOTE: Mac OS X 12.6.1 x86_64/Eclipse Adoptium 17.0.5 (64-bit)/cpus=8,threads=1,free=255491192,total=536870912
  2> NOTE: All tests run in this JVM: [RemoteFSTranslogTests]

Tests with failures:
 - org.opensearch.index.translog.RemoteFSTranslogTests.testConcurrentWriteViewsAndSnapshot

1 test completed, 1 failed

> Task :server:test FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':server:test'.
> There were failing tests. See the report at: file:///Users/handalm/workspace/OpenSearch/server/build/reports/tests/test/index.html

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.

* Get more help at https://help.gradle.org

BUILD FAILED in 4m 31s

mch2 commented 1 year ago

@sachinpkale FYI

sachinpkale commented 1 year ago

Taking a look

sachinpkale commented 1 year ago

The fix is merged to main: https://github.com/opensearch-project/OpenSearch/pull/5789. It still needs to be backported to 2.x and 2.5.

sachinpkale commented 1 year ago

Backport PRs:

https://github.com/opensearch-project/OpenSearch/pull/5828
https://github.com/opensearch-project/OpenSearch/pull/5829

sachinpkale commented 1 year ago

The fix is merged and backported.

nknize commented 1 year ago

Heads up: I ran into a different error when running :server:test locally before opening #8826. It does not reproduce on its own. Maybe someone knows the cause and whether we should reopen this issue:

./gradlew ':server:test' --tests "org.opensearch.index.translog.RemoteFSTranslogTests.testConcurrentWriteViewsAndSnapshot" -Dtests.seed=11209ED8456F2E6A -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=sr-ME -Dtests.timezone=Pacific/Johnston -Druntime.java=20
  2> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=3157, name=writer_0, state=RUNNABLE, group=TGRP-RemoteFSTranslogTests]
        at __randomizedtesting.SeedInfo.seed([F8C495824172C1FF:DC1D6CAF9AE5327C]:0)

        Caused by:
        java.lang.AssertionError: [index][1] Expected non-empty readers
            at __randomizedtesting.SeedInfo.seed([F8C495824172C1FF]:0)
            at org.opensearch.index.translog.RemoteFsTranslog.deleteStaleRemotePrimaryTerms(RemoteFsTranslog.java:430)
            at org.opensearch.index.translog.RemoteFsTranslog.trimUnreferencedReaders(RemoteFsTranslog.java:400)
            at org.opensearch.index.translog.RemoteFSTranslogTests$2.doRun(RemoteFSTranslogTests.java:821)
  1> [2023-07-21T22:25:10,111][INFO ][o.o.i.t.RemoteFSTranslogTests] [testReadLocation] before test
  1> [2023-07-21T22:25:10,132][INFO ][o.o.i.t.RemoteFSTranslogTests] [testReadLocation] after test
  1> [2023-07-21T22:25:10,139][INFO ][o.o.i.t.RemoteFSTranslogTests] [testUploadWithPrimaryModeTrue] before test
  1> [2023-07-21T22:25:10,155][INFO ][o.o.i.t.RemoteFSTranslogTests] [testUploadWithPrimaryModeTrue] after test
  1> [2023-07-21T22:25:10,162][INFO ][o.o.i.t.RemoteFSTranslogTests] [testTranslogWriterFsyncDisabledInRemoteFsTranslog] before test
  1> [2023-07-21T22:25:10,190][INFO ][o.o.i.t.RemoteFSTranslogTests] [testTranslogWriterFsyncDisabledInRemoteFsTranslog] after test
  1> [2023-07-21T22:25:10,197][INFO ][o.o.i.t.RemoteFSTranslogTests] [testConcurrentWritesWithVaryingSize] before test
  1> [2023-07-21T22:25:10,207][INFO ][o.o.i.t.RemoteFSTranslogTests] [testConcurrentWritesWithVaryingSize] testing with [7] threads, each doing [14] ops
  1> [2023-07-21T22:25:10,440][INFO ][o.o.i.t.RemoteFSTranslogTests] [testConcurrentWritesWithVaryingSize] after test
  1> [2023-07-21T22:25:10,451][INFO ][o.o.i.t.RemoteFSTranslogTests] [testTranslogWriterCanFlushInAddOrReadCall] before test
  1> [2023-07-21T22:25:10,476][INFO ][o.o.i.t.RemoteFSTranslogTests] [testTranslogWriterCanFlushInAddOrReadCall] after test
  1> [2023-07-21T22:25:10,482][INFO ][o.o.i.t.RemoteFSTranslogTests] [testRangeSnapshot] before test
  1> [2023-07-21T22:25:10,532][INFO ][o.o.i.t.RemoteFSTranslogTests] [testRangeSnapshot] after test
  1> [2023-07-21T22:25:10,538][INFO ][o.o.i.t.RemoteFSTranslogTests] [testSimpleOperationsUpload] before test
  1> [2023-07-21T22:25:10,561][INFO ][o.o.i.t.RemoteFSTranslogTests] [testSimpleOperationsUpload] All md files [9223372035702745179__9223372036854775805__9223370346876465252__1, 9223372035702745179__9223372036854775804__9223370346876465248__1]
  1> [2023-07-21T22:25:10,563][INFO ][o.o.i.t.RemoteFSTranslogTests] [testSimpleOperationsUpload] All data files [translog-3.ckp, translog-1.tlog, translog-2.ckp, translog-3.tlog, translog-1.ckp, translog-2.tlog]
  1> [2023-07-21T22:25:10,563][ERROR][o.o.i.t.RemoteFSTranslogTests] [testSimpleOperationsUpload] Asserting content of 2
  1> [2023-07-21T22:25:10,564][ERROR][o.o.i.t.RemoteFSTranslogTests] [testSimpleOperationsUpload] Asserting content of 3
  1> [2023-07-21T22:25:10,567][INFO ][o.o.i.t.t.TranslogTransferManager] [testSimpleOperationsUpload] [index][1] Deleting primary terms from remote store lesser than 1152030628
  1> [2023-07-21T22:25:10,578][INFO ][o.o.i.t.RemoteFSTranslogTests] [testSimpleOperationsUpload] after test
  1> [2023-07-21T22:25:10,590][INFO ][o.o.i.t.RemoteFSTranslogTests] [testSyncUpToStream] before test
  1> [2023-07-21T22:25:10,669][INFO ][o.o.i.t.RemoteFSTranslogTests] [testSyncUpToStream] after test
  1> [2023-07-21T22:25:10,675][INFO ][o.o.i.t.RemoteFSTranslogTests] [testCloseIntoReader] before test
  1> [2023-07-21T22:25:10,697][INFO ][o.o.i.t.RemoteFSTranslogTests] [testCloseIntoReader] after test
  1> [2023-07-21T22:25:10,708][INFO ][o.o.i.t.RemoteFSTranslogTests] [testMetadataFileDeletion] before test
  1> [2023-07-21T22:25:10,745][INFO ][o.o.i.t.t.TranslogTransferManager] [testMetadataFileDeletion] [index][1] Deleting primary terms from remote store lesser than 1979622222
  1> [2023-07-21T22:25:10,770][INFO ][o.o.i.t.RemoteFSTranslogTests] [testMetadataFileDeletion] numDocs=7 moreDocs=4
  1> [2023-07-21T22:25:10,813][INFO ][o.o.i.t.t.TranslogTransferManager] [testMetadataFileDeletion] [index][1] Downloading translog files with: Primary Term = 1979622222, Generation = 13, Location = /opt/dev/opensearch-project/opensearch/.worktrees/enhance/mediaTypeParserRegistry/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_F8C495824172C1FF-001/tempDir-044
  1> [2023-07-21T22:25:10,814][INFO ][o.o.i.t.t.TranslogTransferManager] [testMetadataFileDeletion] [index][1] Downloading translog files with: Primary Term = 1979622222, Generation = 12, Location = /opt/dev/opensearch-project/opensearch/.worktrees/enhance/mediaTypeParserRegistry/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_F8C495824172C1FF-001/tempDir-044
  1> [2023-07-21T22:25:10,831][INFO ][o.o.i.t.t.TranslogTransferManager] [testMetadataFileDeletion] [index][1] Deleting primary terms from remote store lesser than 1979622223
  1> [2023-07-21T22:25:10,836][INFO ][o.o.i.t.t.TranslogTransferManager] [org.opensearch.index.translog.RemoteFSTranslogTests] [index][1] Deleted primary term 1979622222
  1> [2023-07-21T22:25:10,863][INFO ][o.o.i.t.RemoteFSTranslogTests] [testMetadataFileDeletion] after test
  1> [2023-07-21T22:25:10,873][INFO ][o.o.i.t.RemoteFSTranslogTests] [testSimpleOperations] before test
  1> [2023-07-21T22:25:10,898][INFO ][o.o.i.t.RemoteFSTranslogTests] [testSimpleOperations] after test
  2> NOTE: test params are: codec=Asserting(Lucene95): {}, docValues:{}, maxPointsInLeafNode=948, maxMBSortInHeap=5.221747612462641, sim=Asserting(RandomSimilarity(queryNorm=true): {}), locale=he-IL, timezone=America/Danmarkshavn
  2> NOTE: Linux 5.17.0-1033-oem amd64/Eclipse Adoptium 20.0.1 (64-bit)/cpus=24,threads=1,free=424241208,total=536870912
  2> NOTE: All tests run in this JVM: [DynamicActionRegistryTests, AddVotingConfigExclusionsRequestTests, DecommissionResponseTests, CancelTasksRequestTests, ClusterGetSettingsResponseTests, SnapshotIndexShardStatusTests, MappingVisitorTests, GetAliasesResponseTests, CreateIndexResponseTests, GetIndexActionTests, ResolveIndexResponseTests, UpdateSettingsRequestSerializationTests, GetIndexTemplatesResponseTests, BulkRequestModifierTests, DeleteResponseTests, TransportMultiGetActionTests, SimulateProcessorResultTests, CreatePitControllerTests, SearchPhaseExecutionExceptionTests, TransportMultiSearchActionTests, RetryableActionTests, TransportWriteActionForIndexingPressureTests, JavaVersionTests, NodeClientHeadersTests, ShardFailedClusterStateTaskExecutorTests, ClusterBootstrapServiceRenamedSettingTests, LeaderCheckerTests, DecommissionControllerTests, ComponentTemplateTests, IndexAbstractionTests, MetadataIndexStateServiceTests, DiscoveryNodeTests, PrimaryTermsTests, AllocationConstraintsTests, DecisionsImpactOnClusterHealthTests, MaxRetryAllocationDeciderTests, RemoteShardsMoveShardsTests, TenShardsOneReplicaRoutingTests, RestoreInProgressAllocationDeciderTests, TaskBatcherTests, RoundingTests, CompositeBytesReferenceTests, AutoCloseableRefCountedTests, GeometryIndexerTests, PointBuilderTests, HeaderWarningTests, MinScoreScorerTests, NetworkUtilsTests, MemorySizeSettingsTests, JavaDateMathParserTests, ByteUtilsTests, ReorganizingLongHashTests, FutureUtilsTests, SizeBlockingQueueTests, JsonVsCborTests, JacksonLocationTests, SettingsBasedSeedHostsProviderTests, ExtensionActionUtilTests, RegisterCustomSettingsTests, PriorityComparatorTests, IndexingPressureServiceTests, ShardIndexingPressureTests, PreConfiguredTokenFilterTests, EngineConfigFactoryTests, RecoverySourcePruneMergePolicyTests, NoOrdinalsStringFieldDataTests, BinaryFieldMapperTests, DocCountFieldMapperTests, FieldAliasMapperValidationTests, GeoShapeFieldTypeTests, KeywordFieldTypeTests, 
NumberFieldTypeTests, SourceFieldMapperTests, CombineIntervalsSourceProviderTests, GeoBoundingBoxQueryBuilderTests, MatchNoneQueryBuilderTests, QueryStringQueryBuilderTests, SpanFirstQueryBuilderTests, WildcardQueryBuilderTests, DeleteByQueryRequestTests, MultiMatchQueryTests, GlobalCheckpointSyncActionTests, RetentionLeasesTests, PrimaryReplicaSyncerTests, ShardUtilsTests, RemoteBufferedOutputDirectoryTests, FileCacheCleanerTests, RemoteFSTranslogTests]

mch2 commented 1 year ago

Hit this again in https://github.com/opensearch-project/OpenSearch/pull/9743#issuecomment-1730186485

sachinpkale commented 9 months ago

Taking a look

sachinpkale commented 9 months ago

Ran the test in my local environment 25K+ times without any failures. Closing.
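
For anyone who wants to repeat this kind of soak run, here is a minimal sketch of a loop that reruns a command until it fails. The thread does not say how the 25K+ runs were actually driven, so the helper name `repeat_until_fail` and its output messages are hypothetical, not part of the OpenSearch tooling.

```shell
# Hypothetical helper (not part of OpenSearch): run a command up to N times
# and stop at the first non-zero exit status.
repeat_until_fail() {
  max_runs="$1"   # first argument: maximum number of runs
  shift           # remaining arguments: the command to execute
  i=1
  while [ "$i" -le "$max_runs" ]; do
    if ! "$@"; then
      echo "failed on run $i"
      return 1
    fi
    i=$((i + 1))
  done
  echo "no failures in $max_runs runs"
}
```

It could be invoked as `repeat_until_fail 100 ./gradlew ':server:test' --tests "org.opensearch.index.translog.RemoteFSTranslogTests.testConcurrentWriteViewsAndSnapshot"`. Leaving out `-Dtests.seed` lets each run pick a fresh random seed, and the failing run's `REPRODUCE WITH` line then gives the seed to pin.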