We recently found an issue where shallow copy snapshots fail for closed indices. Full copy snapshots, however, succeed for the same indices.
Snapshot shard failed
java.nio.file.NoSuchFileException: Metadata file is not present for given primary term 2 and generation 6
at org.opensearch.index.store.RemoteSegmentStoreDirectory.getMetadataFileForCommit(RemoteSegmentStoreDirectory.java:527)
at org.opensearch.index.store.RemoteSegmentStoreDirectory.acquireLock(RemoteSegmentStoreDirectory.java:480)
at org.opensearch.index.shard.IndexShard.acquireLockOnCommitData(IndexShard.java:1655)
at org.opensearch.snapshots.SnapshotShardsService.snapshot(SnapshotShardsService.java:631)
at org.opensearch.snapshots.SnapshotShardsService$1.doRun(SnapshotShardsService.java:393)
at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractPrioritizedRunnable.doRun(ThreadContext.java:979)
at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
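For shallow copy snapshots, we refer to the latest remote store data and acquire a lock on that data. Since the indices are closed, the upload that the snapshot flush should trigger does not happen, so no new data is written to the remote store. As a result, the metadata file for the commit's primary term and generation is missing when the lock is acquired (see the stack trace above), and the snapshot fails.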
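The failure mode described above can be illustrated with a small toy model. This is only a sketch, not the OpenSearch implementation: `ToyRemoteStore`, `ToyShard`, and `shallow_snapshot` are hypothetical names, and the model assumes that remote metadata is keyed by (primary term, generation) and that a closed shard skips the upload during the snapshot flush. The values 2 and 6 are taken from the stack trace.

```python
class ToyRemoteStore:
    """Hypothetical stand-in for a remote segment store."""

    def __init__(self):
        # Set of (primary_term, generation) pairs that have a metadata file.
        self.metadata = set()

    def upload(self, primary_term, generation):
        self.metadata.add((primary_term, generation))

    def acquire_lock(self, primary_term, generation):
        # Mirrors the reported error: no metadata file, so no lock.
        if (primary_term, generation) not in self.metadata:
            raise FileNotFoundError(
                "Metadata file is not present for given primary term "
                f"{primary_term} and generation {generation}"
            )
        return (primary_term, generation)


class ToyShard:
    """Hypothetical shard whose latest local commit is (primary_term, generation)."""

    def __init__(self, primary_term, generation, closed):
        self.primary_term = primary_term
        self.generation = generation
        self.closed = closed

    def flush_for_snapshot(self, store):
        # Assumption: only an open shard uploads its latest commit on flush.
        if not self.closed:
            store.upload(self.primary_term, self.generation)


def shallow_snapshot(shard, store):
    """Flush, then lock the remote data that matches the latest commit."""
    shard.flush_for_snapshot(store)
    return store.acquire_lock(shard.primary_term, shard.generation)


if __name__ == "__main__":
    print(shallow_snapshot(ToyShard(2, 6, closed=False), ToyRemoteStore()))
    try:
        shallow_snapshot(ToyShard(2, 6, closed=True), ToyRemoteStore())
    except FileNotFoundError as exc:
        print("closed index:", exc)
```

In this model a full copy snapshot would not hit the error because it never looks up remote metadata, which is consistent with the observation that full copy snapshots succeed.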
Related component
Storage:Snapshots
To Reproduce
Create a remote store enabled cluster.
Create indices and close them.
Register a snapshot repository and enable shallow copy snapshots, or use the system repository created during cluster creation.
Trigger a snapshot. It fails with the exception shown above.
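The steps above can be sketched as a sequence of REST calls. This is a minimal sketch under assumptions: the index, repository, and snapshot names, the `fs` repository type, the `location` path, and the `localhost:9200` host are all placeholders, and the cluster is assumed to be already running with remote store enabled (step 1 is not covered). The script only builds and prints the requests as `curl` commands; it does not contact a cluster.

```python
import json


def build_repro_requests(index="closed-idx", repo="shallow-repo",
                         snapshot="snap-1", location="/mnt/snapshots"):
    """Return (method, path, body) tuples for steps 2-4 of the reproduction."""
    return [
        # Step 2: create an index, then close it.
        ("PUT", f"/{index}", {}),
        ("POST", f"/{index}/_close", None),
        # Step 3: register a repository with shallow copy enabled.
        # The "fs" type and the location are placeholders.
        ("PUT", f"/_snapshot/{repo}", {
            "type": "fs",
            "settings": {
                "location": location,
                "remote_store_index_shallow_copy": True,
            },
        }),
        # Step 4: trigger a snapshot of the closed index.
        ("PUT", f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true",
         {"indices": index}),
    ]


def to_curl(method, path, body, host="http://localhost:9200"):
    """Render one request as a curl command line (host is a placeholder)."""
    cmd = f"curl -X {method} '{host}{path}'"
    if body is not None:
        cmd += f" -H 'Content-Type: application/json' -d '{json.dumps(body)}'"
    return cmd


if __name__ == "__main__":
    for request in build_repro_requests():
        print(to_curl(*request))
```

Removing `remote_store_index_shallow_copy` from the repository settings gives the full copy case, which is reported to succeed for the same closed index.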
Expected behavior
Shallow copy snapshots should succeed for closed indices, as full copy snapshots do.