Fixed Java SstFileMetaData to prevent throwing java.lang.NoSuchMethodError.
Fixed a regression when ColumnFamilyOptions::max_successive_merges > 0 where the CPU overhead for deciding whether to merge could have increased unless the user had set the option ColumnFamilyOptions::strict_max_successive_merges.
RocksDB 9.1.0
9.1.0 (2024-03-22)
New Features
Added an option, GetMergeOperandsOptions::continue_cb, to give users the ability to end GetMergeOperands()'s lookup process before all merge operands were found.
Added a sanity check for ingesting external files; it currently checks that the user key comparator used to create the file is compatible with the column family's user key comparator.
Support ingesting external files into a column family that has the user-defined timestamps in memtable only feature enabled.
On file systems that support storage-level data checksums and reconstruction, SST block reads for point lookups, scans, flushes, and compactions are retried if there's a checksum mismatch on the initial read.
Some enhancements and fixes to the experimental Temperature handling features, including a new default_write_temperature CF option and the ability to open an SstFileWriter with a temperature.
WriteBatchWithIndex now supports wide-column point lookups via the GetEntityFromBatch API. See the API comments for more details.
Implemented experimental features: the Iterator::GetProperty("rocksdb.iterator.write-time") API lets users get the approximate Unix write time of data, and the WriteBatch::TimedPut API lets users write data with a specific write time.
Public API Changes
Best-effort recovery (best_efforts_recovery == true) may now be used together with atomic flush (atomic_flush == true). The all-or-nothing recovery guarantee for atomically flushed data will be upheld.
Removed the deprecated option bottommost_temperature, which was already replaced by last_level_temperature.
Added new PerfContext counters for block cache bytes read - block_cache_index_read_byte, block_cache_filter_read_byte, block_cache_compression_dict_read_byte, and block_cache_read_byte.
Deprecated the experimental Remote Compaction APIs StartV2() and WaitForCompleteV2() and introduced Schedule() and Wait(). The new APIs do essentially the same thing as the old ones. They accept an externally generated unique ID to wait for remote compaction to complete.
For the WriteCommittedTransaction::GetForUpdate API, if the column family enables user-defined timestamps (UDT), it was previously mandated that the do_validate argument could not be false and that UDT-based validation had to be done with a user-set read timestamp. UDT-based validation is now optional: if the user sets do_validate to false and does not set a read timestamp, GetForUpdate skips UDT-based validation and it is the user's responsibility to enforce the UDT invariant. Do NOT skip this UDT-based validation unless you have a way to enforce the UDT invariant. Ways to enforce the invariant on the user's side include managing a monotonically increasing timestamp, committing transactions in a single thread, etc.
Defined a new PerfLevel, kEnableWait, to measure time spent by user threads blocked in RocksDB other than on a mutex, such as a write thread waiting to be added to a write group or a write thread that is delayed or stalled.
RateLimiter's API no longer requires the burst size to be the refill size. Users of NewGenericRateLimiter() can now provide burst size in single_burst_bytes. Implementors of RateLimiter::SetSingleBurstBytes() need to adapt their implementations to match the changed API doc.
Add write_memtable_time to the newly introduced PerfLevel kEnableWait.
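The GetForUpdate entry above names "managing a monotonically increasing timestamp" as one way to enforce the UDT invariant on the user's side. A minimal sketch of such a timestamp source follows. The class name MonotonicTimestamps and its next() method are hypothetical, nothing here is part of RocksDB or rocksdbjni, and how the resulting value would be encoded and attached to a transaction is left out.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical helper (not a RocksDB API): hands out strictly increasing timestamps. */
final class MonotonicTimestamps {
    // Last value handed out; shared by all threads that request timestamps.
    private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);
    // Clock source, injectable so the behavior can be checked with a fixed clock.
    private final LongSupplier clock;

    MonotonicTimestamps(LongSupplier clock) {
        this.clock = clock;
    }

    /** Returns a value strictly greater than every value returned before. */
    long next() {
        // Use the clock reading if it moved forward; otherwise bump the last value by one,
        // so a stalled or backwards-jumping clock never produces a repeated timestamp.
        return last.updateAndGet(prev -> Math.max(prev + 1, clock.getAsLong()));
    }

    public static void main(String[] args) {
        MonotonicTimestamps ts = new MonotonicTimestamps(System::currentTimeMillis);
        long first = ts.next();
        long second = ts.next();
        System.out.println("first=" + first + " second=" + second);
    }
}
```

Because the update goes through a single AtomicLong, two threads can never observe the same value, which is the property the invariant depends on. This sketch assumes a single process; timestamps shared across processes would need a different mechanism.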
Behavior Changes
RateLimiters created by NewGenericRateLimiter() no longer modify the refill period when SetSingleBurstBytes() is called.
Merge writes will only keep merge operand count within ColumnFamilyOptions::max_successive_merges when the key's merge operands are all found in memory, unless strict_max_successive_merges is explicitly set.
Bug Fixes
Fixed kBlockCacheTier reads to return Status::Incomplete when I/O is needed to fetch a merge chain's base value from a blob file.
Fixed kBlockCacheTier reads to return Status::Incomplete on table cache miss rather than incorrectly returning an empty value.
Fixed a data race in WalManager that may affect how frequently PurgeObsoleteWALFiles() runs.
Re-enable the recycle_log_file_num option in DBOptions for kPointInTimeRecovery WAL recovery mode, which was previously disabled due to a bug in the recovery logic. This option is incompatible with WriteOptions::disableWAL. A Status::InvalidArgument() will be returned if disableWAL is specified.
Performance Improvements
Java API multiGet() variants now take advantage of the underlying batched multiGet() performance improvements.
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
Bumps org.rocksdb:rocksdbjni from 9.0.0 to 9.1.1.
Release notes
Sourced from org.rocksdb:rocksdbjni's releases.
... (truncated)
Changelog
Sourced from org.rocksdb:rocksdbjni's changelog.
Commits
- `6f7cabe` update version.h and HISTORY.md for 9.1.1
- `adb9bf5` Fix `max_successive_merges` counting CPU overhead regression (#12546)
- `7dd5e91` 12474 history entry
- `e94141d` Fix exception on RocksDB.getColumnFamilyMetaData() (#12474)
- `bcf88d4` Skip io_uring feature test when building with fbcode (#12525)
- `f6d01f0` Don't swallow errors in BlockBasedTable::MultiGet (#12486)
- `e223cd4` Branch cut 9.1.fb
- `c449867` MultiCfIterator Impl Follow up (#12465)
- `b515a5d` Replace ScopedArenaIterator with ScopedArenaPtr<InternalIterator> (#12470)
- `3b736c4` Fix heap use after free error on retry after checksum mismatch (#12464)