VIDA-NYU / ache

ACHE is a web crawler for domain-specific search.
http://ache.readthedocs.io
Apache License 2.0

Bump rocksdbjni from 6.25.3 to 7.3.1 #284

Closed dependabot[bot] closed 2 years ago

dependabot[bot] commented 2 years ago

Bumps rocksdbjni from 6.25.3 to 7.3.1.

Release notes

Sourced from rocksdbjni's releases.

RocksDB 7.3.1

7.3.1 (2022-06-08)

Bug Fixes

  • Fix a bug in WAL tracking. Before this PR (#10087), calling SyncWAL() on the only WAL file of the db will not log the event in MANIFEST, thus allowing a subsequent DB::Open even if the WAL file is missing or corrupted.
  • Fixed a bug for non-TransactionDB with avoid_flush_during_recovery = true and TransactionDB where, in case of a crash, min_log_number_to_keep may not change on recovery, and persisting a new MANIFEST with advanced log_numbers for some column families results in a "column family inconsistency" error on second recovery. As a solution, RocksDB will persist the new MANIFEST after successfully syncing the new WAL. If a future recovery starts from the new MANIFEST, then it means the new WAL is successfully synced. Due to the sentinel empty write batch at the beginning, kPointInTimeRecovery of WAL is guaranteed to go after this point. If a future recovery starts from the old MANIFEST, it means writing the new MANIFEST failed. We won't have the "SST ahead of WAL" error.
  • Fixed a bug where RocksDB DB::Open() may create and write to two new MANIFEST files even before recovery succeeds. Now writes to MANIFEST are persisted only after recovery is successful.

7.3.0 (2022-05-20)

Bug Fixes

  • Fixed a bug where manual flush would block forever even though flush options had wait=false.
  • Fixed a bug where RocksDB could corrupt DBs with avoid_flush_during_recovery == true by removing valid WALs, leading to Status::Corruption with message like "SST file is ahead of WALs" when attempting to reopen.
  • Fixed a bug in async_io path where incorrect length of data is read by FilePrefetchBuffer if data is consumed from two populated buffers and request for more data is sent.
  • Fixed a CompactionFilter bug. Compaction filter used to use Delete to remove keys, even if the keys should be removed with SingleDelete. Mixing Delete and SingleDelete may cause undefined behavior.
  • Fixed a bug in WritableFileWriter::WriteDirect and WritableFileWriter::WriteDirectWithChecksum. The rate_limiter_priority specified in ReadOptions was not passed to the RateLimiter when requesting a token.
  • Fixed a bug which might cause process crash when I/O error happens when reading an index block in MultiGet().

New Features

  • DB::GetLiveFilesStorageInfo is ready for production use.
  • Add new stats: PREFETCHED_BYTES_DISCARDED, which records the number of prefetched bytes discarded by RocksDB FilePrefetchBuffer on destruction, and POLL_WAIT_MICROS, which records the wait time for FS::Poll API completion.
  • RemoteCompaction supports table_properties_collector_factories override on compaction worker.
  • Start tracking SST unique id in MANIFEST, which will be used to verify with SST properties during DB open to make sure the SST file is not overwritten or misplaced. A db option verify_sst_unique_id_in_manifest is introduced to enable/disable the verification (default is false). If enabled, all SST files will be opened during DB open to verify the unique id, so it's recommended to use it with max_open_files = -1 to pre-open the files.
  • Added the ability to concurrently read data blocks from multiple files in a level in batched MultiGet. This can be enabled by setting the async_io option in ReadOptions. Using this feature requires a FileSystem that supports ReadAsync (PosixFileSystem is not supported yet for this), and for RocksDB to be compiled with folly and c++20.
  • Add FileSystem::ReadAsync API in io_tracing.

Public API changes

  • Add rollback_deletion_type_callback to TransactionDBOptions so that write-prepared transactions know whether to issue a Delete or SingleDelete to cancel a previous key written during prior prepare phase. The PR aims to prevent mixing SingleDeletes and Deletes for the same key that can lead to undefined behaviors for write-prepared transactions.
  • EXPERIMENTAL: Add new API AbortIO in file_system to abort the read requests submitted asynchronously.
  • CompactionFilter::Decision has a new value: kRemoveWithSingleDelete. If CompactionFilter returns this decision, then CompactionIterator will use SingleDelete to mark a key as removed.
  • Renamed CompactionFilter::Decision::kRemoveWithSingleDelete to kPurge since the latter sounds more general and hides the implementation details of how compaction iterator handles keys.
  • Added ability to specify functions for Prepare and Validate to OptionTypeInfo. Added methods to OptionTypeInfo to set the functions via an API. These methods are intended for RocksDB plugin developers for configuration management.
  • Added a new immutable db option, enforce_single_del_contracts. If set to false (default is true), compaction will NOT fail due to a single delete followed by a delete for the same key. The purpose of this temporary option is to help existing use cases migrate.
  • Introduce BlockBasedTableOptions::cache_usage_options and use that to replace BlockBasedTableOptions::reserve_table_builder_memory and BlockBasedTableOptions::reserve_table_reader_memory.
  • Changed GetUniqueIdFromTableProperties to return a 128-bit unique identifier, which will be the standard size now. The old functionality (192-bit) is available from GetExtendedUniqueIdFromTableProperties. Both functions are no longer "experimental" and are ready for production use.
  • In IOOptions, mark prio as deprecated for future removal.
  • In file_system.h, mark IOPriority as deprecated for future removal.
  • Add an option, CompressionOptions::use_zstd_dict_trainer, to indicate whether zstd dictionary trainer should be used for generating zstd compression dictionaries. The default value of this option is true for backward compatibility. When this option is set to false, zstd API ZDICT_finalizeDictionary is used to generate compression dictionaries.
  • The Seek API, which positions every LevelIterator on the correct data block in the correct SST file, can be parallelized if the ReadOptions.async_io option is enabled.
  • Add new stat number_async_seek in PerfContext that indicates number of async calls made by seek to prefetch data.

Bug Fixes

  • RocksDB called the FileSystem::Poll API during FilePrefetchBuffer destruction, which hurt performance because it waited for the completion of read requests that were no longer needed. Calling FileSystem::AbortIO to abort those requests instead fixes that performance issue.
  • Fixed unnecessary block cache contention when queries within a MultiGet batch and across parallel batches access the same data block, which previously could cause severely degraded performance in this unusual case. (In more typical MultiGet cases, this fix is expected to yield a small or negligible performance improvement.)

Behavior changes

  • Enforce the existing contract of SingleDelete so that SingleDelete cannot be mixed with Delete because it leads to undefined behavior. Fix a number of unit tests that violate the contract but happen to pass.
  • ldb --try_load_options defaults to true if --db is specified and a new DB is not being created. The user can still explicitly disable it with --try_load_options=false (or explicitly enable it with --try_load_options).
  • During Flush write or Compaction write/read, the WriteController is used to determine whether DB writes are stalled or slowed down. The priority (Env::IOPriority) can then be determined accordingly and be passed in IOOptions to the file system.
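The SingleDelete contract mentioned in these notes (a key should not be removed with both Delete and SingleDelete) can be pictured with a small toy model. This is only a sketch under stated assumptions: the class `SingleDeleteContractSketch`, the enum `DeleteKind`, and the method `recordDelete` are hypothetical names made up for illustration, the code does not call rocksdbjni, and it is not how RocksDB itself detects violations during compaction.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: NOT RocksDB code and NOT the rocksdbjni API.
// All names here are hypothetical and exist just for illustration.
final class SingleDeleteContractSketch {
    enum DeleteKind { DELETE, SINGLE_DELETE }

    // Remembers which kind of delete was first issued for each key.
    private final Map<String, DeleteKind> firstKindByKey = new HashMap<>();

    /**
     * Records a delete for the key.
     * Returns true if the key has only ever seen this kind of delete,
     * false if this call mixes Delete and SingleDelete for the same key.
     */
    boolean recordDelete(String key, DeleteKind kind) {
        DeleteKind previous = firstKindByKey.putIfAbsent(key, kind);
        return previous == null || previous == kind;
    }

    public static void main(String[] args) {
        SingleDeleteContractSketch checker = new SingleDeleteContractSketch();
        // First delete seen for this key.
        System.out.println(checker.recordDelete("url:1", DeleteKind.SINGLE_DELETE));
        // Same key, different kind: this is the mix the contract forbids.
        System.out.println(checker.recordDelete("url:1", DeleteKind.DELETE));
        // A different key is tracked independently.
        System.out.println(checker.recordDelete("url:2", DeleteKind.DELETE));
    }
}
```

An application that wraps its own writes could run a check like this in tests to find call sites that mix the two delete kinds before upgrading, rather than relying on enforce_single_del_contracts = false.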

RocksDB 7.2.2

7.2.2 (2022-04-28)

... (truncated)

Changelog

Sourced from rocksdbjni's changelog.

7.3.1 (06/08/2022)

Bug Fixes

  • Fix a bug in WAL tracking. Before this PR (#10087), calling SyncWAL() on the only WAL file of the db will not log the event in MANIFEST, thus allowing a subsequent DB::Open even if the WAL file is missing or corrupted.
  • Fixed a bug for non-TransactionDB with avoid_flush_during_recovery = true and TransactionDB where, in case of a crash, min_log_number_to_keep may not change on recovery, and persisting a new MANIFEST with advanced log_numbers for some column families results in a "column family inconsistency" error on second recovery. As a solution, RocksDB will persist the new MANIFEST after successfully syncing the new WAL. If a future recovery starts from the new MANIFEST, then it means the new WAL is successfully synced. Due to the sentinel empty write batch at the beginning, kPointInTimeRecovery of WAL is guaranteed to go after this point. If a future recovery starts from the old MANIFEST, it means writing the new MANIFEST failed. We won't have the "SST ahead of WAL" error.
  • Fixed a bug where RocksDB DB::Open() may create and write to two new MANIFEST files even before recovery succeeds. Now writes to MANIFEST are persisted only after recovery is successful.

7.3.0 (05/20/2022)

Bug Fixes

  • Fixed a bug where manual flush would block forever even though flush options had wait=false.
  • Fixed a bug where RocksDB could corrupt DBs with avoid_flush_during_recovery == true by removing valid WALs, leading to Status::Corruption with message like "SST file is ahead of WALs" when attempting to reopen.
  • Fixed a bug in async_io path where incorrect length of data is read by FilePrefetchBuffer if data is consumed from two populated buffers and request for more data is sent.
  • Fixed a CompactionFilter bug. Compaction filter used to use Delete to remove keys, even if the keys should be removed with SingleDelete. Mixing Delete and SingleDelete may cause undefined behavior.
  • Fixed a bug in WritableFileWriter::WriteDirect and WritableFileWriter::WriteDirectWithChecksum. The rate_limiter_priority specified in ReadOptions was not passed to the RateLimiter when requesting a token.
  • Fixed a bug which might cause process crash when I/O error happens when reading an index block in MultiGet().

New Features

  • DB::GetLiveFilesStorageInfo is ready for production use.
  • Add new stats: PREFETCHED_BYTES_DISCARDED, which records the number of prefetched bytes discarded by RocksDB FilePrefetchBuffer on destruction, and POLL_WAIT_MICROS, which records the wait time for FS::Poll API completion.
  • RemoteCompaction supports table_properties_collector_factories override on compaction worker.
  • Start tracking SST unique id in MANIFEST, which will be used to verify with SST properties during DB open to make sure the SST file is not overwritten or misplaced. A db option verify_sst_unique_id_in_manifest is introduced to enable/disable the verification (default is false). If enabled, all SST files will be opened during DB open to verify the unique id, so it's recommended to use it with max_open_files = -1 to pre-open the files.
  • Added the ability to concurrently read data blocks from multiple files in a level in batched MultiGet. This can be enabled by setting the async_io option in ReadOptions. Using this feature requires a FileSystem that supports ReadAsync (PosixFileSystem is not supported yet for this), and for RocksDB to be compiled with folly and c++20.
  • Add FileSystem::ReadAsync API in io_tracing.

Public API changes

  • Add rollback_deletion_type_callback to TransactionDBOptions so that write-prepared transactions know whether to issue a Delete or SingleDelete to cancel a previous key written during prior prepare phase. The PR aims to prevent mixing SingleDeletes and Deletes for the same key that can lead to undefined behaviors for write-prepared transactions.
  • EXPERIMENTAL: Add new API AbortIO in file_system to abort the read requests submitted asynchronously.
  • CompactionFilter::Decision has a new value: kRemoveWithSingleDelete. If CompactionFilter returns this decision, then CompactionIterator will use SingleDelete to mark a key as removed.
  • Renamed CompactionFilter::Decision::kRemoveWithSingleDelete to kPurge since the latter sounds more general and hides the implementation details of how compaction iterator handles keys.
  • Added ability to specify functions for Prepare and Validate to OptionTypeInfo. Added methods to OptionTypeInfo to set the functions via an API. These methods are intended for RocksDB plugin developers for configuration management.
  • Added a new immutable db option, enforce_single_del_contracts. If set to false (default is true), compaction will NOT fail due to a single delete followed by a delete for the same key. The purpose of this temporary option is to help existing use cases migrate.
  • Introduce BlockBasedTableOptions::cache_usage_options and use that to replace BlockBasedTableOptions::reserve_table_builder_memory and BlockBasedTableOptions::reserve_table_reader_memory.
  • Changed GetUniqueIdFromTableProperties to return a 128-bit unique identifier, which will be the standard size now. The old functionality (192-bit) is available from GetExtendedUniqueIdFromTableProperties. Both functions are no longer "experimental" and are ready for production use.
  • In IOOptions, mark prio as deprecated for future removal.
  • In file_system.h, mark IOPriority as deprecated for future removal.
  • Add an option, CompressionOptions::use_zstd_dict_trainer, to indicate whether zstd dictionary trainer should be used for generating zstd compression dictionaries. The default value of this option is true for backward compatibility. When this option is set to false, zstd API ZDICT_finalizeDictionary is used to generate compression dictionaries.
  • The Seek API, which positions every LevelIterator on the correct data block in the correct SST file, can be parallelized if the ReadOptions.async_io option is enabled.
  • Add new stat number_async_seek in PerfContext that indicates number of async calls made by seek to prefetch data.

Bug Fixes

  • RocksDB called the FileSystem::Poll API during FilePrefetchBuffer destruction, which hurt performance because it waited for the completion of read requests that were no longer needed. Calling FileSystem::AbortIO to abort those requests instead fixes that performance issue.
  • Fixed unnecessary block cache contention when queries within a MultiGet batch and across parallel batches access the same data block, which previously could cause severely degraded performance in this unusual case. (In more typical MultiGet cases, this fix is expected to yield a small or negligible performance improvement.)

Behavior changes

  • Enforce the existing contract of SingleDelete so that SingleDelete cannot be mixed with Delete because it leads to undefined behavior. Fix a number of unit tests that violate the contract but happen to pass.
  • ldb --try_load_options defaults to true if --db is specified and a new DB is not being created. The user can still explicitly disable it with --try_load_options=false (or explicitly enable it with --try_load_options).
  • During Flush write or Compaction write/read, the WriteController is used to determine whether DB writes are stalled or slowed down. The priority (Env::IOPriority) can then be determined accordingly and be passed in IOOptions to the file system.

7.2.0 (04/15/2022)

Bug Fixes

... (truncated)

Commits
  • 8e0f495 Update version
  • 41fe221 Update History.md for #9922 (#10092)
  • 405a35f Persist the new MANIFEST after successfully syncing the new WAL during recove...
  • 8244f13 Fix a bug in WAL tracking (#10087)
  • c8bae6e Provide support for IOTracing for ReadAsync API (#9833)
  • 07a0082 Fix potential ambiguities in/around port/sys_time.h (#10045)
  • f80bac5 Fix fbcode internal build failure (#10041)
  • a479c2c Fix stress test failure "Corruption: checksum mismatch" or "Iterator Diverged...
  • bea5831 Move three info logging within DB Mutex to use log buffer (#10029)
  • 1e4850f Java build: finish compiling before testing (etc) (#10034)
  • Additional commits viewable in compare view


Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
  • `@dependabot rebase` will rebase this PR
  • `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
  • `@dependabot merge` will merge this PR after your CI passes on it
  • `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
  • `@dependabot cancel merge` will cancel a previously requested merge and block automerging
  • `@dependabot reopen` will reopen this PR if it is closed
  • `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
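To make the "major version" and "minor version" wording in the ignore commands concrete, here is a minimal sketch that classifies the bump in this PR (6.25.3 to 7.3.1). It assumes plain `major.minor.patch` version strings. The class `VersionBumpSketch` and its methods are hypothetical names for illustration, and the thread does not show how Dependabot itself compares versions, so this is not a description of its implementation.

```java
// Illustrative sketch only; assumes versions look like "major.minor.patch".
final class VersionBumpSketch {
    enum BumpKind { MAJOR, MINOR, PATCH, NONE }

    // Splits "major.minor.patch" into three integers; other shapes are rejected.
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected major.minor.patch: " + version);
        }
        int[] numbers = new int[3];
        for (int i = 0; i < 3; i++) {
            numbers[i] = Integer.parseInt(parts[i]);
        }
        return numbers;
    }

    // Reports the most significant component that differs between the two versions.
    static BumpKind classify(String from, String to) {
        int[] a = parse(from);
        int[] b = parse(to);
        if (a[0] != b[0]) return BumpKind.MAJOR;
        if (a[1] != b[1]) return BumpKind.MINOR;
        if (a[2] != b[2]) return BumpKind.PATCH;
        return BumpKind.NONE;
    }

    public static void main(String[] args) {
        // The bump proposed in this PR changes the first component.
        System.out.println(classify("6.25.3", "7.3.1"));
    }
}
```

Under this reading, the bump in this PR counts as a major-version change, which is the case the `ignore this major version` command refers to.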
dependabot[bot] commented 2 years ago

Superseded by #293.