speedb-io / speedb

A RocksDB compliant high performance scalable embedded key-value store
https://www.speedb.io/
Apache License 2.0
916 stars 72 forks source link

memtable_list: do not roll back memtable flush on CF drop #144

Closed isaac-io closed 2 years ago

isaac-io commented 2 years ago

When installing memtable flush results, if the flush was unsuccessful (due to an error during the flush or the manifest write) or if the CF was dropped, the memtable is rolled back into a flushable state. However, this is unnecessary since it reactivates the pending flush flag on the relevant CF even though no flush can take place after CF drop, and as in #126 it also causes an issue for the upcoming improvements to the WriteBufferManager immutable memory tracking (#113) because it marks the memtable memory as ready for flush again for a short period before finally getting freed when the memtable is dropped during the destruction of the dropped CF (and this might upset the WBM behaviour because in the future we plan to rely on the amount of memory that isn't already being flushed for triggering flushes rather than on the amount of mutable memory alone as is done today).

This is a continuation of #126 (I simply missed this rollback), and since the rollback only marks the memtable as ready for flush again and clears related flush state, there's no harm in skipping it and just letting the memtable drop in that state.

Yuval-Ariel commented 2 years ago

to introduce cf drop, use flag --clear_column_family_one_in . note this only works with: test_batches_snapshots=false expected_values_dir=None backup_one_in=0 checkpoint_one_in=0

assaf-speedb commented 2 years ago

The white box is failing.

Verification failed for column family 0 key 0000000000038053000000000000004F000000000000003B0000000000000009000000000000004F000000000000003E787878787878 (1638432): Unexpected value found No writes or ops? Verification failed :(

my command: time make db_stress; time CRASH_TEST_EXT_ARGS="--clear_column_family_one_in=5 --test_batches_snapshots=0 --expected_values_dir="" --backup_one_in=0 --checkpoint_one_in=0" make whitebox_crash_test; echo $?

isaac-io commented 2 years ago

Thanks! Can you also post the db_stress command that the crash test executed when this failed?

assaf-speedb commented 2 years ago

Sure. BTW, the black box failed as well - for the same reason:

Whitebox ./db_stress --acquire_snapshot_one_in=10000 --adaptive_readahead=1 --allow_concurrent_memtable_write=1 --async_io=1 --avoid_flush_during_recovery=1 --avoid_unnecessary_blocking_io=1 --backup_max_size=104857600 --backup_one_in=0 --batch_protection_bytes_per_key=0 --block_size=4096 --bloom_bits=10 --bottommost_compression_type=lz4hc --cache_index_and_filter_blocks=0 --cache_size=8388608 --checkpoint_one_in=0 --checksum_type=kCRC32c --clear_column_family_one_in=4 --column_families=1 --compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_ttl=0 --compare_full_db_state_snapshot=0 --compression_max_dict_buffer_bytes=67108863 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=xpress --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --customopspercent=0 --data_block_hash_table_util_ratio=0.63 --data_block_index_type=1 --db=/dev/shm/rocksdb.j1e0/rocksdb_crashtest_whitebox --db_write_buffer_size=0 --delpercent=1 --delrangepercent=5 --destroy_db_initially=0 --detect_filter_construct_corruption=0 --disable_wal=0 --enable_compaction_filter=0 --enable_pipelined_write=0 --expected_values_dir= --experimental_mempurge_threshold=6.023655548032458 --fail_if_options_file_error=0 --file_checksum_impl=big --flush_one_in=1000000 --format_version=5 --get_current_wal_file_one_in=0 --get_live_files_one_in=100000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=13 --index_type=2 --iterpercent=19 --key_len_percent_dist=14,8,6,1,8,27,11,11,8,6 --kill_exclude_prefixes=WritableFileWriter::Append,WritableFileWriter::WriteBuffered --kill_random_test=88889 --level_compaction_dynamic_level_bytes=False --long_running_snapshots=1 --mark_for_compaction_one_file_in=0 --max_background_compactions=1 --max_bytes_for_level_base=67108864 --max_key=10485760 --max_key_len=10 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=16 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=0 --memtable_prefix_bloom_size_ratio=0.01 --memtable_whole_key_filtering=1 --memtablerep=skip_list --mmap_read=1 --mock_direct_io=False --nooverwritepercent=50 --num_iterations=91 --open_files=500000 --open_metadata_write_fault_one_in=8 --open_read_fault_one_in=32 --open_write_fault_one_in=16 --ops_per_thread=20000000 --optimize_filters_for_memory=0 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=0 --pause_background_one_in=1000000 --periodic_compaction_seconds=0 --prefix_size=-1 --prefixpercent=0 --prepopulate_block_cache=1 --progress_reports=0 --read_fault_one_in=1000 --readpercent=23 --recycle_log_file_num=1 --reopen=20 --reserve_table_reader_memory=1 --ribbon_starting_level=3 --secondary_cache_fault_one_in=32 --seed=3062778522 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=104857600 --sst_file_manager_bytes_per_truncate=1048576 --subcompactions=1 --sync=False --sync_fault_injection=False --sync_wal_one_in=100000 --target_file_size_base=16777216 --target_file_size_multiplier=1 --test_batches_snapshots=0 --top_level_index_pinning=2 --unpartitioned_pinning=2 --use_block_based_filter=0 --use_clock_cache=0 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=True --use_merge=0 --use_multiget=1 --user_timestamp_size=0 --value_size_mult=32 --verify_before_write=True --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --wal_compression=none --write_buffer_size=33554432 --write_dbid_to_manifest=0 --writepercent=52

Blackbox ./db_stress --acquire_snapshot_one_in=0 --adaptive_readahead=0 --allow_concurrent_memtable_write=1 --allow_setting_blob_options_dynamically=1 --async_io=0 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=0 --backup_max_size=104857600 --backup_one_in=0 --batch_protection_bytes_per_key=0 --blob_compaction_readahead_size=4194304 --blob_compression_type=lz4 --blob_file_size=268435456 --blob_garbage_collection_age_cutoff=0.5 --blob_garbage_collection_force_threshold=0.75 --block_size=16384 --bloom_bits=13 --bottommost_compression_type=none --cache_index_and_filter_blocks=1 --cache_size=8388608 --checkpoint_one_in=0 --checksum_type=kxxHash64 --clear_column_family_one_in=5 --column_families=1 --compact_files_one_in=1000000 --compact_range_one_in=0 --compaction_ttl=1 --compare_full_db_state_snapshot=0 --compression_max_dict_buffer_bytes=7 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=xpress --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --customopspercent=0 --data_block_hash_table_util_ratio=0.73 --data_block_index_type=1 --db=/dev/shm/rocksdb.qGR3/rocksdb_crashtest_blackbox --db_write_buffer_size=134217728 --delpercent=9 --delrangepercent=4 --destroy_db_initially=0 --detect_filter_construct_corruption=0 --disable_wal=0 --enable_blob_files=1 --enable_blob_garbage_collection=1 --enable_compaction_filter=1 --enable_pipelined_write=0 --expected_values_dir= --experimental_mempurge_threshold=0.3731507821551916 --fail_if_options_file_error=1 --file_checksum_impl=big --flush_one_in=1000000 --format_version=2 --get_current_wal_file_one_in=0 --get_live_files_one_in=100000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=8 --index_type=0 --iterpercent=0 --key_len_percent_dist=2,16,10,3,40,29 --level_compaction_dynamic_level_bytes=False --long_running_snapshots=1 --mark_for_compaction_one_file_in=10 --max_background_compactions=1 --max_bytes_for_level_base=67108864 --max_key=1048576 --max_key_len=6 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=16 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=1048576 --memtable_prefix_bloom_size_ratio=0 --memtable_whole_key_filtering=0 --memtablerep=skip_list --min_blob_size=8 --mmap_read=1 --mock_direct_io=False --nooverwritepercent=0 --num_iterations=77 --open_files=-1 --open_metadata_write_fault_one_in=8 --open_read_fault_one_in=32 --open_write_fault_one_in=0 --ops_per_thread=100000000 --optimize_filters_for_memory=0 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=0 --pause_background_one_in=1000000 --periodic_compaction_seconds=0 --prefix_size=-1 --prefixpercent=0 --prepopulate_block_cache=1 --progress_reports=0 --read_fault_one_in=1000 --readpercent=39 --recycle_log_file_num=1 --reopen=0 --reserve_table_reader_memory=0 --ribbon_starting_level=6 --secondary_cache_fault_one_in=32 --seed=3069121425 --set_options_one_in=0 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=0 --sst_file_manager_bytes_per_truncate=0 --subcompactions=3 --sync=False --sync_fault_injection=False --sync_wal_one_in=100000 --target_file_size_base=16777216 --target_file_size_multiplier=1 --test_batches_snapshots=0 --top_level_index_pinning=3 --unpartitioned_pinning=3 --use_block_based_filter=0 --use_clock_cache=0 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=False --use_merge=1 --use_multiget=0 --user_timestamp_size=0 --value_size_mult=32 --verify_before_write=True --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --wal_compression=none --write_buffer_size=33554432 --write_dbid_to_manifest=0 --writepercent=48

isaac-io commented 2 years ago

Pretty sure that it's unrelated to the change in this issue because you have --clear_column_family_one_in=4 with --column_families=1, and in that case no drop is actually performed, so the code changes of this issue aren't reached. I'll try to reproduce locally on this branch and on main.

Guyme commented 2 years ago

@Yuval-Ariel - please run full cycle as it seams to be unrelated to this change

Yuval-Ariel commented 2 years ago

QA passed on c6567a63f624c7d0a00e8461f031ef80528b8f6e