facebook / CacheLib

Pluggable in-process caching engine to build and scale high performance services
https://www.cachelib.org
Apache License 2.0

Enable FDP for CacheBench #302

Open jmhands opened 5 months ago

jmhands commented 5 months ago

Reading the commit notes from https://github.com/facebook/CacheLib/commit/009e89ba2b49b1fbbc48d03c3f81046de28bd6ed

I tried to enable FDP by adding

    "enableFDP": true,
    "navyEnableIoUring": true,
    "navyQDepth": 1,

to the config shown below. (The Samsung docs I was following to enable FDP say to add "devicePlacement": true instead; which one is it?)

but when I run CacheBench I see

I0402 22:15:30.163209 35634 Cache-inl.h:240]   "navyConfig::enableFDP": "0", 

and it fails at

F0402 22:15:30.316447 35634 Device.cpp:761] Check failed: !useIoUring_ && !(fdpNvmeVec_.size() > 0)
*** Aborted at 1712096130 (Unix time, try 'date -d @1712096130') ***
*** Signal 6 (SIGABRT) (0x8b32) received by PID 35634 (pthread TID 0x7aef680424c0) (linux TID 35634) (maybe from PID 35634, UID 0) (code: -6), stack trace: ***
    @ 0000000000d3bd3e folly::symbolizer::(anonymous namespace)::innerSignalHandler(int, siginfo_t*, void*)
                       /home/jm/CacheLib/cachelib/external/folly/folly/experimental/symbolizer/SignalHandler.cpp:449
    @ 0000000000d3be24 folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
                       /home/jm/CacheLib/cachelib/external/folly/folly/experimental/symbolizer/SignalHandler.cpp:470
    @ 000000000004251f (unknown)
    @ 00000000000969fc pthread_kill
    @ 0000000000042475 raise
    @ 00000000000287f2 abort
{
    "cache_config": {
      "cacheSizeMB": 20000,
      "cacheDir": "/tmp/cachelib_metadata",
      "allocFactor": 1.08,
      "maxAllocSize": 524288,
      "minAllocSize": 64,
      "navyReaderThreads": 24,
      "navyWriterThreads": 12,
      "nvmCachePaths": ["/dev/ng0n1"],
      "nvmCacheSizeMB": 2666496,
      "writeAmpDeviceList": ["nvme0n1"],
      "navyBigHashBucketSize": 4096,
      "navyBigHashSizePct": 0,
      "navySmallItemMaxSize": 640,
      "navySegmentedFifoSegmentRatio": [1.0],
      "navyHitsReinsertionThreshold": 1,
      "navyBlockSize": 4096,
      "nvmAdmissionRetentionTimeThreshold": 7200,
      "navyParcelMemoryMB": 6048,
      "enableChainedItem": true,
      "htBucketPower": 29,
      "moveOnSlabRelease": false,
      "poolRebalanceIntervalSec": 2,
      "poolResizeIntervalSec": 2,
      "rebalanceStrategy": "hits"
    },
    "test_config": {
      "opRatePerSec": 1000000,
      "opRateBurstSize": 200,
      "enableLookaside": false,
      "generator": "replay",
      "replayGeneratorConfig": {
        "ampFactor": 200
      },
      "repeatTraceReplay": true,
      "repeatOpCount": true,
      "onlySetIfMiss": false,
      "numOps": 100000000000,
      "numThreads": 10,
      "prepopulateCache": true,
      "traceFileNames": [
        "kvcache_traces_1.csv",
        "kvcache_traces_2.csv",
        "kvcache_traces_3.csv",
        "kvcache_traces_4.csv",
        "kvcache_traces_5.csv"
      ]
    }
  }
jaesoo-fb commented 5 months ago

Hi @jmhands

For FDP, you still need to provide the nvme blkdev path for nvmCachePaths in the standard format like /dev/nvmeXnY[pZ]. The char device name is derived from the blkdev name as can be seen here

If this still does not fix the issue, please upload the full log files. Thanks.
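(With standard Linux naming, the block device /dev/nvme0n1 pairs with the NVMe generic char device /dev/ng0n1, which is what the passthru path uses; a quick sketch for confirming both nodes exist:)

    # the blkdev goes in nvmCachePaths; the ng char device is derived from it
    ls -l /dev/nvme0n1 /dev/ng0n1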

jmhands commented 5 months ago

> Hi @jmhands
>
> For FDP, you still need to provide the nvme blkdev path for nvmCachePaths in the standard format like /dev/nvmeXnY[pZ]. The char device name is derived from the blkdev name as can be seen here
>
> If this still does not fix the issue, please upload the full log files. Thanks.

Yes, that was just an example path; I edited it to /dev/nvme0n1, which is the drive with FDP enabled. I can verify that with nvme id-ctrl /dev/nvme0n1.
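(A sketch of that verification; the fdp plugin subcommands assume a recent nvme-cli 2.x build, so adjust to your version:)

    # CTRATT bit 19 (Flexible Data Placement, NVMe TP4146) should be set
    nvme id-ctrl /dev/nvme0n1 | grep -i ctratt
    # FDP configuration and reclaim unit handle usage
    nvme fdp configs /dev/nvme0n1
    nvme fdp usage /dev/nvme0n1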

Here is the log:

I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::enableFDP": "0",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::fileName": "/dev/nvme0n1",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::fileSize": "998579896320",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::ioEngine": "io_uring",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::maxConcurrentInserts": "1000000",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::maxNumReads": "0",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::maxNumWrites": "0",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::maxParcelMemoryMB": "6048",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::maxWriteRate": "0",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::navyReqOrderingShards": "21",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::raidPaths": "",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::readerThreads": "72",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::stackSize": "16384",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::truncateFile": "false",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::writerThreads": "36"
I0403 18:25:40.640346 39787 Cache-inl.h:240] }
I0403 18:25:40.640600 39787 Cache-inl.h:279] Failed to attach for reason: Unable to find any segment with name shm_info
E0403 18:25:41.550093 39787 NvmCacheState.cpp:135] unable to deserialize nvm metadata file: no content in file: /root/cachelib_metadata/NvmCacheState
I0403 18:25:41.554360 39787 Device.cpp:1080] Cache file: /dev/nvme0n1 size: 998579896320 truncate: 0
I0403 18:25:41.554422 39787 Device.cpp:965] Created device with num_devices 1 size 998579896320 block_size 4096,stripe_size 0 max_write_size 1048576 max_io_size 1048576 io_engine io_uring qdepth 1,num_fdp_devices 0
I0403 18:25:41.617358 39787 NavySetup.cpp:243] metadataSize: 4992897024
I0403 18:25:41.617374 39787 NavySetup.cpp:245] Setting up engine pair 0
I0403 18:25:41.617384 39787 NavySetup.cpp:111] bighashStartingLimit: 4992897024 bigHashCacheOffset: 958636703744 bigHashCacheSize: 39943192576
I0403 18:25:41.617389 39787 NavySetup.cpp:259] blockCacheSize 953643806720

and I should have FDP enabled here

"enableFDP": true,
    "navyEnableIoUring": true,
    "navyQDepth": 1,
jaesoo-fb commented 5 months ago

@jmhands Actually, the name of the config is deviceEnableFDP
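(In the cachebench JSON, that means:)

    "deviceEnableFDP": true,
    "navyEnableIoUring": true,
    "navyQDepth": 1,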

jmhands commented 5 months ago

> @jmhands Actually, the name of the config is deviceEnableFDP

Getting closer! That worked for enabling FDP, but I'm getting an abort after the 30 seconds of runtime I specified:


===JSON Config===
{
  "cache_config":
  {
    "cacheSizeMB": 43000,
    "cacheDir": "/root/cachelib_metadata",
    "allocFactor": 1.08,
    "maxAllocSize": 524288,
    "minAllocSize": 64,
    "navyReaderThreads": 72,
    "navyWriterThreads": 36,
    "nvmCachePaths": ["/dev/nvme0n1"],
    "nvmCacheSizeMB" : 952320,
    "writeAmpDeviceList": ["nvme0n1"],
    "navyBigHashBucketSize": 4096,
    "navyBigHashSizePct": 4,
    "navySmallItemMaxSize": 640,
    "navySegmentedFifoSegmentRatio": [1.0],
    "navyHitsReinsertionThreshold": 1,
    "navyBlockSize": 4096,
    "nvmAdmissionRetentionTimeThreshold": 7200,
    "navyParcelMemoryMB": 6048,
    "enableChainedItem": true,
    "deviceEnableFDP": true,
    "navyEnableIoUring": true,
    "navyQDepth": 1,
    "htBucketPower": 29,
    "moveOnSlabRelease": false,
    "poolRebalanceIntervalSec": 2,
    "poolResizeIntervalSec": 2,
    "rebalanceStrategy": "hits"
  },
  "test_config":
  {
    "opRatePerSec": 550000,
    "opRateBurstSize": 200,
    "enableLookaside": false,
    "generator": "replay",
    "replayGeneratorConfig":
    {
        "ampFactor": 100
    },
    "repeatTraceReplay": true,
    "repeatOpCount" : true,
    "onlySetIfMiss" : false,
    "numOps": 100000000000,
    "numThreads": 10,
    "prepopulateCache": true,
    "traceFileNames": [
        "kvcache_traces_1.csv",
        "kvcache_traces_2.csv",
        "kvcache_traces_3.csv",
        "kvcache_traces_4.csv",
        "kvcache_traces_5.csv"
    ]
  }
}

Welcome to OSS version of cachebench
I0403 20:17:21.918377 41022 KVReplayGenerator.h:106] Started KVReplayGenerator (amp factor 100, # of stressor threads 10)
I0403 20:17:21.918377 41023 ReplayGeneratorBase.h:218] [0] Opened trace file kvcache_traces_1.csv
I0403 20:17:21.918590 41023 ReplayGeneratorBase.h:179] New header detected: header "op_time,key,key_size,op,op_count,size,cache_hits,ttl" field map key -> 1, op -> 3, size -> 5, op_count -> 4, key_size -> 2, ttl -> 7, op_time -> 0, cache_hits -> 6
E0403 20:17:22.163842 41022 Cache-inl.h:27] Exception fetching nand writes for nvme0n1. Msg: Vendor not recogized in device model number fadu echo e1.s 3.84tb
I0403 20:17:22.164012 41022 Cache-inl.h:151] Configuring NVM cache: simple file /dev/nvme0n1 size 952320 MB
I0403 20:17:22.164229 41022 Cache-inl.h:240] Using the following nvm config{
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::QDepth": "1",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::admissionPolicy": "",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::admissionProbBaseSize": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::admissionProbFactorLowerBound": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::admissionProbFactorUpperBound": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::admissionProbability": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::admissionSuffixLen": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::admissionWriteRate": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::bigHashBucketBfSize": "8",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::bigHashBucketSize": "4096",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::bigHashSizePct": "4",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::bigHashSmallItemMaxSize": "640",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheCleanRegionThreads": "1",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheCleanRegions": "1",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheDataChecksum": "true",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheLru": "false",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheNumInMemBuffers": "2",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheRegionSize": "16777216",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheReinsertionHitsThreshold": "1",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheReinsertionPctThreshold": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheSegmentedFifoSegmentRatio": "",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockSize": "4096",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::deviceMaxWriteSize": "1048576",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::deviceMetadataSize": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::enableFDP": "1",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::fileName": "/dev/nvme0n1",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::fileSize": "998579896320",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::ioEngine": "io_uring",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::maxConcurrentInserts": "1000000",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::maxNumReads": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::maxNumWrites": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::maxParcelMemoryMB": "6048",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::maxWriteRate": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::navyReqOrderingShards": "21",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::raidPaths": "",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::readerThreads": "72",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::stackSize": "16384",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::truncateFile": "false",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::writerThreads": "36"
I0403 20:17:22.164229 41022 Cache-inl.h:240] }
I0403 20:17:22.164481 41022 Cache-inl.h:279] Failed to attach for reason: Unable to find any segment with name shm_info
E0403 20:17:23.076967 41022 NvmCacheState.cpp:135] unable to deserialize nvm metadata file: no content in file: /root/cachelib_metadata/NvmCacheState
I0403 20:17:23.081306 41022 Device.cpp:1080] Cache file: /dev/nvme0n1 size: 998579896320 truncate: 0
I0403 20:17:23.081369 41022 Device.cpp:965] Created device with num_devices 1 size 998579896320 block_size 4096,stripe_size 0 max_write_size 1048576 max_io_size 1048576 io_engine io_uring qdepth 1,num_fdp_devices 0
I0403 20:17:23.144655 41022 NavySetup.cpp:243] metadataSize: 4992897024
I0403 20:17:23.144669 41022 NavySetup.cpp:245] Setting up engine pair 0
I0403 20:17:23.144679 41022 NavySetup.cpp:111] bighashStartingLimit: 4992897024 bigHashCacheOffset: 958636703744 bigHashCacheSize: 39943192576
I0403 20:17:23.144685 41022 NavySetup.cpp:259] blockCacheSize 953643806720
I0403 20:17:23.144690 41022 NavySetup.cpp:156] blockcache: starting offset: 4992897024, block cache size: 953633734656
I0403 20:17:23.144709 41022 FifoPolicy.cpp:37] FIFO policy
I0403 20:17:23.170003 41022 BigHash.cpp:93] BigHash created: buckets: 9751756, bucket size: 4096, base offset: 958636703744
I0403 20:17:23.170014 41022 BigHash.cpp:102] Reset BigHash
I0403 20:17:23.178391 41022 RegionManager.cpp:50] 56841 regions, 16777216 bytes each
I0403 20:17:23.190929 41138 RegionManager.cpp:68] region_manager_0 started
I0403 20:17:23.210001 41022 Allocator.cpp:39] Enable priority-based allocation for Allocator. Number of priorities: 1
I0403 20:17:23.210046 41022 BlockCache.cpp:145] Block cache created
I0403 20:17:23.210182 41022 Driver.cpp:70] Max concurrent inserts: 1000000
I0403 20:17:23.210189 41022 Driver.cpp:71] Max parcel memory: 6341787648
I0403 20:17:23.210196 41022 Driver.cpp:72] Use Write Estimated Size: false
I0403 20:17:23.210205 41022 Driver.cpp:209] Reset Navy
I0403 20:17:23.210222 41022 BigHash.cpp:102] Reset BigHash
I0403 20:17:23.214526 41022 BlockCache.cpp:705] Reset block cache
Total 1000000.00M ops to be run
E0403 20:17:23.504747 41143 Cache-inl.h:27] Exception fetching nand writes for nvme0n1. Msg: Vendor not recogized in device model number fadu echo e1.s 3.84tb
20:17:23       0.00M ops completed. Hit Ratio   0.00% (RAM   0.00%, NVM   0.00%)
I0403 20:17:53.281225 41141 main.cpp:92] Stopping due to timeout 30 seconds
E0403 20:17:53.534372 41022 Cache-inl.h:27] Exception fetching nand writes for nvme0n1. Msg: Vendor not recogized in device model number fadu echo e1.s 3.84tb
== Test Results ==
== Allocator Stats ==
Items in RAM  : 1,111,979
Items in NVM  : 0
Alloc Attempts: 1,529,370 Success: 100.00%
Evict Attempts: 0 Success: 0.00%
RAM Evictions : 0
Fraction of pool 0 used : 0.04
Cache Gets    : 14,895,979
Hit Ratio     :  12.06%
RAM Hit Ratio :  12.04%
NVM Hit Ratio :   0.02%
RAM eviction rejects expiry : 0
RAM eviction rejects clean : 0
NVM Read  Latency    p50      :       0.00 us
NVM Read  Latency    p90      :       0.00 us
NVM Read  Latency    p99      :       0.00 us
NVM Read  Latency    p999     :       0.00 us
NVM Read  Latency    p9999    :       0.00 us
NVM Read  Latency    p99999   :       0.00 us
NVM Read  Latency    p999999  :       0.00 us
NVM Read  Latency    p100     :       0.00 us
NVM Write Latency    p50      :       0.00 us
NVM Write Latency    p90      :       0.00 us
NVM Write Latency    p99      :       0.00 us
NVM Write Latency    p999     :       0.00 us
NVM Write Latency    p9999    :       0.00 us
NVM Write Latency    p99999   :       0.00 us
NVM Write Latency    p999999  :       0.00 us
NVM Write Latency    p100     :       0.00 us
NVM bytes written (physical)  :   0.00 GB
NVM bytes written (logical)   :   0.00 GB
NVM bytes written (nand)      :   0.00 GB
NVM app write amplification   :   0.00
NVM dev write amplification   :   0.00
NVM Gets      :      13,102,692, Coalesced :   0.00%
NVM Puts      :               0, Success   : 100.00%, Clean   :   0.00%, AbortsFromDel   :        0, AbortsFromGet   :        0
NVM Evicts    :               0, Clean     :   0.00%, Unclean :       0, Double          :        0
NVM Deletes   :       1,253,922 Skipped Deletes: 100.00%

== Throughput for  ==
Total Ops : 16.52 million
Total sets: 1,529,370
get       :   496,024/s, success   :  12.04%
couldExist:         0/s, success   :   0.00%
set       :    50,926/s, success   : 100.00%
del       :     3,204/s, found     :   2.70%

== KVReplayGenerator Stats ==
Total Processed Samples: 0.08 million (parse error: 0)

I0403 20:17:53.534984 41022 BigHash.cpp:514] Flush big hash
I0403 20:17:53.535011 41022 BlockCache.cpp:699] Flush block cache
I0403 20:17:53.535034 41022 BlockCache.cpp:793] Starting block cache persist
F0403 20:17:53.654948 41022 Device.cpp:761] Check failed: !useIoUring_ && !(fdpNvmeVec_.size() > 0)
*** Aborted at 1712175473 (Unix time, try 'date -d @1712175473') ***
*** Signal 6 (SIGABRT) (0xa03e) received by PID 41022 (pthread TID 0x7fc82f0424c0) (linux TID 41022) (maybe from PID 41022, UID 0) (code: -6), stack trace: ***
    @ 0000000000d3bd3e folly::symbolizer::(anonymous namespace)::innerSignalHandler(int, siginfo_t*, void*)
                       /home/jm/CacheLib/cachelib/external/folly/folly/experimental/symbolizer/SignalHandler.cpp:449
    @ 0000000000d3be24 folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
                       /home/jm/CacheLib/cachelib/external/folly/folly/experimental/symbolizer/SignalHandler.cpp:470
    @ 000000000004251f (unknown)
    @ 00000000000969fc pthread_kill
    @ 0000000000042475 raise
    @ 00000000000287f2 abort
    @ 0000000000f3c23f folly::LogCategory::admitMessage(folly::LogMessage const&) const
                       /home/jm/CacheLib/cachelib/external/folly/folly/logging/LogCategory.cpp:71
    @ 0000000000f5ea4c folly::LogStreamProcessor::logNow()
                       /home/jm/CacheLib/cachelib/external/folly/folly/logging/LogStreamProcessor.cpp:190
    @ 0000000000f5ebbd folly::LogStreamVoidify<true>::operator&(std::ostream&)
                       /home/jm/CacheLib/cachelib/external/folly/folly/logging/LogStreamProcessor.cpp:222
    @ 00000000008aed77 facebook::cachelib::navy::(anonymous namespace)::AsyncIoContext::AsyncIoContext(std::unique_ptr<folly::AsyncBase, std::default_delete<folly::AsyncBase> >&&, unsigned long, folly::EventBase*, unsigned long, bool, std::vector<std::shared_ptr<facebook::cachelib::navy::FdpNvme>, std::allocator<std::shared_ptr<facebook::cachelib::navy::FdpNvme> > >)
                       /home/jm/CacheLib/cachelib/navy/common/Device.cpp:761
    @ 00000000008b3c08 facebook::cachelib::navy::(anonymous namespace)::FileDevice::getIoContext()
                       /home/jm/CacheLib/cachelib/navy/common/Device.cpp:1049
    @ 00000000008b30f2 facebook::cachelib::navy::(anonymous namespace)::FileDevice::writeImpl(unsigned long, unsigned int, void const*, int)
                       /home/jm/CacheLib/cachelib/navy/common/Device.cpp:986
    @ 00000000008aa3cd facebook::cachelib::navy::Device::writeInternal(unsigned long, unsigned char const*, unsigned long, int)
                       /home/jm/CacheLib/cachelib/navy/common/Device.cpp:424
    @ 00000000008a9c6c facebook::cachelib::navy::Device::write(unsigned long, facebook::cachelib::navy::Buffer, int)
                       /home/jm/CacheLib/cachelib/navy/common/Device.cpp:408
    @ 00000000009d478a facebook::cachelib::navy::(anonymous namespace)::DeviceMetaDataWriter::writeRecord(std::unique_ptr<folly::IOBuf, std::default_delete<folly::IOBuf> >)::{lambda()#1}::operator()()
                       /home/jm/CacheLib/cachelib/navy/serialization/RecordIO.cpp:114
    @ 00000000009d49e2 facebook::cachelib::navy::(anonymous namespace)::DeviceMetaDataWriter::writeRecord(std::unique_ptr<folly::IOBuf, std::default_delete<folly::IOBuf> >)
                       /home/jm/CacheLib/cachelib/navy/serialization/RecordIO.cpp:126
    @ 00000000009ad7a0 void facebook::cachelib::serializeProto<facebook::cachelib::navy::serialization::RegionData, apache::thrift::Serializer<apache::thrift::BinaryProtocolReader, apache::thrift::BinaryProtocolWriter> >(facebook::cachelib::navy::serialization::RegionData const&, facebook::cachelib::RecordWriter&)
                       /home/jm/CacheLib/cachelib/../cachelib/common/Serialization.h:191
                       -> /home/jm/CacheLib/cachelib/navy/block_cache/RegionManager.cpp
    @ 00000000009ab896 void facebook::cachelib::navy::serializeProto<facebook::cachelib::navy::serialization::RegionData>(facebook::cachelib::navy::serialization::RegionData const&, facebook::cachelib::RecordWriter&)
                       /home/jm/CacheLib/cachelib/../cachelib/navy/serialization/Serialization.h:32
                       -> /home/jm/CacheLib/cachelib/navy/block_cache/RegionManager.cpp
    @ 00000000009a3dc0 facebook::cachelib::navy::RegionManager::persist(facebook::cachelib::RecordWriter&) const
                       /home/jm/CacheLib/cachelib/navy/block_cache/RegionManager.cpp:466
    @ 000000000097f3f2 facebook::cachelib::navy::BlockCache::persist(facebook::cachelib::RecordWriter&)
                       /home/jm/CacheLib/cachelib/navy/block_cache/BlockCache.cpp:801
    @ 00000000009ceda3 facebook::cachelib::navy::EnginePair::persist(facebook::cachelib::RecordWriter&) const
                       /home/jm/CacheLib/cachelib/navy/engine/EnginePair.cpp:255
    @ 00000000009c9741 facebook::cachelib::navy::Driver::persist() const
                       /home/jm/CacheLib/cachelib/navy/driver/Driver.cpp:223
    @ 00000000003cc5f8 facebook::cachelib::NvmCache<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> >::shutDown()
                       /home/jm/CacheLib/cachelib/../cachelib/allocator/nvmcache/NvmCache-inl.h:883
                       -> /home/jm/CacheLib/cachelib/allocator/CacheAllocator.cpp
    @ 00000000003161a9 facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait>::saveNvmCache()
                       /home/jm/CacheLib/cachelib/../cachelib/allocator/CacheAllocator-inl.h:3046
                       -> /home/jm/CacheLib/cachelib/allocator/CacheAllocator.cpp
    @ 00000000002fc2b3 facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait>::shutDown()
                       /home/jm/CacheLib/cachelib/../cachelib/allocator/CacheAllocator-inl.h:3007
                       -> /home/jm/CacheLib/cachelib/allocator/CacheAllocator.cpp
    @ 00000000001d7e1e facebook::cachelib::cachebench::Cache<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> >::~Cache()
                       /home/jm/CacheLib/cachelib/../cachelib/cachebench/cache/Cache-inl.h:319
                       -> /home/jm/CacheLib/cachelib/cachebench/runner/Stressor.cpp
    @ 00000000001c039d std::default_delete<facebook::cachelib::cachebench::Cache<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> > >::operator()(facebook::cachelib::cachebench::Cache<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> >*) const
                       /usr/include/c++/11/bits/unique_ptr.h:85
                       -> /home/jm/CacheLib/cachelib/cachebench/runner/Stressor.cpp
    @ 00000000001aff8d std::unique_ptr<facebook::cachelib::cachebench::Cache<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> >, std::default_delete<facebook::cachelib::cachebench::Cache<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> > > >::~unique_ptr()
                       /usr/include/c++/11/bits/unique_ptr.h:361
                       -> /home/jm/CacheLib/cachelib/cachebench/runner/Stressor.cpp
    @ 00000000001b3a41 facebook::cachelib::cachebench::CacheStressor<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> >::~CacheStressor()
                       /home/jm/CacheLib/cachelib/../cachelib/cachebench/runner/CacheStressor.h:134
                       -> /home/jm/CacheLib/cachelib/cachebench/runner/Stressor.cpp
    @ 00000000001b3ac5 facebook::cachelib::cachebench::CacheStressor<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> >::~CacheStressor()
                       /home/jm/CacheLib/cachelib/../cachelib/cachebench/runner/CacheStressor.h:134
                       -> /home/jm/CacheLib/cachelib/cachebench/runner/Stressor.cpp
    @ 0000000000158ff9 std::default_delete<facebook::cachelib::cachebench::Stressor>::operator()(facebook::cachelib::cachebench::Stressor*) const
                       /usr/include/c++/11/bits/unique_ptr.h:85
                       -> /home/jm/CacheLib/cachelib/cachebench/main.cpp
    @ 00000000001638ad std::__uniq_ptr_impl<facebook::cachelib::cachebench::Stressor, std::default_delete<facebook::cachelib::cachebench::Stressor> >::reset(facebook::cachelib::cachebench::Stressor*)
                       /usr/include/c++/11/bits/unique_ptr.h:182
                       -> /home/jm/CacheLib/cachelib/cachebench/runner/Runner.cpp
    @ 00000000001619ee std::unique_ptr<facebook::cachelib::cachebench::Stressor, std::default_delete<facebook::cachelib::cachebench::Stressor> >::reset(facebook::cachelib::cachebench::Stressor*)
                       /usr/include/c++/11/bits/unique_ptr.h:456
                       -> /home/jm/CacheLib/cachelib/cachebench/runner/Runner.cpp
    @ 000000000015a902 facebook::cachelib::cachebench::Runner::run(std::chrono::duration<long, std::ratio<1l, 1l> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
                       /home/jm/CacheLib/cachelib/cachebench/runner/Runner.cpp:54
    @ 000000000014f5ea main
                       /home/jm/CacheLib/cachelib/cachebench/main.cpp:159
    @ 0000000000029d8f (unknown)
    @ 0000000000029e3f __libc_start_main
    @ 000000000014e764 _start
Aborted
arungeorge83 commented 5 months ago

This is a case where the liburing library (which the FDP support depends on for now) is not available on the system, and the cachelib build system chose not to install it for some reason. Try installing liburing with 'yum install liburing' or by building it from https://github.com/axboe/liburing. Btw, which kernel version are you using? The io_uring passthru support needs a kernel of at least 6.1.x.
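(A quick sanity check for both prerequisites, as a sketch:)

    uname -r                      # io_uring NVMe passthru needs a 6.1+ kernel
    ldconfig -p | grep liburing   # is a liburing shared object registered?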

jmhands commented 5 months ago

I'm using Ubuntu 22.04.4 LTS with the HWE kernel, 6.5.0-26-generic. io_uring works fine in fio, etc. Installing it via sudo apt install liburing-dev or sudo apt install liburing2 doesn't help.

jaesoo-fb commented 5 months ago

@jmhands sudo apt-get install liburing-dev should have enabled io_uring, and CACHELIB_IOURING_DISABLE should not be defined here. FYI, this is the cmake rule that finds the required io_uring libraries and headers.

Please take a look at the cmake output for cachelib for any issues.
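(Some ways to confirm what the build actually sees; the CMakeCache path is illustrative for this build tree:)

    pkg-config --modversion liburing   # liburing.pc is installed by liburing-dev / make install
    grep -i uring /home/jm/CacheLib/build-folly/CMakeCache.txt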

jmhands commented 5 months ago

After installing liburing-dev and running ./contrib/build.sh -d -j -v, it fails while compiling folly:


      |                      IORING_SETUP_CQSIZE
make[2]: *** [CMakeFiles/folly_base.dir/build.make:1714: CMakeFiles/folly_base.dir/folly/experimental/io/IoUring.cpp.o] Error 1
make[2]: *** [CMakeFiles/folly_base.dir/build.make:1770: CMakeFiles/folly_base.dir/folly/experimental/io/IoUringProvidedBufferRing.cpp.o] Error 1
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp: In member function ‘void folly::IoUringBackend::initSubmissionLinked()’:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:1103:20: error: ISO C++ forbids declaration of ‘IoUringProvidedBufferRing’ with no type [-fpermissive]
 1103 |     } catch (const IoUringProvidedBufferRing::LibUringCallError& ex) {
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:1103:45: error: expected ‘)’ before ‘::’ token
 1103 |     } catch (const IoUringProvidedBufferRing::LibUringCallError& ex) {
      |             ~                               ^~
      |                                             )
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:1103:45: error: expected ‘{’ before ‘::’ token
 1103 |     } catch (const IoUringProvidedBufferRing::LibUringCallError& ex) {
      |                                             ^~
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:1103:47: error: ‘::LibUringCallError’ has not been declared
 1103 |     } catch (const IoUringProvidedBufferRing::LibUringCallError& ex) {
      |                                               ^~~~~~~~~~~~~~~~~
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:1103:66: error: ‘ex’ was not declared in this scope; did you mean ‘exp’?
 1103 |     } catch (const IoUringProvidedBufferRing::LibUringCallError& ex) {
      |                                                                  ^~
      |                                                                  exp
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp: In member function ‘void folly::IoUringBackend::cancel(folly::IoSqeBase*)’:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:1396:5: error: ‘::io_uring_prep_cancel64’ has not been declared; did you mean ‘io_uring_prep_cancel’?
 1396 |   ::io_uring_prep_cancel64(sqe, (uint64_t)ioSqe, 0);
      |     ^~~~~~~~~~~~~~~~~~~~~~
      |     io_uring_prep_cancel
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp: In member function ‘int folly::IoUringBackend::submitBusyCheck(int, folly::IoUringBackend::WaitForEventsMode)’:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:1617:19: error: ‘::io_uring_submit_and_wait_timeout’ has not been declared; did you mean ‘io_uring_submit_and_wait’?
 1617 |           res = ::io_uring_submit_and_wait_timeout(
      |                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                   io_uring_submit_and_wait
In file included from /home/jm/CacheLib/cachelib/external/folly/folly/GLog.h:24,
                 from /home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:21:
/home/jm/CacheLib/opt/cachelib/include/glog/logging.h: In instantiation of ‘void google::MakeCheckOpValueString(std::ostream*, const T&) [with T = void(void*); std::ostream = std::basic_ostream<char>]’:
/home/jm/CacheLib/opt/cachelib/include/glog/logging.h:786:25:   required from ‘std::string* google::MakeCheckOpString(const T1&, const T2&, const char*) [with T1 = void (*)(void*); T2 = void(void*); std::string = std::__cxx11::basic_string<char>]’
/home/jm/CacheLib/opt/cachelib/include/glog/logging.h:809:1:   required from ‘std::string* google::Check_EQImpl(const T1&, const T2&, const char*) [with T1 = void (*)(void*); T2 = void(void*); std::string = std::__cxx11::basic_string<char>]’
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:805:5:   required from here
/home/jm/CacheLib/opt/cachelib/include/glog/logging.h:723:9: warning: the compiler can assume that the address of ‘v’ will never be NULL [-Waddress]
  723 |   (*os) << v;
      |   ~~~~~~^~~~
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp: In member function ‘virtual void folly::AsyncIoUringSocket::ReadSqe::processSubmit(io_uring_sqe*)’:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp:714:26: error: ‘IORING_RECV_MULTISHOT’ was not declared in this scope
  714 |           ioprio_flags = IORING_RECV_MULTISHOT;
      |                          ^~~~~~~~~~~~~~~~~~~~~
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp: In member function ‘void folly::{anonymous}::SignalRegistry::notify(int)’:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:112:14: warning: ignoring return value of ‘ssize_t write(int, const void*, size_t)’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  112 |       ::write(fd, &sigNum, 1);
      |       ~~~~~~~^~~~~~~~~~~~~~~~
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp: In member function ‘virtual void folly::AsyncIoUringSocket::WriteSqe::processSubmit(io_uring_sqe*)’:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp:872:7: error: ‘::io_uring_prep_sendmsg_zc’ has not been declared; did you mean ‘io_uring_prep_sendmsg’?
  872 |     ::io_uring_prep_sendmsg_zc(
      |       ^~~~~~~~~~~~~~~~~~~~~~~~
      |       io_uring_prep_sendmsg
make[2]: *** [CMakeFiles/folly_base.dir/build.make:1728: CMakeFiles/folly_base.dir/folly/experimental/io/IoUringBackend.cpp.o] Error 1
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp: In lambda function:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp:1277:17: error: ‘IORING_CQE_F_NOTIF’ was not declared in this scope; did you mean ‘IORING_CQE_F_MORE’?
 1277 |     if (flags & IORING_CQE_F_NOTIF) {
      |                 ^~~~~~~~~~~~~~~~~~
      |                 IORING_CQE_F_MORE
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp: In member function ‘virtual void folly::AsyncIoUringSocket::WriteSqe::callbackCancelled(const io_uring_cqe*)’:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp:1296:38: error: ‘IORING_CQE_F_NOTIF’ was not declared in this scope; did you mean ‘IORING_CQE_F_MORE’?
 1296 |           << " notif=" << !!(flags & IORING_CQE_F_NOTIF);
      |                                      ^~~~~~~~~~~~~~~~~~
      |                                      IORING_CQE_F_MORE
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp: In member function ‘virtual void folly::AsyncIoUringSocket::WriteSqe::callback(const io_uring_cqe*)’:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp:1313:38: error: ‘IORING_CQE_F_NOTIF’ was not declared in this scope; did you mean ‘IORING_CQE_F_MORE’?
 1313 |           << " notif=" << !!(flags & IORING_CQE_F_NOTIF)
      |                                      ^~~~~~~~~~~~~~~~~~
      |                                      IORING_CQE_F_MORE
make[2]: *** [CMakeFiles/folly_base.dir/build.make:1644: CMakeFiles/folly_base.dir/folly/experimental/io/AsyncIoUringSocket.cpp.o] Error 1
make[2]: Leaving directory '/home/jm/CacheLib/build-folly'
make[1]: *** [CMakeFiles/Makefile2:145: CMakeFiles/folly_base.dir/all] Error 2
make[1]: Leaving directory '/home/jm/CacheLib/build-folly'
make: *** [Makefile:136: all] Error 2
build-package.sh: error: make failed
build.sh: error: failed to build dependency 'folly'
arungeorge83 commented 5 months ago

Building and installing liburing from source looks to be working.

The following is a method which works for FDP:

git clone https://github.com/axboe/liburing.git
# follow the build process in the README: ./configure && make && make install

git clone https://github.com/facebook/CacheLib.git
git checkout tags/v20240320_stable
# build, e.g.: sudo ./contrib/build.sh -j -v -d

jmhands commented 5 months ago

After building liburing from source it still fails at folly. I can get it further by symlinking (sudo ln -sf /usr/lib/x86_64-linux-gnu/liburing.so.2 /usr/lib/x86_64-linux-gnu/liburing.so), but there is still something weird:

jm@z690ace:~/liburing$ sudo make install
sed -e "s%@prefix@%/usr%g" \
    -e "s%@libdir@%/usr/lib%g" \
    -e "s%@includedir@%/usr/include%g" \
    -e "s%@NAME@%liburing%g" \
    -e "s%@VERSION@%2.6%g" \
    liburing.pc.in >liburing.pc
sed -e "s%@prefix@%/usr%g" \
    -e "s%@libdir@%/usr/lib%g" \
    -e "s%@includedir@%/usr/include%g" \
    -e "s%@NAME@%liburing%g" \
    -e "s%@VERSION@%2.6%g" \
    liburing-ffi.pc.in >liburing-ffi.pc
make[1]: Entering directory '/home/jm/liburing/src'
install -D -m 644 include/liburing/io_uring.h /usr/include/liburing/io_uring.h
install -D -m 644 include/liburing.h /usr/include/liburing.h
install -D -m 644 include/liburing/compat.h /usr/include/liburing/compat.h
install -D -m 644 include/liburing/barrier.h /usr/include/liburing/barrier.h
install -D -m 644 include/liburing/io_uring_version.h /usr/include/liburing/io_uring_version.h
install -D -m 644 liburing.a /usr/lib/liburing.a
install -D -m 644 liburing-ffi.a /usr/lib/liburing-ffi.a
install -D -m 755 liburing.so.2.6 /usr/lib/liburing.so.2.6
install -D -m 755 liburing-ffi.so.2.6 /usr/lib/liburing-ffi.so.2.6
ln -sf liburing.so.2.6 /usr/lib/liburing.so.2
ln -sf liburing.so.2.6 /usr/lib/liburing.so
ln -sf liburing-ffi.so.2.6 /usr/lib/liburing-ffi.so.2
ln -sf liburing-ffi.so.2.6 /usr/lib/liburing-ffi.so
make[1]: Leaving directory '/home/jm/liburing/src'
install -D -m 644 liburing.pc /usr/lib/pkgconfig/liburing.pc
install -D -m 644 liburing-ffi.pc /usr/lib/pkgconfig/liburing-ffi.pc
install -m 755 -d /usr/man/man2
install -m 644 man/*.2 /usr/man/man2
install -m 755 -d /usr/man/man3
install -m 644 man/*.3 /usr/man/man3
install -m 755 -d /usr/man/man7
install -m 644 man/*.7 /usr/man/man7
jm@z690ace:~/liburing$ locate liburing.so
/home/jm/liburing/src/liburing.so.2.6
/snap/lxd/27037/lib/liburing.so
/snap/lxd/27037/lib/liburing.so.2
/snap/lxd/27037/lib/liburing.so.2.5
/snap/lxd/27948/lib/liburing.so
/snap/lxd/27948/lib/liburing.so.2
/snap/lxd/27948/lib/liburing.so.2.5
/usr/lib/liburing.so
/usr/lib/liburing.so.2
/usr/lib/liburing.so.2.6
/usr/lib/x86_64-linux-gnu/liburing.so.2
/usr/lib/x86_64-linux-gnu/liburing.so.2.1.0

jm@z690ace:~/CacheLib$ sudo ./contrib/build.sh -j -v -d

fails here
...
[ 94%] Built target folly_base
make  -f CMakeFiles/folly.dir/build.make CMakeFiles/folly.dir/depend
make[2]: Entering directory '/home/jm/CacheLib/build-folly'
cd /home/jm/CacheLib/build-folly && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /home/jm/CacheLib/cachelib/external/folly /home/jm/CacheLib/cachelib/external/folly /home/jm/CacheLib/build-folly /home/jm/CacheLib/build-folly /home/jm/CacheLib/build-folly/CMakeFiles/folly.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/home/jm/CacheLib/build-folly'
make  -f CMakeFiles/folly.dir/build.make CMakeFiles/folly.dir/build
make[2]: Entering directory '/home/jm/CacheLib/build-folly'
make[2]: *** No rule to make target '/usr/lib/x86_64-linux-gnu/liburing.so', needed by 'libfolly.so.0.58.0-dev'.  Stop.
make[2]: Leaving directory '/home/jm/CacheLib/build-folly'
make[1]: *** [CMakeFiles/Makefile2:171: CMakeFiles/folly.dir/all] Error 2
make[1]: Leaving directory '/home/jm/CacheLib/build-folly'
make: *** [Makefile:136: all] Error 2
build-package.sh: error: make failed
build.sh: error: failed to build dependency 'folly'

after linking

[ 97%] Built target follybenchmark
cd /home/jm/CacheLib/build-folly/folly/experimental/exception_tracer && /usr/bin/cmake -E cmake_symlink_library libfolly_exception_tracer.so.0.58.0-dev libfolly_exception_tracer.so.0.58.0-dev libfolly_exception_tracer.so
/usr/bin/ld: ../../../libfolly.so.0.58.0-dev: undefined reference to `io_uring_unregister_buf_ring'
/usr/bin/ld: ../../../libfolly.so.0.58.0-dev: undefined reference to `io_uring_register_buf_ring'
/usr/bin/ld: ../../../libfolly.so.0.58.0-dev: undefined reference to `io_uring_get_events'
/usr/bin/ld: ../../../libfolly.so.0.58.0-dev: undefined reference to `io_uring_register_ring_fd'
/usr/bin/ld: ../../../libfolly.so.0.58.0-dev: undefined reference to `io_uring_setup'
/usr/bin/ld: ../../../libfolly.so.0.58.0-dev: undefined reference to `io_uring_register'
/usr/bin/ld: ../../../libfolly.so.0.58.0-dev: undefined reference to `io_uring_submit_and_get_events'
make[2]: Leaving directory '/home/jm/CacheLib/build-folly'
/usr/bin/ld: ../../../libfolly.so.0.58.0-dev: undefined reference to `io_uring_submit_and_wait_timeout'
[ 98%] Built target folly_exception_tracer
make  -f folly/experimental/exception_tracer/CMakeFiles/folly_exception_counter.dir/build.make folly/experimental/exception_tracer/CMakeFiles/folly_exception_counter.dir/depend
make[2]: Entering directory '/home/jm/CacheLib/build-folly'
cd /home/jm/CacheLib/build-folly && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /home/jm/CacheLib/cachelib/external/folly /home/jm/CacheLib/cachelib/external/folly/folly/experimental/exception_tracer /home/jm/CacheLib/build-folly /home/jm/CacheLib/build-folly/folly/experimental/exception_tracer /home/jm/CacheLib/build-folly/folly/experimental/exception_tracer/CMakeFiles/folly_exception_counter.dir/DependInfo.cmake --color=
collect2: error: ld returned 1 exit status
make[2]: *** [folly/logging/example/CMakeFiles/logging_example.dir/build.make:125: folly/logging/example/logging_example] Error 1
make[2]: Leaving directory '/home/jm/CacheLib/build-folly'
make[1]: *** [CMakeFiles/Makefile2:331: folly/logging/example/CMakeFiles/logging_example.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
Dependencies file "folly/experimental/exception_tracer/CMakeFiles/folly_exception_counter.dir/ExceptionCounterLib.cpp.o.d" is newer than depends file "/home/jm/CacheLib/build-folly/folly/experimental/exception_tracer/CMakeFiles/folly_exception_counter.dir/compiler_depend.internal".
Consolidate compiler generated dependencies of target folly_exception_counter
make[2]: Leaving directory '/home/jm/CacheLib/build-folly'
make  -f folly/experimental/exception_tracer/CMakeFiles/folly_exception_counter.dir/build.make folly/experimental/exception_tracer/CMakeFiles/folly_exception_counter.dir/build
make[2]: Entering directory '/home/jm/CacheLib/build-folly'
[ 98%] Linking CXX shared library libfolly_exception_counter.so
cd /home/jm/CacheLib/build-folly/folly/experimental/exception_tracer && /usr/bin/cmake -E cmake_link_script CMakeFiles/folly_exception_counter.dir/link.txt --verbose=YES
/usr/bin/c++ -fPIC -g -g -Wall -Wextra -shared -Wl,-soname,libfolly_exception_counter.so.0.58.0-dev -o libfolly_exception_counter.so.0.58.0-dev CMakeFiles/folly_exception_counter.dir/ExceptionCounterLib.cpp.o  -Wl,-rpath,/home/jm/CacheLib/build-folly/folly/experimental/exception_tracer:/home/jm/CacheLib/build-folly:/home/jm/CacheLib/opt/cachelib/lib: libfolly_exception_tracer.so.0.58.0-dev libfolly_exception_tracer_base.so.0.58.0-dev ../../../libfolly.so.0.58.0-dev /home/jm/CacheLib/opt/cachelib/lib/libfmtd.so.10.2.1 /usr/lib/x86_64-linux-gnu/libboost_context.so.1.74.0 /usr/lib/x86_64-linux-gnu/libboost_filesystem.so.1.74.0 /usr/lib/x86_64-linux-gnu/libboost_program_options.so.1.74.0 /usr/lib/x86_64-linux-gnu/libboost_regex.so.1.74.0 /usr/lib/x86_64-linux-gnu/libboost_system.so.1.74.0 /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.74.0 /usr/lib/x86_64-linux-gnu/libboost_atomic.so.1.74.0 -ldouble-conversion /home/jm/CacheLib/opt/cachelib/lib/libgflags_debug.so.2.2.2 /home/jm/CacheLib/opt/cachelib/lib/libglogd.so -levent -lz -lssl -lcrypto -lbz2 -llzma -llz4 /home/jm/CacheLib/opt/cachelib/lib/libzstd.so -lsnappy -ldwarf -Wl,-Bstatic -liberty -Wl,-Bdynamic -laio -luring -lsodium -ldl -lunwind
cd /home/jm/CacheLib/build-folly/folly/experimental/exception_tracer && /usr/bin/cmake -E cmake_symlink_library libfolly_exception_counter.so.0.58.0-dev libfolly_exception_counter.so.0.58.0-dev libfolly_exception_counter.so
make[2]: Leaving directory '/home/jm/CacheLib/build-folly'
[ 99%] Built target folly_exception_counter
make[1]: Leaving directory '/home/jm/CacheLib/build-folly'
make: *** [Makefile:136: all] Error 2
build-package.sh: error: make failed
build.sh: error: failed to build dependency 'folly'
jaesoo-fb commented 5 months ago

@jmhands Is this error occurring even with a clean build after removing the liburing-dev package?

Yeah, flags like IORING_CQE_F_NOTIF and IORING_RECV_MULTISHOT seem to have been added in liburing-2.3 and kernel v6.0. Not sure what version of liburing-dev is installed by default on Ubuntu 22.04.4 LTS, but it would make sense to build/install from source.

Doesn't liburing have a debian build as well? https://github.com/axboe/liburing/blob/master/make-debs.sh
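(The locate output above shows the distro runtime is liburing.so.2.1.0, i.e. liburing 2.1, which predates those symbols; a quick way to see what apt would install:)

    apt-cache policy liburing-dev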

jmhands commented 4 months ago

I was able to get it working with the following steps

  1. clean Ubuntu 22.04.4 install with HWE kernel 6.5
  2. sudo apt remove liburing2
  3. sudo apt install build-essential
  4. git clone https://github.com/axboe/liburing.git
     cd liburing
     ./configure --cc=gcc --cxx=g++
     make -j$(nproc)
     sudo make install
  5. git clone https://github.com/facebook/CacheLib
     cd CacheLib
     ./contrib/build.sh -d -j -v

     This builds correctly, but then I get an error when I run cachebench:

     bin/cachebench: /lib/x86_64-linux-gnu/liburing.so.2: version `LIBURING_2.2' not found (required by /home/jm/CacheLib/opt/cachelib/bin/../lib/libfolly.so.0.58.0-dev)
     bin/cachebench: /lib/x86_64-linux-gnu/liburing.so.2: version `LIBURING_2.3' not found (required by /home/jm/CacheLib/opt/cachelib/bin/../lib/libfolly.so.0.58.0-dev)

     I was able to resolve it after checking the installed copies (see the runtime-linking sketch after this list):

     ls -l /usr/lib/x86_64-linux-gnu/liburing.so*
     ls -l /usr/lib/liburing.so*

  6. now that cachebench works, fetch the traces:
     sudo apt install awscli
     aws s3 cp --no-sign-request --recursive s3://cachelib-workload-sharing/pub/kvcache/202206/ ./
  7. add these into the config:

       "navyQDepth": 1,
       "navyEnableIoUring": true,
       "deviceEnableFDP": true,
  8. run cachebench with sudo bin/cachebench -json_test_config=test_configs/trace_replay/202206/config_kvcache.json -progress=600 -progress_stats_file=cachebench-progress.log
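(A sketch of how to pin down that versioned-symbol error: the loader was resolving the distro's older liburing.so.2 instead of the freshly built 2.6 copy in /usr/lib. Paths follow the build above:)

    # which liburing does folly resolve at runtime?
    ldd /home/jm/CacheLib/opt/cachelib/lib/libfolly.so.0.58.0-dev | grep uring
    # if it still points at a stale /lib/x86_64-linux-gnu/liburing.so.2, remove
    # that copy (or the package providing it) and refresh the linker cache
    sudo ldconfig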
FletcherAtFADU commented 4 months ago

Hi, @jaesoo-fb

I have a few questions about testing Cachebench after enabling FDP.

1. WAF expectation for KVCache: when running KVCache with FDP enabled, it seems that the host allocates only one placement handle for BlockCache (no BigHash). How can FDP provide an advantage in this scenario compared to non-FDP?

[screenshot]

2. [Error] We saw an IO Error in the log, but the test didn't fail. Looking at the code, it seems like an IO Error shouldn't occur, but I don't quite understand it.

[screenshot]

If the code below is working as intended, this IO error should never happen. Am I missing anything?

[screenshots]

3. [Fatal Error] The test failed due to an out-of-range issue. It seems like the size I issued isn't being counted properly, similar to the IO Error issue. It seems that the write does not increase region.getLastEntryEndOffset() by the size. Could you please check if there's any issue with the code in this part as well? (I encountered this issue when running KVCache with an FDP-enabled device (RUH 1))

[screenshot]

arungeorge83 commented 3 months ago

Hi @FletcherAtFADU,

> 1. WAF Expectation for KVCache

You might have selected a kvcache workload without BH enabled. Could you please check the "navyBigHashSizePct" in the config.json file of the selected workload.

> 2. [Error] We saw an IO Error in the log, but the test didn't fail.
> 3. [Fatal Error] The test failed due to an out-of-range issue.

Could you check "nvmCacheSizeMB" against your device size. It looks like "nvmCacheSizeMB" might exceed the NVMe NS/partition that you have chosen.
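(One way to check the namespace size against the config, as a sketch:)

    # namespace size in bytes; nvmCacheSizeMB * 1024 * 1024 must fit inside it
    blockdev --getsize64 /dev/nvme0n1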

Could you attach the config.json and initialization/run logs of the cachebench. That would help to analyze it better.

FletcherAtFADU commented 3 months ago

Hi @arungeorge83, thank you for the information! Here are the logs and a few queries about your answers.

=> I've checked that the default setting of "navyBigHashSizePct" is 0. In KVCache, to check the difference in WAF between FDP and non-FDP, I think 'navyBigHashSizePct' must absolutely not be zero. Is this correct?

=> nvmCacheSizeMB is set to 932000 (MB), but the device capacity is 1.25TB (1250602278912 bytes), which is larger than nvmCacheSizeMB.

Attachments: stats_240508_113152.log, output_240508_113152 (1).log

arungeorge83 commented 3 months ago

@FletcherAtFADU

> I think 'navyBigHashSizePct' must absolutely not be zero! Is this correct?

Yes, it should be non-zero for BigHash-enabled cases. (I see that you have used test_configs/ssd_perf/kvcache_l2_wc/, which does not have BigHash enabled.) Please use the production traces mentioned at https://cachelib.org/docs/Cache_Library_User_Guides/Cachebench_FB_HW_eval#running-cachebench-with-the-trace-workload for FDP experiments. Or you can use test_configs/ssd_perf/flat_kvcache_reg after changing the device from /dev/md0 to /dev/nvme0n1.

The IO errors look interesting. Does non-FDP mode work fine? FDP mode uses io_uring passthru; could you run some of the examples at https://github.com/axboe/liburing/tree/master/examples and see if the io_uring path is fine on your device and system?

FletcherAtFADU commented 3 months ago

@arungeorge83 We've tested the examples you suggested and haven't encountered any issues. Initially, we set up our environment on CentOS, and we're currently cross-checking to see if there are any issues on Ubuntu.

Additionally, even when an IO Error occurs, the tests continue to run. We are currently enabling BigHash in the KVCache and comparing FDP versus Non-FDP. I'll check the results and get back to you if there are any problems.

gaowayne commented 3 months ago

@arungeorge83 buddy, could you please share your CacheLib test config for FDP SSD?

arungeorge83 commented 3 months ago

@gaowayne Please find the sample FDP config used with kvcache production traces.

{
  "cache_config": {
    "cacheSizeMB": 20000,
    "cacheDir": "/root/cachelib_metadata-1",
    "allocFactor": 1.08,
    "maxAllocSize": 524288,
    "minAllocSize": 64,
    "navyReaderThreads": 72,
    "navyWriterThreads": 36,
    "nvmCachePaths": ["/dev/nvme0n1"],
    "nvmCacheSizeMB": 878700,
    "writeAmpDeviceList": ["nvme0n1"],
    "navyBigHashBucketSize": 4096,
    "navyBigHashSizePct": 4,
    "navySmallItemMaxSize": 640,
    "navySegmentedFifoSegmentRatio": [1.0],
    "navyHitsReinsertionThreshold": 1,
    "navyBlockSize": 4096,
    "deviceMaxWriteSize": 262144,
    "nvmAdmissionRetentionTimeThreshold": 7200,
    "navyParcelMemoryMB": 6048,
    "enableChainedItem": true,
    "htBucketPower": 29,
    "navyQDepth": 1,
    "navyEnableIoUring": true,
    "deviceEnableFDP": true,
    "moveOnSlabRelease": false,
    "poolRebalanceIntervalSec": 2,
    "poolResizeIntervalSec": 2,
    "rebalanceStrategy": "hits"
  },
  "test_config": {
    "opRatePerSec": 1000000,
    "opRateBurstSize": 200,
    "enableLookaside": false,
    "generator": "replay",
    "replayGeneratorConfig": {
      "ampFactor": 200
    },
    "repeatTraceReplay": true,
    "repeatOpCount": true,
    "onlySetIfMiss": false,
    "numOps": 100000000000,
    "numThreads": 10,
    "prepopulateCache": true,
    "traceFileNames": [
      "kvcache_traces_1.csv",
      "kvcache_traces_2.csv",
      "kvcache_traces_3.csv",
      "kvcache_traces_4.csv",
      "kvcache_traces_5.csv"
    ]
  }
}
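(Saved to a file, e.g. fdp_kvcache.json (name illustrative), this can be run the same way as earlier in the thread:)

    sudo bin/cachebench -json_test_config=fdp_kvcache.json -progress=600 -progress_stats_file=cachebench-progress.log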

FletcherAtFADU commented 3 months ago

@arungeorge83 @gaowayne Thanks to your help, we achieved proper evaluation results after enabling FDP. Using the KV Trace evaluation with arungeorge83's provided config worked perfectly.

[screenshot]

There is still one thing that I do not understand. We faced a fatal error due to an out-of-range issue, which was resolved by reducing nvmCacheSizeMB. However, we are using a device with a size of 1.25TB (1250602278912 bytes)! The default "nvmCacheSizeMB" of 932000 works well in non-FDP mode but causes errors in FDP mode, while the new config with "nvmCacheSizeMB" set to 878700 works fine in FDP mode.

One more question: if the device supports up to 8 RUs per NS and multiple namespaces as well, can we adjust the parameters to allocate more RUs per NS, or set it up to effectively demonstrate the benefits of using multiple namespaces?

arungeorge83 commented 3 months ago

@FletcherAtFADU It is great to know that you are able to re-produce the results.

> Can we adjust the parameters to allocate more RUs per NS or set it up to effectively demonstrate the benefits of using multiple namespaces?

The current code does not support that, though a configurable RUH allocation mechanism is under consideration.

> We faced a fatal error due to an out-of-range issue, which was resolved by reducing the nvmCacheSizeMB.

Interesting. We were able to test with the full capacity of the device. Just curious, is this issue somehow related to the number of RGs and the RU size of the FDP device that you are using?

gaowayne commented 2 months ago

> Thanks to your help, we achieved proper evaluation results after enabling FDP. Using the KV Trace evaluation with arungeorge83's provided config worked perfectly. [...]

@arungeorge83 buddy, may I know: if I try to build, can I start from here? https://github.com/arungeorge83/CacheLib/tree/fdp/fdp_upstream_PR2

arungeorge83 commented 2 months ago

> @arungeorge83 buddy, may I know: if I try to build, can I start from here? https://github.com/arungeorge83/CacheLib/tree/fdp/fdp_upstream_PR2

@gaowayne Yes. And you can use the latest working code from the main branch as well; that PR was already merged.

gaowayne commented 2 months ago

@arungeorge83 thank you so much, man. I tried to build CacheLib, but hit this on my OS. :(

FM2CV704-CYP16 ~/wayne/CacheLib
# ./contrib/build.sh -d -j -v
build.sh: error: No build recipe for detected operating system 'rhel9.0'
MaisenbacherD commented 1 month ago

I was seeing the same IO Errors that @FletcherAtFADU reported in this thread. The issue is that the deviceMaxWriteSize parameter's default value does not line up with blk_rq_get_max_segments for a given request queue (see https://github.com/torvalds/linux/blob/1613e604df0cd359cf2a7fbd9be7a0bcfacfabd0/block/blk-merge.c#L619). For NVMe over PCIe this limit is set to 128 segments (https://github.com/torvalds/linux/blob/afcd48134c58d6af45fb3fdb648f1260b20f2326/drivers/nvme/host/pci.c#L2992C27-L2992C40), which results in 524288 bytes (NVME_MAX_SEGS (128) * PAGE_SIZE (4096)) if the segments are completely disjoint. One can adjust deviceMaxWriteSize in the config json accordingly.

However, it would possibly be better to either query the device capabilities when setting deviceMaxWriteSize or decrease the default value of deviceMaxWriteSize to 128*4096 bytes. There is an approach to increase deviceMaxWriteSize without getting back an -EINVAL from the kernel: if the NVMe device supports SGLs, one could make sure that the pages for I/O are in contiguous memory and therefore less scattered, so the number of segments for a single request is not as high anymore.
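(Concretely, the per-config workaround is to cap the write size at the PCIe NVMe limit derived above, 128 segments * 4096 bytes:)

    "deviceMaxWriteSize": 524288,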

@jaesoo-fb Do you think this is an actual issue worth investigating a bit more to post a patch?

therealgymmy commented 1 month ago

@MaisenbacherD: yes. We'd really appreciate it if you could send out a patch for this. It'd be good to detect the right setting for deviceMaxWriteSize on startup.
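
One possible way to derive a safe bound at startup, sketched in shell against the standard block-layer sysfs attributes (which formula Navy should actually adopt is exactly what the patch would decide):

DEV=nvme0n1
SEGS=$(cat /sys/block/$DEV/queue/max_segments)        # per-request segment limit
HW_KB=$(cat /sys/block/$DEV/queue/max_hw_sectors_kb)  # hardware request size limit, KiB
PAGE=$(getconf PAGESIZE)
echo "segment bound:    $((SEGS * PAGE)) bytes"       # worst case: fully disjoint pages
echo "hw-sectors bound: $((HW_KB * 1024)) bytes"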

ByteMansion commented 1 week ago

Dear all, @arungeorge83 @MaisenbacherD I am using the latest CacheLib main branch and liburing, but I still cannot run the FDP test successfully; it fails with the segmentation fault below. Can you kindly help with this issue or give some suggestions? Thanks a lot! Btw, if I don't add deviceEnableFDP, everything works well. Environment: Ubuntu 20.04, kernel version 5.15.0-117-generic.


===JSON Config===
{
  "cache_config": {
    "cacheSizeMB": 2048,
    "allocFactor":1.08,
    "maxAllocSize":524288,
    "minAllocSize":64,
    "navyReaderThreads": 2,
    "navyWriterThreads": 2,
    "nvmCachePaths": ["/dev/nvme0n1"],
    "writeAmpDeviceList":["nvme0n1"],
    "nvmCacheSizeMB": 102400,
    "navyBigHashSizePct": 4,
    "navyBlockSize": 4096,
    "navyParcelMemoryMB": 2048,
    "nvmAdmissionRetentionTimeThreshold": 7200,
    "navySmallItemMaxSize": 640,
    "navySegmentedFifoSegmentRatio": [1.0],
    "navyHitsReinsertionThreshold": 1,
    "htBucketPower": 26,
    "moveOnSlabRelease": false,
    "poolRebalanceIntervalSec": 2,
    "rebalanceStrategy":"hits",
    "navyQDepth": 1,
    "deviceEnableFDP": true,
    "navyEnableIoUring": true
  },
  "test_config": {
    "enableLookaside": true,
    "generator": "online",
    "numKeys": 72298,
    "numOps": 63000,
    "numThreads": 2,
    "poolDistributions": [
      {
        "addChainedRatio": 0.0,
        "delRatio": 0.0,
        "getRatio": 0.6,
        "keySizeRange": [
          8,
          16
        ],
        "keySizeRangeProbability": [
          1.0
        ],
        "loneGetRatio": 8.2e-06,
        "loneSetRatio": 0.21,
        "setRatio": 0.0,
        "popDistFile": "pop.json",
        "setRatio": 0.0,
        "valSizeDistFile": "sizes.json"
      }
    ],

    "opDelayNs": 5000000,
    "opDelayBatch": 1
  }
}

reading distribution params from ./test_configs/ssd_perf/kvcache_l2_wc/sizes.json
reading distribution params from ./test_configs/ssd_perf/kvcache_l2_wc/pop.json
Welcome to OSS version of cachebench
E0826 18:14:42.695594  3179 Cache.h:498] Exception fetching nand writes for nvme0n1. Msg: Vendor not recogized in device model number longsys rsze5f38p-3840
I0826 18:14:42.695710  3179 Cache.h:622] Configuring NVM cache: simple file /dev/nvme0n1 size 102400 MB
I0826 18:14:42.695797  3179 Cache.h:711] Using the following nvm config{
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::QDepth": "1",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::admissionPolicy": "",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::admissionProbBaseSize": "0",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::admissionProbFactorLowerBound": "0",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::admissionProbFactorUpperBound": "0",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::admissionProbability": "0",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::admissionSuffixLen": "0",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::admissionWriteRate": "0",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::bigHashBucketBfSize": "8",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::bigHashBucketSize": "4096",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::bigHashSizePct": "4",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::bigHashSmallItemMaxSize": "640",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::blockCacheCleanRegionThreads": "1",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::blockCacheCleanRegions": "1",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::blockCacheDataChecksum": "true",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::blockCacheLru": "false",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::blockCacheNumInMemBuffers": "2",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::blockCacheRegionSize": "16777216",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::blockCacheReinsertionHitsThreshold": "1",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::blockCacheReinsertionPctThreshold": "0",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::blockCacheSegmentedFifoSegmentRatio": "",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::blockSize": "4096",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::deviceMaxWriteSize": "1048576",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::deviceMetadataSize": "0",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::enableFDP": "1",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::fileName": "/dev/nvme0n1",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::fileSize": "107374182400",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::ioEngine": "io_uring",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::maxConcurrentInserts": "1000000",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::maxNumReads": "0",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::maxNumWrites": "0",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::maxParcelMemoryMB": "2048",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::maxWriteRate": "0",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::navyReqOrderingShards": "21",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::raidPaths": "",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::readerThreads": "2",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::stackSize": "16384",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::truncateFile": "false",
I0826 18:14:42.695797  3179 Cache.h:711]   "navyConfig::writerThreads": "2"
I0826 18:14:42.695797  3179 Cache.h:711] }
I0826 18:14:42.833516  3179 Device.cpp:1113] Cache file: /dev/nvme0n1 size: 107374182400 truncate: 0
I0826 18:14:42.833629  3179 FdpNvme.cpp:316] Opening NVMe Char Dev file: /dev/ng0n1
I0826 18:14:42.833694  3179 FdpNvme.cpp:276] Nvme Device Info, NS Id: 1, lbaShift: 9, Max Transfer size: 262144, start Lba: 0
I0826 18:14:42.834096  3179 FdpNvme.cpp:39] Initialized NVMe FDP Device on file: /dev/nvme0n1
I0826 18:14:42.834125  3179 Device.cpp:991] Created device with num_devices 1 size 107374182400 block_size 4096,stripe_size 0 max_write_size 262144 max_io_size 262144 io_engine io_uring qdepth 1,num_fdp_devices 1
I0826 18:14:42.893455  3179 NavySetup.cpp:245] metadataSize: 536870912
I0826 18:14:42.893474  3179 NavySetup.cpp:247] Setting up engine pair 0
I0826 18:14:42.893478  3179 NavySetup.cpp:113] bighashStartingLimit: 536870912 bigHashCacheOffset: 103079215104 bigHashCacheSize: 4294967296
I0826 18:14:42.893483  3179 NavySetup.cpp:261] blockCacheSize 102542344192
I0826 18:14:42.893486  3179 NavySetup.cpp:158] blockcache: starting offset: 536870912, block cache size: 102542344192
I0826 18:14:42.893498  3179 FifoPolicy.cpp:37] FIFO policy
I0826 18:14:42.896829  3179 FdpNvme.cpp:58] Allocated an FDP handle 1
I0826 18:14:42.897451  3179 BigHash.cpp:93] BigHash created: buckets: 1048576, bucket size: 4096, base offset: 103079215104
I0826 18:14:42.897456  3179 BigHash.cpp:102] Reset BigHash
I0826 18:14:42.897786  3179 BigHash.h:274] For ValidBucketChecker, allocating 1311 bytes for 1048576 buckets at 100 buckets per bit.
I0826 18:14:42.900052  3179 FdpNvme.cpp:58] Allocated an FDP handle 2
I0826 18:14:42.900074  3179 RegionManager.cpp:50] 6112 regions, 16777216 bytes each
I0826 18:14:42.901172  3185 RegionManager.cpp:68] region_manager_0 started
I0826 18:14:42.901815  3179 Allocator.cpp:39] Enable priority-based allocation for Allocator. Number of priorities: 1
I0826 18:14:42.901835  3179 BlockCache.cpp:145] Block cache created
I0826 18:14:42.901849  3179 Driver.cpp:70] Max concurrent inserts: 1000000
I0826 18:14:42.901853  3179 Driver.cpp:71] Max parcel memory: 2147483648
I0826 18:14:42.901858  3179 Driver.cpp:72] Use Write Estimated Size: false
I0826 18:14:42.901863  3179 Driver.cpp:209] Reset Navy
I0826 18:14:42.901868  3179 BigHash.cpp:102] Reset BigHash
I0826 18:14:42.902235  3179 BigHash.h:274] For ValidBucketChecker, allocating 1311 bytes for 1048576 buckets at 100 buckets per bit.
I0826 18:14:42.902243  3179 BlockCache.cpp:735] Reset block cache
Total 0.13M ops to be run
E0826 18:14:42.910065  3189 Cache.h:498] Exception fetching nand writes for nvme0n1. Msg: Vendor not recogized in device model number longsys rsze5f38p-3840
18:14:42       0.00M ops completed. Hit Ratio   0.00% (RAM   0.00%, NVM   0.00%)
E0826 18:15:42.919692  3189 Cache.h:498] Exception fetching nand writes for nvme0n1. Msg: Vendor not recogized in device model number longsys rsze5f38p-3840
18:15:42       0.02M ops completed. Hit Ratio  56.80% (RAM  56.80%, NVM   0.00%)
*** Aborted at 1724667401 (Unix time, try 'date -d @1724667401') ***
*** Signal 11 (SIGSEGV) (0x7f5c5af6f0b8) received by PID 3179 (pthread TID 0x7f5c5b76f700) (linux TID 3185) (code: invalid permissions for mapped object), stack trace: ***
segmentation fault (core dump)

arungeorge83 commented 1 week ago

@ByteMansion Please use Linux kernel version 6.1.32 or higher. This is most likely a problem with io_uring support in the kernel path. Let us know if you still face the issue.
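
A quick pre-flight check along those lines (a sketch; the char device name follows the /dev/ngXnY pattern from the logs above):

uname -r          # want 6.1.32 or newer for the io_uring NVMe passthrough path
ls -l /dev/ng*    # the FDP path opens the NVMe char device, e.g. /dev/ng0n1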

ByteMansion commented 1 week ago

I upgraded the kernel to 6.2 and the segmentation fault disappeared. Thanks for your help. @arungeorge83