leil-io / saunafs

SaunaFS is a free and open-source, distributed POSIX file system inspired by Google File System.
https://saunafs.com
GNU General Public License v3.0

fix(chunkserver): Use option HDD_ADVISE_NO_CACHE #227

Closed: lgsilva3087 closed this 1 month ago

lgsilva3087 commented 1 month ago

The option HDD_ADVISE_NO_CACHE was not used. It is now used inside the hddIOEnd function for both the metadata and data files of each chunk.

Enabling the option in the chunkserver configuration file effectively reduces the page cache size used by the chunkservers, but it could also reduce performance in some scenarios.

aNeutrino commented 1 month ago

@lgsilva3087 I propose keeping your test and also writing a second one, which ChatGPT is proposing.

Also, you will be happy to read that there are commands that let you check whether files are in the page cache and how much of each file is cached:

vmtouch, fincore

apt install vmtouch util-linux-extra could be added to our test library dependencies.

Below you can find the ChatGPT o1-preview comment.

Explanation of the Test Script:


Purpose of the Test:

The test aims to verify that the HDD_ADVISE_NO_CACHE configuration option in the SaunaFS chunkserver correctly influences the system's page cache behavior during large file write operations. Specifically, it checks whether enabling HDD_ADVISE_NO_CACHE reduces the amount of data cached in the system's memory during writes.


Step-by-Step Explanation:

  1. Test Timeout Setting:

    timeout_set 2 minutes
    • Sets a maximum execution time of 2 minutes for the test to prevent it from running indefinitely.
  2. Environment Setup:

    CHUNKSERVERS=1 \
     MOUNT_EXTRA_CONFIG="sfscachemode=NEVER" \
     CHUNKSERVER_EXTRA_CONFIG="HDD_ADVISE_NO_CACHE = 0" \
     setup_local_empty_saunafs info
    • Configures the test environment with:
      • One chunkserver (CHUNKSERVERS=1).
      • Client-side cache disabled (sfscachemode=NEVER) to ensure we are testing server-side caching behavior.
      • HDD_ADVISE_NO_CACHE set to 0 (disabled) in the chunkserver configuration, meaning the chunkserver will not advise the OS to skip caching.
  3. Change to the Mount Point Directory:

    cd "${info[mount0]}"
    • Changes the working directory to the SaunaFS mount point for file operations.
  4. Function to Get System Cached Memory Size:

    function getSystemCachedMemorySizeKB {
     grep '^Cached' /proc/meminfo | awk '{print $2}'
    }
    • Defines a function to read the amount of memory currently used by the system for caching (in kilobytes) from /proc/meminfo.
  5. Wait for Chunkserver to be Ready:

    saunafs_wait_for_all_ready_chunkservers
    • Ensures that the chunkserver is fully operational before proceeding.
  6. Set Sleep Time After Dropping Caches:

    sleepAfterDropCaches=5
    • Defines a delay (in seconds) to wait after dropping caches to ensure the system has settled.
  7. Ensure the System Cache is Clean:

    drop_caches
    sleep ${sleepAfterDropCaches}
    • drop_caches is a function that clears the system's page cache (requires root permissions). This ensures that any previous caching does not affect the test.
    • Waits for 5 seconds to allow the system to settle.
  8. Record Original Cache Size:

    original_cache_size=$(getSystemCachedMemorySizeKB)
    • Stores the initial cached memory size before any test operations.
  9. Write a Large File to Populate the Cache:

    echo "Writing a big file to populate the cache"
    file_size=16384
    dd if=/dev/zero of=file bs=1M count=${file_size} oflag=direct &> /dev/null
    • Writes a 16 GiB file filled with zeros to the SaunaFS filesystem using dd.
    • The oflag=direct option is used to bypass any client-side caching.
    • The intention is to have the chunkserver write data to disk, which may or may not be cached by the OS depending on HDD_ADVISE_NO_CACHE.
  10. Measure Cache Size After Writing:

    cache_size_with_HDD_ADVISE_NO_CACHE_disabled=$(getSystemCachedMemorySizeKB)
    • Records the cached memory size after the write operation when HDD_ADVISE_NO_CACHE is disabled.
  11. Modify Chunkserver Configuration and Restart:

    echo "Restarting the chunkserver to enable HDD_ADVISE_NO_CACHE"
    
    ## Set HDD_ADVISE_NO_CACHE to 1 and restart the chunkserver
    echo "HDD_ADVISE_NO_CACHE = 1" >> "${info[chunkserver0_cfg]}"
    saunafs_chunkserver_daemon 0 restart
    saunafs_wait_for_all_ready_chunkservers
    • Updates the chunkserver configuration to enable HDD_ADVISE_NO_CACHE.
    • Restarts the chunkserver to apply the new configuration.
    • Waits for the chunkserver to become ready.
  12. Clear the System Cache Again:

    drop_caches
    sleep ${sleepAfterDropCaches}
    • Clears the system's page cache once more to ensure a clean state for the next test.
  13. Overwrite the File to Repopulate the Cache:

    echo "Overwriting the file to re-populate the cache"
    dd if=/dev/zero of=file bs=1M count=${file_size} oflag=direct &> /dev/null
    • Overwrites the same 16 GiB file with zeros.
    • This operation is not expected to increase the system's cache significantly if HDD_ADVISE_NO_CACHE is working as intended.
  14. Measure Cache Size After Overwriting:

    cache_size_with_HDD_ADVISE_NO_CACHE_enabled=$(getSystemCachedMemorySizeKB)
    • Records the cached memory size after the write operation when HDD_ADVISE_NO_CACHE is enabled.
  15. Output Summary of Cache Sizes:

    echo "Summary:"
    echo "Cache size: ${original_cache_size} kB - original"
    echo "Cache size: ${cache_size_with_HDD_ADVISE_NO_CACHE_disabled} kB - HDD_ADVISE_NO_CACHE = 0"
    echo "Cache size: ${cache_size_with_HDD_ADVISE_NO_CACHE_enabled} kB - HDD_ADVISE_NO_CACHE = 1"
    • Displays the cached memory sizes at different stages for comparison.
  16. Set Acceptable Margin of Error:

    minimum_difference=$((1024 * 1024))
    • Defines a minimum expected difference of 1 GiB (1,048,576 kB) to account for variations and to consider the test meaningful.
  17. Calculate Differences and Perform Assertions:

    • First Assertion:

      difference=$((cache_size_with_HDD_ADVISE_NO_CACHE_disabled - original_cache_size))
      
      # Ensure the cache was increased while the HDD_ADVISE_NO_CACHE was disabled
      assert_less_than ${minimum_difference} ${difference}
      • Calculates the increase in cache size after writing with HDD_ADVISE_NO_CACHE disabled.
      • Asserts that the increase is greater than 1 GiB, meaning the data was cached.
    • Second Assertion:

      difference=$((cache_size_with_HDD_ADVISE_NO_CACHE_disabled - cache_size_with_HDD_ADVISE_NO_CACHE_enabled))
      
      # Ensure the cache was decreased while the HDD_ADVISE_NO_CACHE was enabled
      assert_less_than ${minimum_difference} ${difference}
      • Calculates the difference in cache size between the two write operations.
      • Asserts that the cache size after the second write is more than 1 GiB smaller than after the first, indicating that less data was cached when HDD_ADVISE_NO_CACHE was enabled.
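
The measurement in step 4 and the comparisons in steps 16 and 17 can be sketched in isolation as follows. This is a minimal sketch, not the project's test code: cached_kb_from and delta_exceeds are hypothetical helper names, and the optional file argument exists only so the parsing can be tried against a saved copy of /proc/meminfo.

```shell
# Hypothetical helpers; not part of the SaunaFS test library.

# Print the value (in kB) of the "Cached:" line from a meminfo-format file.
# Defaults to /proc/meminfo when no file is given.
cached_kb_from() {
  awk '$1 == "Cached:" { print $2 }' "${1:-/proc/meminfo}"
}

# Succeed when (after - before) is strictly greater than the threshold.
# Usage: delta_exceeds BEFORE_KB AFTER_KB THRESHOLD_KB
delta_exceeds() {
  [ $(( $2 - $1 )) -gt "$3" ]
}
```

For example, with a baseline of 2000000 kB and a later reading of 3500000 kB, `delta_exceeds 2000000 3500000 1048576` succeeds because the increase of 1500000 kB is above the 1 GiB threshold.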

Potential Issues and Areas for Improvement:

  1. System Cache Measurement Variability:

    • Measuring total system cached memory can be unreliable due to other system processes and background activities affecting the cache.
    • The test may produce inconsistent results if other applications are using or releasing memory during the test.
  2. Root Permissions Required:

    • The drop_caches function typically requires root permissions (echo 3 > /proc/sys/vm/drop_caches).
    • Running this test as root or with elevated privileges may not be safe or possible in all environments.
  3. Large File Size:

    • Writing a 16 GiB file can consume significant disk space and time, potentially causing the test to exceed the timeout or fail on systems with limited resources.
  4. Use of Direct I/O:

    • Using oflag=direct bypasses the client-side cache, which is intended.
    • It also means the data is sent straight to the chunkserver, so any caching observed is due to the chunkserver's interaction with the OS. This is a property of the test to keep in mind rather than a defect.
  5. Assertions May Be Misleading:

    • The use of assert_less_than may be confusing. It checks if the first argument is less than the second, meaning we are asserting that minimum_difference < difference.
    • This can be counterintuitive when we expect a decrease in cache size.
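
The argument-order point in item 5 can be illustrated with a stand-alone sketch. The function is_less_than below is a hypothetical stand-in that follows the semantics described above (it passes when the first argument is less than the second); it is not the test framework's actual assert_less_than, and exceeds_minimum is likewise an illustrative name.

```shell
# Hypothetical stand-in with the semantics described above:
# succeeds when the first argument is strictly less than the second.
is_less_than() {
  [ "$1" -lt "$2" ]
}

# Reads more naturally when the intent is "the value must exceed a minimum".
# Usage: exceeds_minimum VALUE MINIMUM
exceeds_minimum() {
  is_less_than "$2" "$1"
}

minimum_difference=$((1024 * 1024))   # 1 GiB expressed in kB
difference=1500000                    # example value in kB, not a measurement

if exceeds_minimum "${difference}" "${minimum_difference}"; then
  echo "difference exceeds minimum"
fi
```

Wrapping the comparison this way keeps the original semantics while putting the measured value first, which is the order most readers expect.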

Suggested Improvements:

  1. Reduce File Size:

    • Change file size to 1 GiB to make the test faster and less resource-intensive.

      file_size=1024  # 1 GiB
  2. Use a More Precise Cache Measurement:

    • Instead of total system cached memory, measure the cache usage of the specific file.
    • Option 1: Use tools like vmtouch or fincore to check if the file is cached.
    • Option 2: Write a small script using mincore to check page residency.

    Note: This requires additional tools or scripts and may not be feasible in all environments.

  3. Avoid Root Permissions:

    • If possible, avoid using drop_caches to prevent the need for root permissions.
    • Instead, accept that the cache may not be entirely clean but rely on the relative differences in cache usage.
  4. Add Logging in Chunkserver:

    • Modify the chunkserver to log when posix_fadvise with POSIX_FADV_DONTNEED is called.
    • Verify from the logs that HDD_ADVISE_NO_CACHE is causing the expected system calls.
  5. Adjust Assertions for Clarity:

    • Use assert_greater_than for better readability when expecting the difference to be significant.
    • Alternatively, calculate and assert the cache increase/decrease directly.
  6. Ensure Isolation:

    • Run the test on a system with minimal background activity to reduce variability.
    • If possible, run the test inside a container or VM dedicated to testing.
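
Improvement 2 could look roughly like the sketch below. It assumes fincore from util-linux is installed and that its --bytes, --noheadings, and --output RES options behave as described in its man page; sum_bytes and resident_bytes are hypothetical helper names. It also assumes that the files worth inspecting are the chunk files in the chunkserver's data directory rather than the file on the SaunaFS mount, since the option is applied to the chunk metadata and data files.

```shell
# Hypothetical helpers; not part of the SaunaFS test library.

# Sum a column of integers read from standard input (one per line).
sum_bytes() {
  awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Print the total number of page-cache-resident bytes for the given files.
# Assumes util-linux's fincore is available on the test machine.
resident_bytes() {
  fincore --bytes --noheadings --output RES "$@" | sum_bytes
}
```

A test could call resident_bytes on the chunk files after each write and compare the two totals, which avoids depending on the system-wide Cached value.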

Improved Test Script:

# Improved test for HDD_ADVISE_NO_CACHE functionality.

timeout_set 2 minutes

CHUNKSERVERS=1 \
  MOUNT_EXTRA_CONFIG="sfscachemode=NEVER" \
  CHUNKSERVER_EXTRA_CONFIG="HDD_ADVISE_NO_CACHE = 0" \
  setup_local_empty_saunafs info

cd "${info[mount0]}"

# Function to get system cached memory size in kilobytes.
function getSystemCachedMemorySizeKB {
  grep '^Cached:' /proc/meminfo | awk '{print $2}'
}

saunafs_wait_for_all_ready_chunkservers

# Set sleep time after dropping caches.
sleepAfterDropCaches=5

# Ensure the cache is clean (requires root privileges).
if [ "$(id -u)" -ne 0 ]; then
  echo "This test requires root permissions to drop caches."
  test_add_failure "Root permissions required."
  exit 1
fi

echo "Dropping caches..."
drop_caches
sleep ${sleepAfterDropCaches}

# Get the original cache size.
original_cache_size=$(getSystemCachedMemorySizeKB)

# Write a big file to populate the cache.
echo "Writing a big file to populate the cache."
file_size=1024  # 1 GiB
dd if=/dev/zero of=file bs=1M count=${file_size} oflag=direct &> /dev/null

# Measure cache size after writing.
cache_size_no_cache_disabled=$(getSystemCachedMemorySizeKB)

# Calculate the increase in cache size.
increase_no_cache_disabled=$((cache_size_no_cache_disabled - original_cache_size))

echo "Cache increase with HDD_ADVISE_NO_CACHE disabled: ${increase_no_cache_disabled} kB"

# Restart chunkserver with HDD_ADVISE_NO_CACHE enabled.
echo "Enabling HDD_ADVISE_NO_CACHE and restarting chunkserver."
echo "HDD_ADVISE_NO_CACHE = 1" >> "${info[chunkserver0_cfg]}"
saunafs_chunkserver_daemon 0 restart
saunafs_wait_for_all_ready_chunkservers

# Ensure the cache is clean again.
echo "Dropping caches..."
drop_caches
sleep ${sleepAfterDropCaches}

# Overwrite the file to re-populate the cache.
echo "Overwriting the file to re-populate the cache."
dd if=/dev/zero of=file bs=1M count=${file_size} oflag=direct &> /dev/null

# Measure cache size after overwriting.
cache_size_no_cache_enabled=$(getSystemCachedMemorySizeKB)

# Calculate the increase in cache size.
increase_no_cache_enabled=$((cache_size_no_cache_enabled - original_cache_size))

echo "Cache increase with HDD_ADVISE_NO_CACHE enabled: ${increase_no_cache_enabled} kB"

# Define the acceptable minimum increase (100 MiB, expressed in kB).
minimum_increase=$((100 * 1024))  # 100 MiB in kB

# Assertions:

# Ensure that cache increased significantly when HDD_ADVISE_NO_CACHE was disabled.
if [ ${increase_no_cache_disabled} -gt ${minimum_increase} ]; then
  echo "Cache increased as expected when HDD_ADVISE_NO_CACHE was disabled."
else
  test_add_failure "Cache did not increase as expected with HDD_ADVISE_NO_CACHE disabled."
fi

# Ensure that the cache increase is smaller when HDD_ADVISE_NO_CACHE is enabled.
if [ ${increase_no_cache_enabled} -lt ${increase_no_cache_disabled} ]; then
  echo "Cache increase is less when HDD_ADVISE_NO_CACHE is enabled."
else
  test_add_failure "Cache increase did not decrease with HDD_ADVISE_NO_CACHE enabled."
fi

# Ensure that the difference in cache increase is significant.
difference_increase=$((increase_no_cache_disabled - increase_no_cache_enabled))
minimum_difference=$((100 * 1024))  # 100 MiB in kB

if [ ${difference_increase} -gt ${minimum_difference} ]; then
  echo "Cache increase difference is significant: ${difference_increase} kB"
else
  test_add_failure "Cache increase difference is not significant."
fi

Explanation of Improvements:

  • Reduced File Size to 1 GiB:

    • Makes the test faster and reduces resource consumption.
  • Adjusted Cache Measurement:

    • Measures the increase in cache size after each write operation relative to the original cache size.
    • Compares the increases directly for clarity.
  • Improved Assertions:

    • Uses direct comparisons with if statements for better readability.
    • Checks that the cache increase is significant when HDD_ADVISE_NO_CACHE is disabled.
    • Verifies that the cache increase is less when HDD_ADVISE_NO_CACHE is enabled.
  • Root Permission Check:

    • Checks if the script is running as root before attempting to drop caches.
    • Provides a clear error message if root permissions are not available.
  • Informative Output:

    • Prints detailed messages about the cache increases and test steps.

Additional Notes:

  • Dependency on Root Permissions:

    • The test still requires root permissions to drop caches. If this is not acceptable, the test should be modified to avoid dropping caches, though this may reduce reliability.
  • System Variability:

    • Running the test on a system with minimal background activity will yield more consistent results.
  • Alternative Verification Methods:

    • For more precise cache measurements, consider using tools like vmtouch or implementing a custom script using mincore, if feasible.
  • Logging Verification:

    • To confirm that HDD_ADVISE_NO_CACHE is functioning at the chunkserver level, you may add logging statements in the chunkserver code to record when posix_fadvise is called with POSIX_FADV_DONTNEED.

Conclusion:

The improved test provides a clearer and more reliable way to verify that the HDD_ADVISE_NO_CACHE configuration in the SaunaFS chunkserver affects the system's caching behavior as expected. By reducing the file size and adjusting the measurement and assertion methods, the test becomes more practical and easier to interpret.

If you have any questions or need further assistance with this test or any other aspect of SaunaFS testing, please let me know!

lgsilva3087 commented 1 month ago

I merged both approaches so that we do not have two tests covering the same thing. Main changes: