Azure / azhpc-images

Azure HPC/AI VM Images
MIT License

nfs read-ahead setting causes Lustre slowness #293

Closed vgamayunov closed 9 months ago

vgamayunov commented 9 months ago

While troubleshooting very slow read performance on a Lustre filesystem (about 100x slower), we found that it is caused by the NFS read-ahead setting applied by the HPC images.

This is how it is done in hpc-tuning.sh:

cat > /etc/udev/rules.d/90-nfs-readahead.rules <<EOM
SUBSYSTEM=="bdi",
ACTION=="add",
PROGRAM="/usr/bin/awk -v bdi=$kernel 'BEGIN{ret=1} {if ($4 == bdi) {ret=0}} END{exit ret}' /proc/fs/nfsfs/volumes",
ATTR{read_ahead_kb}="15380"
EOM

The value ATTR{read_ahead_kb}=15380 seems to be applied to all devices, including Lustre mounts, not only to NFS.

The issue can be reproduced as follows. With a 10MB block size, read throughput drops sharply compared to 1MB:

root@ubuntu2004:~# echo 3 > /proc/sys/vm/drop_caches
root@ubuntu2004:~# dd if=/lfs2/test/ubuntu2004/test of=/dev/null bs=1M
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0562267 s, 1.9 GB/s
root@ubuntu2004:~# echo 3 > /proc/sys/vm/drop_caches
root@ubuntu2004:~# dd if=/lfs2/test/ubuntu2004/test of=/dev/null bs=10M
10+0 records in
10+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 2.50094 s, 41.9 MB/s
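The comparison above can be repeated for several block sizes with a small wrapper around dd. This is only a sketch: `read_bench` is a hypothetical helper name, it assumes a dd that prints its throughput summary as the last line on stderr (as GNU dd does in the transcript above), and it drops the page cache only when run as root.

```shell
#!/usr/bin/env bash
# read_bench is a hypothetical helper, not part of the HPC images.
# Usage: read_bench <file> <block size>...
read_bench() {
    local file=$1
    shift
    local bs
    for bs in "$@"; do
        # Dropping the page cache needs root; skip it otherwise and
        # tolerate failure (for example a read-only /proc in a container).
        if [ "$(id -u)" -eq 0 ]; then
            sync
            { echo 3 > /proc/sys/vm/drop_caches; } 2>/dev/null || true
        fi
        # dd writes its summary to stderr; keep only the final line.
        printf 'bs=%s: ' "$bs"
        dd if="$file" of=/dev/null bs="$bs" 2>&1 | tail -n 1
    done
}
```

For the case in this issue it would be called as `read_bench /lfs2/test/ubuntu2004/test 1M 10M`.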

Checking the read-ahead value for the Lustre mount shows that it is set to 15380:

root@ubuntu2004:~# cat /sys/devices/virtual/bdi/lustrefs-ffff9b44c8196800/read_ahead_kb
15380
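To see which other devices picked up the same value, the read-ahead of every BDI can be listed in one pass. A minimal sketch, assuming the BDI entries live under /sys/devices/virtual/bdi as in the path above; `list_readahead` is a hypothetical helper name and the directory argument exists only so the function can be pointed at another location.

```shell
#!/usr/bin/env bash
# list_readahead is a hypothetical helper, not part of the HPC images.
# Prints "<bdi name> <read_ahead_kb>" for every BDI under the given directory.
list_readahead() {
    local bdi_dir=${1:-/sys/devices/virtual/bdi}
    local f
    for f in "$bdi_dir"/*/read_ahead_kb; do
        # Skip when the glob matched nothing or the file is unreadable.
        [ -r "$f" ] || continue
        printf '%s %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}
```

Running `list_readahead | grep 15380` would then show every device that received the value from the rule.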

Setting read-ahead to 0 fixes the slowdown:

root@ubuntu2004:~# echo 0 > /sys/devices/virtual/bdi/lustrefs-ffff9b44c8196800/read_ahead_kb
root@ubuntu2004:~# echo 3 > /proc/sys/vm/drop_caches
root@ubuntu2004:~# dd if=/lfs2/test/ubuntu2004/test of=/dev/null bs=10M
10+0 records in
10+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0857173 s, 1.2 GB/s
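On a node with several Lustre mounts, the same workaround can be applied to every Lustre BDI at once. This is a sketch under the assumption that Lustre BDI names start with `lustrefs-`, as in the path above; `set_lustre_readahead` is a hypothetical helper name. It changes only the running system, so it would presumably need to be reapplied after a remount or reboot.

```shell
#!/usr/bin/env bash
# set_lustre_readahead is a hypothetical helper, not part of the HPC images.
# Writes the given value (default 0) to read_ahead_kb of every BDI whose
# name starts with "lustrefs-" and reports the value read back.
set_lustre_readahead() {
    local value=${1:-0}
    local bdi_dir=${2:-/sys/devices/virtual/bdi}
    local f
    for f in "$bdi_dir"/lustrefs-*/read_ahead_kb; do
        # Skip when there are no Lustre mounts or the file is not writable
        # (for example when not running as root).
        [ -w "$f" ] || continue
        echo "$value" > "$f"
        echo "$(basename "$(dirname "$f")"): read_ahead_kb=$(cat "$f")"
    done
}
```

Called without arguments as root, it sets read-ahead to 0 on all Lustre mounts and leaves NFS and block devices untouched.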

Can the udev rule be written so that it affects only NFS mounts and is not applied to all devices?

edwardsp commented 9 months ago

PR #294 should resolve this - it will at least write the rule as documented here: https://learn.microsoft.com/en-us/azure/storage/files/nfs-performance#increase-read-ahead-size-to-improve-read-throughput
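For illustration, here is a sketch of the rule written on a single line. It assumes the cause is that udev reads each line of a rules file as a separate rule, so a line holding only the ATTR assignment has no match keys and applies to every device. The value 15380 is kept from the issue, `write_nfs_readahead_rule` is a hypothetical helper name, and this is not necessarily what PR #294 contains.

```shell
#!/usr/bin/env bash
# write_nfs_readahead_rule is a hypothetical helper. The optional argument
# overrides the target path so the sketch can be tried without root.
write_nfs_readahead_rule() {
    local rules_file=${1:-/etc/udev/rules.d/90-nfs-readahead.rules}
    # The quoted delimiter ('EOM') stops the shell from expanding $kernel
    # and $4, which must reach udev and awk literally.
    # All match keys and the assignment sit on one line, so the assignment
    # only runs when the awk check finds the BDI in /proc/fs/nfsfs/volumes.
    cat > "$rules_file" <<'EOM'
SUBSYSTEM=="bdi", ACTION=="add", PROGRAM="/usr/bin/awk -v bdi=$kernel 'BEGIN{ret=1} {if ($4 == bdi) {ret=0}} END{exit ret}' /proc/fs/nfsfs/volumes", ATTR{read_ahead_kb}="15380"
EOM
}
```

After writing the file, the rules would typically be reloaded with `udevadm control --reload`. Because the rule matches on ACTION=="add", already mounted filesystems would likely keep their current value until they are remounted.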

abhamidipati-msft commented 9 months ago

The fix is available in the marketplace images microsoft-dsvm:ubuntu-hpc:2004:20.04.2023111801 and microsoft-dsvm:ubuntu-hpc:2204:22.04.2023111801.