argonne-lcf / dlio_benchmark

An I/O benchmark for deep learning applications
https://dlio-benchmark.readthedocs.io
Apache License 2.0

Add support for NFS profiler #19

Open johnugeorge opened 1 year ago

johnugeorge commented 1 year ago

Since files are in the closed category, we need to add profiler support for NFS. Currently we only have iostat, which won't work for NFS.

zhenghh04 commented 1 year ago

@hariharan-devarajan you have better knowledge of the profiler. Any thoughts on this?

hariharan-devarajan commented 1 year ago

Darshan and Recorder should work on NFS, since the interception happens at the application and VFS level of the file system. iostat is a system-level profiler, which will be hard to work with at a distributed level.

If you want metrics from the system, I believe LDMS works with the Lustre file system to capture system-level I/O counters.
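
For NFS specifically, one possible source of system-level counters is the Linux NFS client's per-mount statistics in `/proc/self/mountstats`. The following is only a rough sketch of how those counters could be sampled before and after a run; it is not part of DLIO, the helper names (`parse_nfs_mountstats`, `delta`, `snapshot`) are hypothetical, and the order of the eight fields on the `bytes:` line is an assumption based on the Linux NFS client statistics format.

```python
import re
from typing import Dict

# Assumed field order of the "bytes:" line in /proc/self/mountstats.
BYTES_FIELDS = (
    "normal_read", "normal_write",
    "direct_read", "direct_write",
    "server_read", "server_write",
    "read_pages", "write_pages",
)

# Matches only NFS mounts, e.g. "... with fstype nfs" or "... with fstype nfs4".
DEVICE_RE = re.compile(r"^device (\S+) mounted on (\S+) with fstype (nfs\d?)\b")

Stats = Dict[str, Dict[str, int]]


def parse_nfs_mountstats(text: str) -> Stats:
    """Return {mount_point: {counter: value}} for every NFS mount in text."""
    stats: Stats = {}
    current = None
    for line in text.splitlines():
        if line.startswith("device "):
            match = DEVICE_RE.match(line)
            # Non-NFS mounts reset the current mount so their lines are skipped.
            current = match.group(2) if match else None
            if current is not None:
                stats[current] = {}
            continue
        stripped = line.strip()
        if current is not None and stripped.startswith("bytes:"):
            values = [int(v) for v in stripped.split()[1:]]
            stats[current] = dict(zip(BYTES_FIELDS, values))
    return stats


def delta(before: Stats, after: Stats) -> Stats:
    """Per-mount counter differences for mounts present in both snapshots."""
    return {
        mount: {k: after[mount][k] - before[mount].get(k, 0) for k in after[mount]}
        for mount in after
        if mount in before
    }


def snapshot(path: str = "/proc/self/mountstats") -> Stats:
    """Read and parse the current NFS counters (Linux only)."""
    with open(path) as f:
        return parse_nfs_mountstats(f.read())
```

Calling `snapshot()` before and after a benchmark run and passing both results to `delta` would give the bytes read and written per NFS mount on that client node. Each node would still need to collect its own counters, so this does not remove the distributed-collection problem mentioned above.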

tonycasanova commented 1 year ago

Has Netdata been evaluated for metrics collection in the VM under test? It has a RAM mode that helps make metrics collection lighter on the CPU, and I believe it can collect metrics from both standard disks and NFS. It is easy to install. https://blog.netdata.cloud/how-to-monitor-your-disks-and-filesystems-now-also-with-ebpf/