sjg20 opened 4 months ago
Because labgrid could at some point switch the hardware away from underneath the kernel, e.g. when using a USB-SD-Mux or SDWire device. The write cache needs to be flushed either way, so my expectation would be that you'll have to wait for the cache to be flushed during a sync regardless.
Thanks for the info
If I understand this correctly, fdatasync should be enough for syncing; direct I/O is not needed and just slows things down. If the hardware disappears while writing or before syncing, then the dd will fail either way.
Perhaps the solution here is to use a separate 'sync' after the dd?
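For concreteness, a rough sketch of what that could look like (not labgrid's actual code; the device path and offsets are just the ones quoted later in this thread):

```sh
# Write through the page cache, then flush explicitly before the hardware is switched away.
dd if=u-boot-rockchip.bin of=/dev/sdk bs=512 seek=64
sync /dev/sdk    # or a plain 'sync' on older coreutils that don't accept arguments

# Equivalent effect in a single invocation:
dd if=u-boot-rockchip.bin of=/dev/sdk bs=512 seek=64 conv=fdatasync
```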
@sjg20 I suspect that would work; is it actually any faster to do it that way?
Yes, with rock2 u-boot-rockchip.bin (8898660 bytes):

- With my change: Image written in 5.7s
- Without it: Image written in 10.6s
I am using quite slow media.
When writing large images (more than a few 100 MiB, on hosts with just 1-2 GiB of RAM) without `oflag=direct`, useful data is discarded from the page cache, disrupting other workloads. Also, dd's progress output is mostly useless while writing to the cache.
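As an illustration (not something labgrid does, just one way to observe the effect on any Linux host):

```sh
# Watch dirty/writeback pages while an image write is running.
watch -n1 'grep -E "^(Dirty|Writeback|Cached):" /proc/meminfo'
# Without oflag=direct these climb while dd reports cache-speed progress;
# with oflag=direct they stay roughly flat.
```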
What's the `dd` cmdline that's used in your case? Perhaps it's using 512-byte blocks and your min-io size is larger, triggering read-modify-write cycles (see `lsblk -t`).
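If min-io does turn out to be larger than 512 bytes, one variant that might be worth testing (assuming GNU dd, which supports byte-based seek offsets) keeps the 512-byte-block offsets but issues larger writes:

```sh
lsblk -t /dev/sdk    # check MIN-IO / OPT-IO first
# Same offset expressed in bytes (64 * 512 = 32768) so bs can be raised to 1 MiB.
dd if=u-boot-rockchip.bin of=/dev/sdk bs=1M oflag=direct,seek_bytes seek=32768 conv=fdatasync
# Note: O_DIRECT writes must be a multiple of the logical sector size, so a
# trailing partial sector may still need a separate non-direct write.
```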
Better approaches are definitely possible (perhaps `MADV_PAGEOUT`/`MADV_DONTNEED`/`sync_file_range()` with some sliding window, depending on what works with blockdevs), but that's a lot more complex than using `dd`, which is available everywhere. bmaptool claims to be faster, so you might try that.
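Another possible middle ground, before writing anything custom, might be GNU dd's nocache flag, which streams through the page cache but asks the kernel to drop the copied data as it goes (based on the nocache recipe in the coreutils dd documentation, not what labgrid currently does, and it would still need the skip/seek handling discussed here):

```sh
# Stream the image through the cache but invalidate it along the way,
# so it doesn't push other workloads' data out of memory.
# oflag=sync makes each block persistent so the nocache advice can actually drop it.
dd if=u-boot-rockchip.bin of=/dev/sdk bs=1M iflag=nocache oflag=nocache,sync
```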
That sounds like an edge case to me (low memory). I could make it use direct if the size is larger than 20MB, perhaps?
$ lsblk -t /dev/sdk
NAME ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE RA WSAME
sdk 0 512 0 512 512 1 mq-deadline 2 128 0B
I do need skip and seek most of the time. The two versions are:
fast: dd if=/var/cache/labgrid/sglass/1a14c6e43b9fc3e3f68498f4bf72cec3e0de503cac147be7b27f2f5d7fbe682a/u-boot-rockchip.bin of=/dev/sdk bs=512 skip=0 seek=64
slow: dd if=/var/cache/labgrid/sglass/4da536865256736eaa1747b40bc1e90aeab44127b83a0e682046b766bc0b20ce/u-boot-rockchip.bin of=/dev/sdk oflag=direct bs=512 skip=0 seek=64 conv=fdatasync
I notice that the fdatasync doesn't slow things down in the one case I am testing here (so we can use that instead of a separate 'sync'). It is the direct I/O that is the problem.
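A sketch of the size-threshold idea (purely illustrative, not a patch; the 20 MB cutoff is just the number floated above, and the offsets are the ones from the commands here):

```sh
image=u-boot-rockchip.bin
threshold=$((20 * 1024 * 1024))

# Use direct I/O only for large images, where page-cache pressure matters;
# small images go through the cache and are flushed at the end by conv=fdatasync.
extra=""
if [ "$(stat -c %s "$image")" -gt "$threshold" ]; then
    extra="oflag=direct"
fi
dd if="$image" of=/dev/sdk bs=512 skip=0 seek=64 conv=fdatasync $extra
```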
> That sounds like an edge case to me (low memory). I could make it use direct if the size is larger than 20MB, perhaps?
The current implementation is driven by the use-cases we know about, like our lab with hundreds of places split into 16 per exporter (PCEngines APUs with 4 GB RAM). Many of them use USB-SD-Muxes, writing 500 MiB-2 GiB images. Keeping the impact on tests running in parallel small is critical there, and that is what drove the change you cited. So it's far from an edge case.
I'd be open to a driver-level attribute to disable `oflag=direct`, perhaps `write-cache`, defaulting to false?
Do you get sensible progress output from dd without oflag=direct? Previously it would report high speeds until write-back started and then hang for a long time at the end, which was confusing to users.
What sort of storage device are you using?
Thanks for the background as to why this was done.
This is using uSD cards.
Re the driver-level attribute, would that need each board to put the attribute in its environment? Is there some overall setting that could be used? Using direct I/O seems to be a win only for large images on machines without much memory.
When writing an 8MB image to a board I see this:
The last bit seems to be the 'dd'.
If I drop the 'oflag=direct' from USBStorageDriver.write_image I get:
which is a bit better. Why is direct I/O needed?
This doesn't make a lot of sense to me. Why not let the kernel handle the caching?