microsoft / windows-rs

Rust for Windows
https://kennykerr.ca/rust-getting-started/
Apache License 2.0
10.36k stars 486 forks source link

Optimize `write_to_file` in `windows-bindgen` to avoid unnecessary file writes #3079

Closed kennykerr closed 4 months ago

kennykerr commented 4 months ago

Alternative to #3072 to avoid unnecessary file writes that trigger side effects in build systems and IDEs.

I ended up doing the same thing for cppwinrt years ago as msbuild and razzle had a real problem for similar reasons.

sivadeilra commented 4 months ago

Unfortunately, I think there's a compelling reason not to do this unconditionally. Many build systems expect that the timestamps on the output files of a task will be newer than the timestamps of the inputs of a task, after running a task. If we don't update the output timestamps, then build systems will re-run bindgen every time. So this breaks incremental builds in environments that use timestamps.

Maybe this could be a configuration option?

kennykerr commented 4 months ago

Interesting, the github workflow doesn't seem to like it either.

sivadeilra commented 4 months ago

Maybe it could be a boolean parameter to bindgen: Always update output? If the output reads the same, then maybe we just "touch" the output to update its last-modified time. I don't see a way to do that directly in std::fs, so maybe we just call std::fs::File::open with the "open existing for write" parameters, then do nothing, then close the file. I'm not sure if that's guaranteed to update the mod time though.

And just to clarify -- I don't mean it has to be a single always_update_output: bool parameter. I just mean an args flag, or a config flag, or whatever.

ChrisDenton commented 4 months ago

I'm not sure if that's guaranteed to update the mod time though.

File::set_times was stabilized a few versions ago.

sivadeilra commented 4 months ago

Ah, good. I wasn't aware of File::set_times. That's perfect.

Still, I would like this behavior to be controllable. The caller should be able to choose between "do a single unconditional write" vs. this diffing behavior. I say this because my team manages build systems (the Windows build lab), and our buidl systems do a lot of introspection into the file access patterns of tools that they execute. This "read and verify" pattern will be flagged as a very problematic behavior by several major build systems because it looks like a false dependency on a previous state, when it's really not.

The default behavior for riddle.exe should be to unconditionally write its output. The "read and verify" pattern should be an opt-in optimization. My rationale is that this have the least risk of causing problems for build systems.

kennykerr commented 4 months ago

For what it's worth, cppwinrt has done this for years without any trouble.

https://github.com/microsoft/cppwinrt/blob/master/cppwinrt/text_writer.h#L178

Anyway if this is not desirable, I'd rather just leave the current behavior alone. If developers have problems with rust-analyzer they can temporarily disable it while working on the windows-rs repo. I never use it for this reason.

We can also separately reduce the overhead and optimize things, like using windows-bindgen rather than riddle in the standalone tests.