intel / intel-xpu-backend-for-triton

OpenAI Triton backend for Intel® GPUs
MIT License
144 stars, 44 forks

Fixed exception on Windows when contention happens in file cache #2787

Open gshimansky opened 4 days ago

gshimansky commented 4 days ago

Catch the exception raised in the file cache on Windows when two or more processes try to write the same file. If another process already has the file open, it is locked and cannot be written (that's how files work on Windows).

Fixed #2777.
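
For illustration, here is a minimal sketch of the general pattern described above. It is not the actual patch: put_atomic is a hypothetical helper, and it assumes the cache writes to a temporary file and then renames it over the final path, and that a concurrent writer produces equivalent content, so a lost race can be safely ignored.

```python
import os
import tempfile


def put_atomic(data: bytes, final_path: str) -> bool:
    """Write data to final_path via a temp file.

    Returns True if this process installed the file, False if the rename
    was refused with PermissionError (assumed here to mean another process
    holds the destination open, as can happen on Windows).
    """
    cache_dir = os.path.dirname(final_path) or "."
    os.makedirs(cache_dir, exist_ok=True)
    # A unique temp file in the same directory keeps the rename on one volume.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        try:
            os.replace(tmp_path, final_path)
        except PermissionError:
            # Assumption: the competing process writes the same content,
            # so losing the race is harmless and the error is swallowed.
            return False
        return True
    finally:
        # After a successful rename the temp file is gone; otherwise clean up.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

The caller can treat a False result as "someone else already cached this entry" and carry on. Whether swallowing the error is safe depends on the cache entries being deterministic for a given key.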

New contributor declaration

gshimansky commented 4 days ago

While this patch fixes the main source of PermissionError exceptions in the FileCacheManager.put function, I now see rare occurrences of PermissionError when Triton code tries to read files from the cache. They are possibly unrelated to this particular fix and happened previously too, just much less frequently. I've converted the PR to a draft while I continue to investigate and look for a complete fix for the xdist tests on Windows.