Open CodyCBakerPhD opened 6 months ago
This problem is actually SO bad on my older computer (which has similar architecture to laptops we've seen students use at user days) that I can't even run the benchmarks once without filling up temp space (~250 GB total in User folder; maybe < 100 GB free)
Also, a lesson to learn here: the location of such a cache really should not be the boot drive. The OS might take up most of that drive, and on remote servers especially it is usually very slim. I have additional mounted volumes that are meant for bulk storage of the kind `fsspec` is using here.
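One way to keep the cache off the boot drive is to let the location be overridden and to check free space up front. The following is only a minimal sketch: the environment variable `BENCHMARK_CACHE_DIR`, the folder name `benchmark_cache`, and the function `resolve_cache_directory` are all hypothetical names invented for illustration, not anything the project or `fsspec` defines.

```python
import os
import shutil
import tempfile
from pathlib import Path

# Hypothetical environment variable name; not something the project defines.
CACHE_ENV_VAR = "BENCHMARK_CACHE_DIR"


def resolve_cache_directory(min_free_bytes: int = 0) -> Path:
    """Return a cache directory, preferring a user-chosen volume over the system temp location."""
    configured = os.environ.get(CACHE_ENV_VAR)
    # Fall back to the system temp directory only when nothing was configured.
    base = Path(configured) if configured else Path(tempfile.gettempdir())
    cache_directory = base / "benchmark_cache"  # hypothetical reserved name
    cache_directory.mkdir(parents=True, exist_ok=True)

    # Fail early instead of filling the volume partway through a run.
    free_bytes = shutil.disk_usage(cache_directory).free
    if free_bytes < min_free_bytes:
        raise OSError(
            f"Only {free_bytes} bytes free at {cache_directory}; "
            f"need at least {min_free_bytes}."
        )
    return cache_directory
```

The returned path could then be handed to whatever cache setting the benchmarks use, so a bulk volume is picked when one is configured.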
Eventually, after enough runs of the benchmarks, the `fsspec` + caching test fills up my temporary directory (2 TB in size) with files, and the benchmarks themselves throw errors such as
This seems related to the caching mode `fsspec` uses, where it 'reserves' space on disk equivalent to the size of the file and fills in the bytes as requests are received. For small files this is intuitive and fine, but for large files it's a pain due to issues like this.

A user reported a similar pain with this kind of cache when they (accidentally) set their cache inside an automatically syncing Google Drive folder, which overloaded both their I/O and WiFi speeds, slowing their computer to a crawl (and maxing out their drive storage).
Just something to be aware of in general when assessing the default caching for `fsspec`, but in the meantime...

I think I expressed doubts about the automatic cleaning functionality of `tempfile.TemporaryDirectory.cleanup()` before; I highly recommend we follow the `pytest` strategy of keeping a global folder (also possibly in `local/temp` but under a reserved name) that we can send repeated `shutil.rmtree` commands to, both at the beginning and end of benchmark runs (therefore giving enough leeway for file locks to have released over time).
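The proposal above could look roughly like the sketch below. It is an illustration under assumptions, not the project's implementation: the folder name `benchmark_global_cache`, the functions `clear_cache` and `run_benchmarks_with_cleanup`, and the retry count and delay are all hypothetical.

```python
import shutil
import tempfile
import time
from pathlib import Path

# Hypothetical reserved folder name under the system temp directory.
GLOBAL_CACHE_DIRECTORY = Path(tempfile.gettempdir()) / "benchmark_global_cache"


def clear_cache(
    directory: Path = GLOBAL_CACHE_DIRECTORY,
    attempts: int = 3,
    delay_seconds: float = 1.0,
) -> bool:
    """Send repeated rmtree calls to the directory; return True once it is gone."""
    for attempt in range(attempts):
        # ignore_errors=True so a still-locked file does not abort the whole run.
        shutil.rmtree(directory, ignore_errors=True)
        if not directory.exists():
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)  # give file locks time to release
    return False


def run_benchmarks_with_cleanup(run_benchmarks) -> None:
    """Clear the global cache before and after calling the given benchmark function."""
    clear_cache()  # removes leftovers that an earlier run could not delete
    GLOBAL_CACHE_DIRECTORY.mkdir(parents=True, exist_ok=True)
    try:
        run_benchmarks()
    finally:
        clear_cache()
```

Because the folder name is fixed, anything a previous run failed to delete (for example because of a lingering file lock) gets another removal attempt at the start of the next run, rather than piling up under new random names.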