pfnet / pfio

IO library to access various filesystems with unified API
https://pfio.readthedocs.io/
MIT License

Avoid calling zip.namelist many times #322

Closed y1r closed 1 year ago

y1r commented 1 year ago

When I tested the PR https://github.com/pfnet/pfio/pull/316 with the ImageNet-1K dataset, I found that the performance of Zip(FS).stat() is pretty bad because:

Therefore, if the archive is opened in read-only mode, I'd like to cache the filenames in a set. This avoids rebuilding the name list on every call and makes each lookup faster.
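
For illustration only, the idea could look roughly like the sketch below. It uses the standard-library `zipfile` module rather than pfio's actual code. The `CachedNameZip` class, its `exists`/`stat` methods, and the `build_demo_archive` helper are hypothetical names made up for this example. It assumes the cost comes from `ZipFile.namelist()` building a fresh list on each call (as it does in CPython) and from linear membership tests on that list.

```python
import io
import zipfile


class CachedNameZip:
    """Hypothetical wrapper around zipfile.ZipFile; not pfio's Zip class."""

    def __init__(self, file, mode="r"):
        self._zip = zipfile.ZipFile(file, mode)
        # Cache member names only for read-only archives, where the set of
        # members cannot change after opening.
        self._names = frozenset(self._zip.namelist()) if mode == "r" else None

    def exists(self, name):
        if self._names is not None:
            # Set membership: constant time on average, no list rebuild.
            return name in self._names
        # Fallback for writable archives: namelist() is rebuilt on each call
        # and the membership test is a linear scan.
        return name in self._zip.namelist()

    def stat(self, name):
        if not self.exists(name):
            raise FileNotFoundError(name)
        # getinfo() returns the ZipInfo entry (size, timestamp, ...).
        return self._zip.getinfo(name)


def build_demo_archive():
    """Create a tiny in-memory archive so the example is self-contained."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("train/a.jpg", b"abc")
        zf.writestr("train/b.jpg", b"defgh")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    z = CachedNameZip(build_demo_archive())
    print(z.exists("train/a.jpg"))
    print(z.stat("train/b.jpg").file_size)
```

With a dataset the size of ImageNet-1K, the difference between one set lookup and one list construction plus scan per `stat()` call adds up quickly, which is why the cache is restricted to the read-only case where it cannot go stale.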