danielfrg / s3contents

Jupyter Notebooks in S3 - Jupyter Contents Manager implementation
Apache License 2.0
248 stars, 88 forks

use async calls for getting info about subdirectories #139

Closed: yoel-ross-zip closed this 2 years ago

yoel-ross-zip commented 2 years ago

When building the model for a directory, the fs.lstat method is called synchronously and repeatedly, once per entry. I have found that this can cause serious latency when working with larger folders in S3, and I would like to offer a couple of improvements:

  1. Get the metadata for each file once, and use it both for removing deleted markers and for getting the last modified time.
  2. Use the underlying s3fs._info method instead of lstat. The _info method is async, so all the calls can be made concurrently on the existing s3fs event loop.
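
The two points above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: FakeAsyncFS is a hypothetical stand-in for the filesystem object, the is_deleted_marker key is an invented flag, and asyncio.run is used only to keep the example self-contained (the actual change would presumably schedule the coroutines on the event loop that s3fs already runs).

```python
import asyncio


class FakeAsyncFS:
    """Hypothetical stand-in for an async filesystem; not the real s3fs API."""

    def __init__(self, objects, latency=0.05):
        self.objects = objects    # path -> metadata dict
        self.latency = latency    # simulated network round trip, in seconds

    async def _info(self, path):
        # Simulate one remote metadata request per path.
        await asyncio.sleep(self.latency)
        return self.objects[path]


async def _gather_infos(fs, paths):
    # Issue every metadata request at once instead of one after another.
    infos = await asyncio.gather(*(fs._info(p) for p in paths))
    return dict(zip(paths, infos))


def build_listing(fs, paths):
    """Fetch metadata once per path, then reuse it for both steps."""
    infos = asyncio.run(_gather_infos(fs, paths))
    return {
        path: info["LastModified"]
        for path, info in infos.items()
        # "is_deleted_marker" is a made-up flag used only for illustration.
        if not info["is_deleted_marker"]
    }
```

With N entries and a per-request latency of t, the sequential version waits roughly N * t, while the gathered version waits roughly t, because the requests overlap.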