liormizr / s3path

s3path is a pathlib extension for AWS S3 Service
Apache License 2.0
206 stars 39 forks source link

Do not import boto3 when using PureS3Path #164

Closed iamthebot closed 3 months ago

iamthebot commented 3 months ago

This is different from #122 -- basically, we're using PureS3Path in a CLI. Currently, as written, importing s3path also imports boto3, does some initialization, etc. This is relatively slow which impacts the CLI responsiveness. We've made the import lazy as a workaround but this means not being able to eg; use PureS3Path for type hints.

Rather than making boto3 an optional dependency, can we make the boto3 import/initialization not occur for PureS3Path?

liormizr commented 3 months ago

Hi @iamthebot So you are fine with boto3 as a requirement/dependency how ever you don't want the initialization Just for curiosity, how much time it takes?

In any case I'll think when kind of lazy can we do, maybe we can solve issue #122 at the same time Two questions:

  1. do smart-open import initialization boto3 as well?
  2. can we do the change only for python version +3.12?
iamthebot commented 3 months ago

@liormizr yeah, boto3 is totally fine as a requirement/dependency. That's totally reasonable. I guess if you wanted to address #122 technically boto3 could be an optional requirement (and if it's not available only PureS3Path is usable).

What would be nice to do is delay boto3 import/initialization until it's needed for eg; S3Path. This makes the import of this package very fast and we only pay the initialization cost when it's needed. This way merely importing (but not instantiating) aPureS3Path or S3Path are both fast.

I believe we can do this only for Python 3.12 onwards. I don't think this needs to be backported, tbh.

liormizr commented 3 months ago

@iamthebot how much time boto3 adds?

iamthebot commented 3 months ago

@liormizr that initialization is variable but between 100-300ms last I tested it. P99 of 500ms but that's fairly rare.

liormizr commented 3 months ago

@iamthebot I learned something new :blush: In my env I got now 90ms with boto3 ~100ms and without ~15ms In average

BTW smart-open also imported boto3 so it need's to be lazy as well.

PR is ready I'll try to push a new version this week

You are more then welcome to review

iamthebot commented 3 months ago

Thank you @liormizr! Happy to test it on Friday (pretty busy today and tomorrow) if that's not too late.

iamthebot commented 3 months ago

This LGTM!

liormizr commented 3 months ago

Version 0.5.3 deployed

iamthebot commented 3 months ago

@liormizr can you also merge this PR to release the conda-forge package? Thanks!

liormizr commented 3 months ago

@iamthebot we found an issue with our laze import new version will be deployed today PR #167 Issue: #160 (See last comments)

liormizr commented 3 months ago

0.5.5 was deployed with the fix. @iamthebot I'll deploy to conda-forge package version 0.5.5

liormizr commented 3 months ago

@iamthebot conda-forge package was deployed