msiemens / tinydb

TinyDB is a lightweight document oriented database optimized for your happiness :)
https://tinydb.readthedocs.org
MIT License

feat: Add Google Cloud Storage support #425

Closed galuszkak closed 3 years ago

galuszkak commented 3 years ago

Hi @msiemens,

I love the idea behind this project. I found a use case where people could use it with object storage on runtimes without volumes or hard drives, such as Google Cloud Run or AWS Fargate. This would be great for small apps with one or maybe two users, where you just need TinyDB with some external storage.

My initial implementation is for Google Cloud, but I could add support for AWS S3 and Azure Blob Storage in the future.
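
To make the idea concrete, here is a minimal sketch of how an object-storage backend could plug into TinyDB. It is not the code from this PR. It assumes TinyDB's custom-storage interface (a Storage subclass that implements read() and write()). The names ObjectStorage and InMemoryBlob are hypothetical, and the fake blob only mimics the exists(), download_as_text() and upload_from_string() methods of a google-cloud-storage Blob so that the example runs without credentials.

```python
import json
from typing import Any, Dict, Optional

try:
    from tinydb.storages import Storage  # TinyDB's abstract storage base class
except ImportError:
    # Assumption for this sketch only: fall back to a plain base class
    # so the example still runs where TinyDB is not installed.
    Storage = object


class InMemoryBlob:
    """Hypothetical stand-in for a single object in a bucket."""

    def __init__(self) -> None:
        self._data: Optional[str] = None

    def exists(self) -> bool:
        return self._data is not None

    def download_as_text(self) -> str:
        if self._data is None:
            raise FileNotFoundError("blob has not been uploaded yet")
        return self._data

    def upload_from_string(self, data: str) -> None:
        self._data = data


class ObjectStorage(Storage):
    """Hypothetical storage that keeps the whole database in one blob."""

    def __init__(self, blob: Any) -> None:
        self._blob = blob

    def read(self) -> Optional[Dict[str, Dict[str, Any]]]:
        # TinyDB treats None as "no data yet" and starts with an empty database.
        if not self._blob.exists():
            return None
        raw = self._blob.download_as_text()
        return json.loads(raw) if raw else None

    def write(self, data: Dict[str, Dict[str, Any]]) -> None:
        # The full document is serialized and uploaded on every write.
        self._blob.upload_from_string(json.dumps(data))
```

With TinyDB installed, this could be wired up as TinyDB(blob, storage=ObjectStorage), since TinyDB forwards its positional arguments to the storage class. Every write uploads the whole document, which fits the one-or-two-user case but not concurrent writers.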

Thanks for a great project!

galuszkak commented 3 years ago

@msiemens something is wrong with the CI/CD build: it pulls code from the master branch instead of from the PR, so it's not testing my changes...

galuszkak commented 3 years ago

@msiemens I dug into this and saw that it's not doing what you would expect:

[Screenshot: 2021-08-13 at 9:07:44 PM]

msiemens commented 3 years ago

Thanks for your effort, @galuszkak! I'm sure this is useful for cloud users, but I think the best place for this would be a separate extension. That way people can choose whether they need cloud storage integration or want to stay as lightweight as possible 🙂 For example, there's tinydb-appengine, which adds AppEngine support to TinyDB as a separate extension. Would it be possible for you to also implement Google Cloud Storage support as an extension? If so, I'd be happy to list it in the TinyDB extensions list in the documentation!

galuszkak commented 3 years ago

Hey @msiemens thanks for coming back.

So in my view, this is already implemented as an extension: by default, if you do pip install tinydb, nothing related to Google Cloud will be installed. If you do pip install tinydb[google-cloud-storage] then that functionality will be unlocked. So any functionality related to Google Cloud is an extra/extension on top of the original tinydb.

It's similar to how this is implemented in apache-airflow.
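
As a rough sketch of what that opt-in can look like at import time: the optional dependency is only imported when the cloud storage is actually used, and a missing package produces a hint about the extra. The helper name require_extra is hypothetical and the extra name is the one mentioned above. How the PR itself guards the import is not shown in this thread.

```python
import importlib
from types import ModuleType


def require_extra(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or explain which extra provides it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with an actionable message; the original error is chained.
        raise ImportError(
            f"{module_name!r} is not installed. "
            f"Install it with: pip install tinydb[{extra}]"
        ) from exc


# Hypothetical usage inside the cloud storage module:
# gcs = require_extra("google.cloud.storage", "google-cloud-storage")
```

A plain pip install tinydb never triggers the import, so the core package stays free of cloud dependencies.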

Personally, I don't want to create another python package on pypi with that small codebase, but it's my preference. If you don't believe this is the proper place, let's close this PR.

saurabh0719 commented 3 years ago

Hi @galuszkak @msiemens

I came across this PR while trying to build my own solution for an object store on AWS S3. TinyDB seems like a great fit for my use case (for reasons similar to those mentioned by @galuszkak).

I'd be happy to work on it for a couple of days and open a PR to support AWS S3 as a storage backend.

OR

If this is not the best place for it, @galuszkak, would you be open to the idea of creating a single repo where we could contribute storage backends for cloud providers? (My primary focus is on S3.)

msiemens commented 3 years ago

@galuszkak

> If you do pip install tinydb[google-cloud-storage] then that functionality will be unlocked.

Ah, I missed that. Sorry!

> Personally, I don't want to create another python package on pypi with that small codebase, but it's my preference.

I guess there are points to be made for both sides. As a user, having lots of small packages increases the chance that some of them are abandoned. Having Google Cloud Storage (and other cloud storage backends) in TinyDB would ensure that it gets maintained as long as TinyDB is maintained. But as a maintainer, I prefer having non-core functionality in external packages, as it spreads the maintenance and support load across more shoulders. As much as I'd love to make life as simple as possible for users, I have to admit that even now it takes me weeks to answer questions, issues and bug reports.

There are other projects that follow a similar path. My big inspiration is Flask, which isn't that big in itself (it's 8.5k lines of code at the moment) but has tons of packages that extend it in every possible way. But I understand that projects like Apache Airflow take a different approach that uses optional dependencies of some sort.

That being said, I'd prefer a TinyDB codebase that is as small and maintainable as possible, with extra functionality living in other PyPI packages. I nonetheless appreciate the effort you put into the Google Cloud Storage extension and would love to link to it in the documentation if you decide to release it as a PyPI package.

msiemens commented 3 years ago

As mentioned above, I'd prefer having this in a separate extension. If someone creates a repository and PyPI package, I'll gladly link to it in the documentation 🙂