Miserlou / NoDB

NoDB isn't a database... but it sort of looks like one.
https://blog.zappa.io/posts/introducing-nodb-pythonic-data-store-s3

Allow using custom AWS S3 host for docker development #17

Closed ivansabik closed 11 months ago

ivansabik commented 6 years ago

Problem

Need to be able to run NoDB in local environments, where AWS S3 can be some random container and not actually AWS S3 (for example using minio)

Solution

Manual testing

  1. Create a local S3 environment with Docker (I tested with minio listening on port 9000). After that, create a NoDB instance and change the host to point at the local one:

```
(.venv) britecorio@Ivans-MacBook-Pro:~/git/NoDB(master⚡) » docker run -p 9000:9000 minio/minio server /data
Created minio configuration file successfully at /root/.minio

Drive Capacity: 31 GiB Free, 60 GiB Total

Endpoint:  http://172.17.0.2:9000  http://127.0.0.1:9000
AccessKey: XCLV2JO1CXZV1U02DN55
SecretKey: cC6z0P3WhoPIZCSG7eK48/S12o93B2MTsc97Cse2
```

2. Log in to minio locally with the generated keys (http://localhost:9000/) and create a new bucket named `people`:
![image](https://user-images.githubusercontent.com/815440/40320482-7e21558c-5cf1-11e8-9ae1-0c95efe3830c.png)

3. Run this script (as a .py file, in ipython, etc.):
```python
import os

from nodb import NoDB

# CHANGE THESE TO BE THE ONES YOUR DOCKER MINIO GENERATED
os.environ['AWS_ACCESS_KEY_ID'] = 'XCLV2JO1CXZV1U02DN55'
os.environ['AWS_SECRET_ACCESS_KEY'] = 'cC6z0P3WhoPIZCSG7eK48/S12o93B2MTsc97Cse2'

nodb = NoDB()
nodb.aws_s3_host = 'http://localhost:9000'
nodb.bucket = 'people'
nodb.index = 'name'
nodb.human_readable_indexes = True

user = {'name': 'Rony', 'age': 4}
nodb.save(user)
```

4. Refresh the local minio browser and check that the file got stored to local S3.
coveralls commented 6 years ago

Coverage Status

Coverage increased (+1.0%) to 57.037% when pulling 40ea16f94afbb408d8b9a4ba2b3758e64f22d31c on ivansabik:master into 26124840dfffcf6234b4b067531a8ec40d7362dd on Miserlou:master.

ivansabik commented 6 years ago

@Miserlou added test to keep coverage, ready for review

ivansabik commented 6 years ago

I would be glad to rebase this one if you think it will ever be merged.

nephlm commented 6 years ago

Interesting use case. I haven't used minio before, but it looks like an interesting tool.

I don't think I have permissions to do a proper code review, but I've been doing them all day at work today so here are some thoughts:

Perhaps generate the resource object once, store it as a private instance variable, and have subsequent requests to the property return the stored value rather than making a new resource each time. I don't know how expensive instantiating the resource is, but it's got to be more expensive than caching it.

I assume that having an endpoint_url of None is the same as not passing one in. Should there be a test for this use case?

In the test, perhaps use isinstance() instead of str(type()).

Should there be an update to the documentation demonstrating this usage?

It's not up to me, but just some thoughts.

ivansabik commented 6 years ago

We use minio for local dev testing of s3 without needing to hit AWS. We do not use it in production at all. Agree with your comments, I could do those if this PR would ever be merged into this repo. Otherwise I'll keep using my fork :)

bendog commented 5 years ago

What about doing something like this to reduce the number of S3 resource inits and also allow more than just one type of S3 resource override?

```python
import boto3
import botocore.client


class NoDB(object):
    ...
    _s3 = None
    _s3_used_resource_settings = None
    s3_resource_settings = {}
    ...

    @property
    def s3(self):
        if not (self._s3 and self._s3_used_resource_settings == self.s3_resource_settings):
            # if the s3 resource settings have changed or s3 is not initialised,
            # set up the internal s3 resource
            self._s3 = boto3.resource(
                's3',
                config=botocore.client.Config(signature_version=self.signature_version),
                **self.s3_resource_settings
            )
            # remember the last used resource settings
            self._s3_used_resource_settings = dict(**self.s3_resource_settings)
        # return the internal s3 resource
        return self._s3
```
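The settings-keyed caching idea above can be demonstrated standalone (a sketch with a dummy builder standing in for `boto3.resource`, so it runs anywhere; all names here are illustrative, not NoDB's actual API):

```python
class CachedResource:
    """Rebuilds the wrapped resource only when its settings change."""
    _obj = None
    _used_settings = None
    settings = {}

    def _build(self, **settings):
        # Stand-in for boto3.resource('s3', **settings); returns a fresh
        # object each call so identity reveals whether a rebuild happened.
        return object()

    @property
    def resource(self):
        if not (self._obj and self._used_settings == self.settings):
            self._obj = self._build(**self.settings)
            self._used_settings = dict(self.settings)
        return self._obj


c = CachedResource()
first = c.resource
assert c.resource is first          # cached while settings are unchanged
c.settings = {'endpoint_url': 'http://localhost:9000'}
assert c.resource is not first      # rebuilt after settings change
```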