singularityhub / sregistry-cli

Singularity Global Client for container management
https://singularityhub.github.io/sregistry-cli/
Mozilla Public License 2.0

Adding workaround for boto3 metadata casing, and updating licenses #172

Closed vsoch closed 5 years ago

vsoch commented 5 years ago

This pull request will address the following:

@fenz would you care to test? For the docker-compose, you will need to change the image tag to the automated build of vanessa/sregistry-cli from this branch; the one that I changed it to will be the container deployed after merge :)

vsoch commented 5 years ago

Here is the build for the container you can change the docker-compose to to test! It would be like vanessa/sregistry-cli:fix_boto3-casing-bug (it's still building :))

fenz commented 5 years ago

@vsoch I changed the docker-compose as you suggested and tried it; everything seems to be fine:

(base) root@fa99de9faa3f:/code# sregistry search
[client|s3] [database|sqlite:////root/.singularity/sregistry.db]
[bucket:s3://s3.Bucket(name='mybucket')]
Containers
1  test/ubuntu:latest.simg  1-7-2019    29MB

the size is there ;) Thanks for the quick fix.

vsoch commented 5 years ago

awesome! Okay, merging and will update pypi soon.

vsoch commented 5 years ago

Here you go ! https://pypi.org/project/sregistry/0.1.32/

vsoch commented 5 years ago

Thanks again for your help with this @fenz, it was truly a pleasure.

fenz commented 5 years ago

Actually I didn't do much. I'll try to test it on another S3 Object Store (NetApp Storage Grid), maybe it would be good to test it on Ceph as well (using s3). In the meantime I wanted to ask you something else (please feel free to move the discussion elsewhere if this is not the right place for it). A good thing for me would be to define "bucket policies" for having anonymous "pull". I'm not sure whether this is a feature needed by the sregistry-cli or something related to the sregistry server (when it gives the chance to store images on an S3 object store instead of writing to the filesystem), but I started some investigation and I tested something.

Push to S3 bucket

import boto3
import json

s3 = boto3.resource('s3',
                    endpoint_url='http://172.17.0.2:9000',
                    aws_access_key_id='minio',
                    aws_secret_access_key='minio123')

## FROM https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-example-bucket-policies.html#set-a-simple-bucket-policy

bucket_name = 'public-bucket'
s3.create_bucket(Bucket=bucket_name)

# Create the bucket policy
bucket_policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Sid': 'AddPerm',
        'Effect': 'Allow',
        'Principal': '*',
        'Action': ['s3:GetObject'],
        'Resource': "arn:aws:s3:::%s/*" % bucket_name
    }]
}

# Convert the policy to a JSON string
bucket_policy = json.dumps(bucket_policy)

# The low-level S3 client is reachable from the "resource" object via s3.meta.client
# Set the new policy on the given bucket
s3.meta.client.put_bucket_policy(
    Bucket=bucket_name,
    Policy=bucket_policy
)

# Create a file to push (the with block closes the file automatically)
with open('publicTestBucket.txt', 'w') as f:
    f.write("test public content")

# Upload the local file 'publicTestBucket.txt' to the bucket 'public-bucket' with 'publicTestBucket' as the object name.
s3.Bucket(bucket_name).upload_file('publicTestBucket.txt', 'publicTestBucket')

Pull from public S3 bucket

import boto3
from botocore import UNSIGNED
from botocore.client import Config

s3 = boto3.resource('s3', 
                    endpoint_url='http://172.17.0.2:9000', 
                    config=Config(signature_version=UNSIGNED))

# Download the object 'publicTestBucket' from the bucket 'public-bucket' and save it to the local FS as publicTestBucket_down.txt
s3.Bucket('public-bucket').download_file('publicTestBucket', 'publicTestBucket_down.txt')

So pulling public images will not need a key and secret. I think if you integrate it in the sregistry (server), then you configure the S3 object store, the image policies will be managed by the registry itself, and this will not be needed. But for the sregistry-cli you may want to have something like the docker client, which allows you to push/pull private or public images based on USERNAME/PASSWORD configuration. Let me know what you think about this.

vsoch commented 5 years ago

I don't see why not! Would you mind copy pasting the above into a new issue?

A few questions! If we have to add the UNSIGNED parameter, are you suggesting to make this the default? If not, what would be the user interaction / workflow for both the creator of the bucket, and then a user (that may not know if it's public or not?)
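One possible answer to the "default" question, sketched under assumptions: sign requests whenever a key/secret pair is configured, and fall back to UNSIGNED only when it is not. The helper names `use_anonymous_access` and `make_s3_resource` are hypothetical (they are not part of sregistry-cli), and reading the standard `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` environment variables is an assumption about how the client would be configured.

```python
import os


def use_anonymous_access(env=None):
    """Return True when no access key/secret pair is configured (hypothetical helper)."""
    env = os.environ if env is None else env
    return not (env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"))


def make_s3_resource(endpoint_url, env=None):
    """Create a boto3 S3 resource: signed if credentials exist, unsigned otherwise."""
    # Imported here so the decision helper above can be used without boto3 installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.client import Config

    env = os.environ if env is None else env
    if use_anonymous_access(env):
        # No key/secret: send unsigned requests, which only work where the
        # bucket policy allows anonymous access.
        return boto3.resource("s3", endpoint_url=endpoint_url,
                              config=Config(signature_version=UNSIGNED))
    return boto3.resource("s3", endpoint_url=endpoint_url,
                          aws_access_key_id=env["AWS_ACCESS_KEY_ID"],
                          aws_secret_access_key=env["AWS_SECRET_ACCESS_KEY"])
```

With this shape, a user who never exports credentials would only ever reach objects the bucket owner has made public, and the bucket creator's workflow stays unchanged.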

fenz commented 5 years ago

I can open a new issue, I'm just still not sure it makes sense as a feature. The only reason I would have this is if I need to share the image; in this case I just need to share the URL of the S3 object store, and the "search" command will show me only the public images (if I didn't set KEY/SECRET). I guess it would be better if I could set a "readonly" policy on a single file instead of the whole bucket, and maybe be able to change this policy to have a "share/unshare" feature, because in this scenario there are just 2 entities: the owner of the object store (minio/minio123) and a generic user. Does it make more sense to treat this client more like Google Drive than like the docker client?
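A rough sketch of the per-file "readonly" idea, assuming the object store honors object-level resource ARNs in bucket policies (not verified here for minio or other S3 providers). The function `public_read_policy` and the object names are hypothetical; the policy fields mirror the bucket-wide example earlier in the thread, but the Resource lists only the shared keys instead of the whole bucket.

```python
import json


def public_read_policy(bucket_name, shared_keys):
    """Build a policy allowing anonymous GetObject on the listed keys only."""
    resources = ["arn:aws:s3:::%s/%s" % (bucket_name, key)
                 for key in sorted(shared_keys)]
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "PublicReadSharedObjects",
            "Effect": "Allow",
            "Principal": "*",
            "Action": ["s3:GetObject"],
            "Resource": resources,
        }],
    })


# "Share" and "unshare" become set operations followed by re-applying the policy.
shared = {"publicTestBucket"}
shared.add("anotherObject")         # share a second (hypothetical) object
shared.discard("publicTestBucket")  # unshare the first one
policy = public_read_policy("public-bucket", shared)
```

The resulting string could be passed to `s3.meta.client.put_bucket_policy(Bucket=..., Policy=policy)` as in the push example. When nothing is shared any more, removing the policy with `delete_bucket_policy` is probably cleaner than applying one with an empty resource list.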

vsoch commented 5 years ago

@fenz my general intuition is that the permissions should be managed by the registry, and not by sregistry cli that simply interacts (and honors them). In that the sregistry client is linked to one user, it's most logical to me that the default limits access to that user (unless he/she decides to open up to larger audience). That last step I think is out of scope for sregistry client to manage, but again, sregistry-cli would honor permissions that are set in another way.

vsoch commented 5 years ago

For example, minio has a client that would be the right tool to do this -> https://docs.minio.io/docs/minio-client-complete-guide

fenz commented 5 years ago

This is why I wanted to ask for your opinion before starting any new issue :) Anyway I fully agree to have it managed by the Sregistry server. So just let me know how we proceed for the integration with sregistry, should this S3 client be tested with other S3 providers (like Ceph or AWS) before starting the integration with the server?

vsoch commented 5 years ago

If you have the cloud server (minio) and several clients to interact with it, why introduce another layer by adding sregistry? Just curious about the need.

fenz commented 5 years ago

I don't know if I got the question right but I'll try to give an overview of my need. Having the S3 client integrated in the sregistry will allow storing images directly in an object store instead of using a volume mounted as FS. Usually you don't have a lot of storage capacity on the host where you run the sregistry; in principle you can mount any external storage as a volume (there's a docker storage driver for S3 as well) but it will add overhead. So the idea is that when you push an image to the sregistry, it sends an "upload" request to the S3 object store instead of writing to the mounted volume. Does this answer your question? And... does it make sense?

vsoch commented 5 years ago

@fenz with the sregistry client and s3, you can already accomplish this goal - sregistry client lives on the host, and containers in storage. If you add sregistry (server) you are just adding another layer - the sregistry is akin to minio but it just uses it for the storage. And then you would still pull containers from it to your cluster. What do you gain by adding sregistry server over just using the client to pull directly from s3?

fenz commented 5 years ago

@vsoch sorry for the delay. I know you can do the same with the sregistry-cli (and I'll probably do this) but sregistry server gives you the web interface and user management; it is a complete registry, not just storage. Actually I started to deploy a sregistry server and I figured out I was not happy with the storage solution because I needed the registry to store on an S3 (for the reasons I mentioned before). The first solution was to mount the S3 object store as a volume but I didn't want to have other interfaces involved. To recap, the initial thought was to allow sregistry storage (https://singularityhub.github.io/sregistry/install-server#storage) customization, giving the chance to use it with a different storage solution (S3 in this case). Anyway this was just to try to explain my point of view a bit better, but I think I will just use the sregistry-cli anyway. This means your S3 client implementation will work for me and I want to thank you again for your effort.

vsoch commented 5 years ago

Ah I understand, so you want a portal for your users (and not just the client). I can definitely think about how to go about this.

vsoch commented 5 years ago

hey @fenz ! I started working on this yesterday, and will be updating the work here -> https://github.com/singularityhub/sregistry/pull/172

Note that this is a fairly large change (all views and endpoints need to be customized / changed) so set your time expectation appropriately! So far I've just started reorganizing the models and thinking about how the user will specify wanting a custom storage.

fenz commented 5 years ago

@vsoch I was also guessing this is going to take a bit to be implemented so let's say my time expectation is quite flexible. We can continue the discussion about this update in the other thread.