Closed vsoch closed 4 years ago
okay here is an update after working today. I was able to at least get a signature (and the associated headers) into the signed URL by adding it like this:
from botocore.client import Config
...
s3_external = session.client(
    "s3",
    use_ssl=MINIO_SSL,
    region_name=MINIO_REGION,
    endpoint_url=MINIO_HTTP_PREFIX + MINIO_EXTERNAL_SERVER,
    verify=False,
    config=Config(signature_version="s3v4"),
)
and from this comment I think the (still remaining) signature mismatch is because s3v4 uses the host to validate (which is different inside the container, minio:9000, versus outside the container, 127.0.0.1:9000).
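For intuition, here is a stripped-down sketch of why the host matters (this is not the full SigV4 algorithm; it omits query parameters, URI encoding, and most headers, and the secret key is a placeholder): the canonical request that gets signed embeds the Host header, so the same request signed for minio:9000 and for 127.0.0.1:9000 yields different signatures.

```python
import hashlib
import hmac

def _hmac(key, msg):
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()

def sigv4_signature(secret_key, host, amz_date="20200405T222357Z",
                    region="us-east-1"):
    """Simplified SigV4 signature for a GET / with only the host signed."""
    date = amz_date[:8]
    # The canonical request embeds the Host header, so a URL signed
    # for "minio:9000" can never validate when fetched via "127.0.0.1:9000".
    canonical_request = "\n".join([
        "GET", "/", "",
        "host:" + host, "",
        "host",
        "UNSIGNED-PAYLOAD",
    ])
    scope = "/".join([date, region, "s3", "aws4_request"])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode()).hexdigest(),
    ])
    # Key derivation: date -> region -> service -> "aws4_request"
    key = _hmac(("AWS4" + secret_key).encode(), date)
    for part in (region, "s3", "aws4_request"):
        key = _hmac(key, part)
    return hmac.new(key, string_to_sign.encode(), hashlib.sha256).hexdigest()

print(sigv4_signature("minio123", "minio:9000") ==
      sigv4_signature("minio123", "127.0.0.1:9000"))  # False
```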
So my thinking is that one of the following needs to be done:
And here are full logs from minio:
minio
minio [REQUEST s3.HeadBucket] 22:23:56.991
minio HEAD /sregistry/
minio Host: minio:9000
minio X-Amz-Date: 20200405T222356Z
minio Accept-Encoding: identity
minio Authorization: AWS4-HMAC-SHA256 Credential=minio/20200405/us-east-1/s3/aws4_request, SignedHeaders=host;user-agent;x-amz-content-sha256;x-amz-date, Signature=0ea18b5f4f1a98901dd99c33cc38a164e0018c4282e78499d7d5b04e56b9d2cd
minio Content-Length: 0
minio User-Agent: MinIO (Linux; x86_64) minio-py/5.0.8
minio X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
minio
minio [RESPONSE] [22:23:56.991] [ Duration 171µs ↑ 93 B ↓ 218 B ]
minio 200 OK
minio Vary: Origin
minio X-Amz-Request-Id: 16030CAE690AF085
minio X-Xss-Protection: 1; mode=block
minio Accept-Ranges: bytes
minio Content-Length: 0
minio Content-Security-Policy: block-all-mixed-content
minio Server: MinIO/RELEASE.2020-04-04T05-39-31Z
minio
minio
minio [REQUEST s3.NewMultipartUpload] 22:23:57.186
minio POST /sregistry/test/chonker%3Asha256.92278b7c046c0acf0952b3e1663b8abb819c260e8a96705bad90833d87ca0874?uploads
minio Host: minio:9000
minio X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
minio X-Amz-Date: 20200405T222357Z
minio Accept-Encoding: identity
minio Authorization: AWS4-HMAC-SHA256 Credential=minio/20200405/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=3481893c75d10f99a20c1f4b181971ff845c59a9235076d9576ed71c2a88ea92
minio Content-Length: 0
minio User-Agent: Boto3/1.12.36 Python/3.5.7 Linux/5.3.0-42-generic Botocore/1.15.36
minio
minio [RESPONSE] [22:23:57.186] [ Duration 490µs ↑ 93 B ↓ 580 B ]
minio 200 OK
minio Server: MinIO/RELEASE.2020-04-04T05-39-31Z
minio Vary: Origin
minio X-Amz-Request-Id: 16030CAE74AF4D9B
minio X-Xss-Protection: 1; mode=block
minio Accept-Ranges: bytes
minio Content-Length: 330
minio Content-Security-Policy: block-all-mixed-content
minio Content-Type: application/xml
minio <?xml version="1.0" encoding="UTF-8"?>
<InitiateMultipartUploadResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Bucket>sregistry</Bucket><Key>test/chonker%3Asha256.92278b7c046c0acf0952b3e1663b8abb819c260e8a96705bad90833d87ca0874</Key><UploadId>b6d2946e-c2d3-400d-b2a4-897d421de5f6</UploadId></InitiateMultipartUploadResult>
minio
127.0.0.1 [REQUEST s3.PutObjectPart] 22:23:57.809
127.0.0.1 PUT /sregistry/test/chonker%3Asha256.92278b7c046c0acf0952b3e1663b8abb819c260e8a96705bad90833d87ca0874?partNumber=1&uploadId=b6d2946e-c2d3-400d-b2a4-897d421de5f6&X-Amz-Date=20200405T222357Z&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-SignedHeaders=host&X-Amz-Credential=minio%2F20200405%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Expires=5&X-Amz-Signature=95e243bd0bb9d0103f3cfecd8d60285544289dbfd4afd4ea7fb3e18e3605fede
127.0.0.1 Host: 127.0.0.1:9000
127.0.0.1 Accept-Encoding: gzip
127.0.0.1 Content-Length: 928
127.0.0.1 User-Agent: Go-http-client/1.1
127.0.0.1 X-Amz-Content-Sha256: 2fc597b42f249400d24a12904033454931eb3624e8c048fe825c360d9c1e61bf
127.0.0.1 <BODY>
127.0.0.1 [RESPONSE] [22:23:57.809] [ Duration 2.014ms ↑ 68 B ↓ 818 B ]
127.0.0.1 403 Forbidden
127.0.0.1 Server: MinIO/RELEASE.2020-04-04T05-39-31Z
127.0.0.1 Vary: Origin
127.0.0.1 X-Amz-Request-Id: 16030CAE99BC2C2E
127.0.0.1 X-Xss-Protection: 1; mode=block
127.0.0.1 Accept-Ranges: bytes
127.0.0.1 Content-Length: 561
127.0.0.1 Content-Security-Policy: block-all-mixed-content
127.0.0.1 Content-Type: application/xml
127.0.0.1 <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>SignatureDoesNotMatch</Code><Message>The request signature we calculated does not match the signature you provided. Check your key and signing method.</Message><Key>test/chonker%3Asha256.92278b7c046c0acf0952b3e1663b8abb819c260e8a96705bad90833d87ca0874</Key><BucketName>sregistry</BucketName><Resource>/sregistry/test/chonker%3Asha256.92278b7c046c0acf0952b3e1663b8abb819c260e8a96705bad90833d87ca0874</Resource><RequestId>16030CAE99BC2C2E</RequestId><HostId>7f978b0a-2482-4918-94e4-a7ed69e11a59</HostId></Error>
127.0.0.1
The first ones at the top are from when the sregistry uwsgi container is started up / restarted; it just creates a MinioClient (which is successfully used for a PUT pre-signed URL without the multipart bit). Then for the multipart we need to use s3 clients (those are s3 and s3_internal in minio.py), and those make the following requests. You can see the upload request being created by the s3 client, and then the PUT URL is generated from s3_external.
s3v4 requires preserving the host header, i.e. using the host when generating the pre-signed signature; it is present to avoid MITM attacks on the payload.
@harshavardhana what would you suggest for generating the signed URLs inside the docker container (minio:9000), where the hostname is different from the external client (outside the container, 127.0.0.1:9000). I tried adding the hostname "minio" to point to 127.0.0.1 in /etc/hosts so I could use the s3 client internally for all calls, but I got the same error. Could it be something else, or is there another header or setting that I'm missing?
Do you work at Minio? Is there an equivalent function to generate signed URLs, but for a multipart upload? I noticed that the put_object request will use multipart above some size threshold, but that would be called directly from the client making the request. We'd need to generate the URLs in the same fashion as the single PUT request.
@vsoch depends on who is going to access the URL to upload/download
To use rotating credentials you can use the AssumeRole mechanism, which allows for using AWS SDKs directly without generating pre-signed URLs and also without sharing the static credentials.
Your server component should ensure to generate these temporary credentials and pass them to the client:
https://github.com/minio/minio/blob/master/docs/sts/assume-role.md
If it's an external client which only has visibility of the external IP of the setup, then it should be the external IP.
@harshavardhana it's an external client, and I do create separate clients (s3 for making the internal request to start the multipart upload, and s3_external to generate the pre signed urls). See here for creating the clients and here for the requests.
Your server component should ensure to generate these temporary credentials and pass them to the client
I could pass this to the client, but I have no control over the client implementation - specifically it's using the scs-library-client so I have no way of changing those calls. Are you saying there is some way of wrapping this with the generation of pre-signed urls to just change how the authentication is done?
I could pass this to the client, but I have no control over the client implementation - specifically it's using the scs-library-client so I have no way of changing those calls. Are you saying there is some way of wrapping this with the generation of pre-signed urls to just change how the authentication is done?
No, there is no such thing; either you use pre-signed URLs or AssumeRole - you have to choose.
@harshavardhana it's an external client, and I do create separate clients (s3 for making the internal request to start the multipart upload, and s3_external to generate the pre signed urls). See here for creating the clients and here for the requests.
If it's an external client then you need to use the external IP. Keeping separate constructors for both is the correct approach; you can, of course, share the connection pooling by passing in the same urllib3 pool manager.
okay, I'll need to run more details by you tomorrow then, because (per our conversation now) I've done everything right thus far, but still there is the signature mismatch! I think it might be related to the (internal) client making the start request, the HostID returned in metadata is empty. I don't know if the UploadID passed on to the external_s3 client then creates a request that tries to match the requesting HostID to the one that started the request (and then gets an empty string vs. what looks like an md5 of something). I think the trick might be figuring out how to define the HostID for the start multipart request, so it matches for the external_s3 signed url request that gets checked against the start request with the UploadID?
Another thing I'll try tomorrow is to see what the "correct" headers look like for a working (single) PUT request here and then maybe I can use those same signing functions (with the correct headers) to update the signature generated for my current (not working) pre-signed urls.
okay, I'll need to run more details by you tomorrow then, because (per our conversation now) I've done everything right thus far, but still there is the signature mismatch! I think it might be related to the (internal) client making the start request, the HostID returned in metadata is empty. I don't know if the UploadID passed on to the external_s3 client then creates a request that tries to match the requesting HostID to the one that started the request (and then gets an empty string vs. what looks like an md5 of something). I think the trick might be figuring out how to define the HostID for the start multipart request, so it matches for the external_s3 signed url request that gets checked against the start request with the UploadID?
It is not just about generating the correct presigned URL: let's say you generated a presigned URL for the external client, then the internal client cannot use it. Likewise, an internal client's URL cannot be used by the external client - both will see a signature mismatch due to the host header mismatch.
HostID is not useful in the response; we can ignore that for now. It's just some unique ID, but it is largely not that meaningful in MinIO.
I am generating it with the external client (but from the internal host) and of course using it for the external client. As a sanity check, the same strategy with a minioClient and minioExternalClient generated with minio-py for a single presigned URL for a PUT works like a charm! It’s the multipart support for presigned URL that is missing. This is why I thought it might be useful to print out headers that are generated by minio-py for the working single PUT in case I need to add them for signature generation.
hey @harshavardhana ! I joined the Minio slack and posted a bunch of information - let me know if / when you and/or your team are around so we can debug together! I'll keep trying things in the meantime :)
okay I've just pushed the (now working!) solution for pre-signed multipart uploads - the issue was that the presign_v4 function was using an unsigned hash string, and I needed to copy the function and pass in the sha256 that is sent from my calling client. Finally, I've added the CompleteMultipartUpload view to finalize everything.
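The actual fix adapts minio-py's presign_v4, but the idea can be sketched standalone: a simplified SigV4 query presign (placeholder keys; a real implementation also handles URI path encoding and extra headers) where the final line of the canonical request is the client's payload sha256 rather than an unsigned placeholder.

```python
import hashlib
import hmac
from urllib.parse import quote

def _hmac(key, msg):
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()

def presign_put_part(secret_key, access_key, host, path, query_params,
                     content_sha256, amz_date="20200405T222357Z",
                     region="us-east-1", expires=300):
    """Simplified SigV4 query presign: the canonical request's last line
    is the real payload hash instead of an unsigned placeholder."""
    date = amz_date[:8]
    scope = "/".join([date, region, "s3", "aws4_request"])
    params = dict(query_params)
    params.update({
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": access_key + "/" + scope,
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    })
    canonical_query = "&".join(
        quote(k, safe="") + "=" + quote(v, safe="")
        for k, v in sorted(params.items()))
    canonical_request = "\n".join([
        "PUT", path, canonical_query,
        "host:" + host, "",
        "host",
        content_sha256,  # <-- the fix: the sha256 sent by the calling client
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode()).hexdigest()])
    key = _hmac(("AWS4" + secret_key).encode(), date)
    for part in (region, "s3", "aws4_request"):
        key = _hmac(key, part)
    signature = hmac.new(key, string_to_sign.encode(), hashlib.sha256).hexdigest()
    return "http://{}{}?{}&X-Amz-Signature={}".format(
        host, path, canonical_query, signature)

url = presign_put_part(
    "minio123", "minio", "127.0.0.1:9000",
    "/sregistry/test/container.sif",
    {"partNumber": "1", "uploadId": "example-upload-id"},
    hashlib.sha256(b"part bytes").hexdigest())
```

Because the payload hash is baked into the signature, a part uploaded with different bytes than the ones hashed will be rejected, which is exactly the validation that was failing before.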
@RonaldEnsing @Aneoshun @lmcdasm @asmi10 you've all had recent issues with uploading large images or scaling, and I want to invite you to possibly review this PR if you have the chance! Specifically, I've added a minio storage backend, which can serve the Sylabs library endpoints and also support multipart upload. I suspect that this would also allow for a custom setup where a center deploys a local Minio setup and is then able to use the same endpoints for the same registry! I would like to have a few testers give this a spin before merging - I'm fairly confident that this is a much better implementation than the current one, but I want to have others look for a sanity check.
For those of you that have existing registries - do not test on top of your existing registry! Since minio is a new storage endpoint, I was careful to not have it use the previous folder bound to the host (images), but I still wouldn't want to take any risk of containers being recreated, at least for the time being. Looking forward to hearing what you think! I had a great time working on this... Minio is awesome, and there are so many ways you can further configure it! -> https://github.com/minio/minio/tree/master/docs/config
My sincere apologies for the delay, the last few weeks have been more than hectic. It is only now that I finally have time to test your incredible PR.
No worries! Times certainly aren't normal right now.
In short: it solves all my issues. The upload/download speed seems better. The memory footprint during upload is marginal, regardless of the size of the image (I can upload a 3GB image on a SRegistry server with only 2GB; that was simply impossible before). I got screenshots of before/after memory usage, and the difference is obvious.
This is so great to hear! I was hoping this would be the case.
During the build of the container I see this error, maybe it is fixable:
ERROR: awscli 1.18.66 has requirement rsa<=3.5.0,>=3.1.2, but you'll have rsa 4.0 which is incompatible.
It looks like there is heated discussion about this, and it's relatively new, so we should stay on the lookout for an update. I've subscribed to the issue and opened #300 here to track it; I think that's probably the most I can do. If you try it out and there is an issue with https, then we can downgrade.
It seems that the download button for the uploaded containers has disappeared on the web UI (I can only see the "freeze" button). I don't know if it is like this on purpose or not.
Since we use minio now which requires a more robust auth flow, I thought it was best to remove this.
For some reason, after uploading 1 container to an empty collection, the collection list (on the web UI) tells me that I have 3 builds. When I click on the collection to see the container list, I only see one container (the one I just uploaded).
I'm not able to reproduce this. Could it be that you have a postgres container that was used previously and not updated (e.g., you pulled the repo down on top of a previous install and didn't remove and re-create the database)?
After reading the documentation, I am a bit confused about the usage of the images and minio-images folders. It looks like I cannot start docker-compose if both of these folders are created. However, only minio-images seems to be used (as explained in the doc), while images is just filled with empty folders named ./0 to ./9. I don't know if this is normal.
This exists purely to still provide some sort of fallback support. If a registry needs to somehow exist with both, I decided to use a different name so there isn't a conflict. If you use minio, you can safely ignore the images folder; it isn't doing harm being there.
Thanks a lot @vsoch for your work. I am really looking forward to merging this PR and using it daily.
Definitely! I'll push some of the documentation changes soon for you to take a look at. And to debug your 3 containers per collection issue - maybe shell into the server and do a collection.containers.count()? And try to remove and re-create the postgres image?
I'm not able to reproduce this. Could it be that you have a postgres container that was used previously and not updated (e.g., you pulled the repo down on top of a previous install and didn't remove and re-create the database)?
I cloned this PR in a new folder, stopped my running SRegistry instance, built the new SRegistry containers, and started everything from scratch from there. My previous instance has probably hundreds of containers, so if it was a DB collision, then I would expect to see a lot more ghost containers.
Definitely! I'll push some of the documentation changes soon for you to take a look at. And to debug your 3 containers per collection issue - maybe shell into the server and do a collection.containers.count()? And try to remove and re-create the postgres image?
Could you walk me through this a bit more? Which container of the compose shall I shell into? db? From there, in a bash shell, where shall I invoke collection.containers.count()?
Could you walk me through this a bit more? Which on which container of the compose shall I shell into? db? From there, in a bash shell, where shall I invoke collection.containers.count()?
Definitely! It's actually just the uwsgi container that you want. So if you are currently stopped, start containers:
$ docker-compose up -d
$ docker-compose ps
Name Command State Ports
--------------------------------------------------------------------------------------------
sregistry_db_1 docker-entrypoint.sh postgres Up 5432/tcp
sregistry_minio_1 /usr/bin/docker-entrypoint ... Up 0.0.0.0:9000->9000/tcp
sregistry_nginx_1 nginx -g daemon off; Up 443/tcp, 0.0.0.0:80->80/tcp
sregistry_redis_1 docker-entrypoint.sh redis ... Up 6379/tcp
sregistry_scheduler_1 python /code/manage.py rqs ... Up 3031/tcp
sregistry_uwsgi_1 /bin/sh -c /code/run_uwsgi.sh Up 3031/tcp
sregistry_worker_1 python /code/manage.py rqw ... Up 3031/tcp
Now we want to shell into the uwsgi container. The worker would give us equivalent access, but I typically use uwsgi.
$ docker exec -it sregistry_uwsgi_1 bash
This puts us in the root of the container, /code, which is actually mounted from your repository. Be careful making changes from within the container, because it will change permissions on your host, which typically you don't want.
Next, Django has a lovely interface via the manage.py module right there. You can easily get different kinds of shells, show URLs, and interact with a lot of custom functions exposed by modules installed in the registry. Just try doing:
python manage.py --help
to check it out. Of course we want an interactive Python shell into the application so we can do:
python manage.py shell
Once we are in Python, you can import your models and query them!

>>> from shub.apps.main.models import *
>>> Container.objects.count()
1
>>> Collection.objects.count()
1
>>> for collection in Collection.objects.all():
...     print(collection)
collection:1
>>> Collection.objects.filter(name="collection")
<QuerySet [<Collection: collection:1>]>

And of course each collection is related to its containers:

>>> collection.containers.count()
1

and you can get attributes:

>>> collection.private
False
>>> collection.name
'collection'
>>> collection.owners.first()
<User: vsoch>
So we would be interested in the count of containers in the collection, and if you see 3, to loop over them and print the names, tags, etc.
Thanks! So count() outputs 4, while I only have 2 containers. Here are the names of the containers:
<QuerySet [<Container: gitlab-ci/airl_env:DUMMY-2f3832c0-5c9d-4b08-9c65-d769db45f9ba>, <Container: gitlab-ci/airl_env:DUMMY-304812db-c526-40af-bc05-d61368aba5c3>, <Container: gitlab-ci/airl_env:pytorch>, <Container: gitlab-ci/airl_env:base_ci>]>
As you can see I have 2 Dummy containers. I don't know why. They appear in the counts in the UI, but are not listed with the other containers.
Ahh! I might know what happened. After cloning the repo, I copied the config.py file from my original Registry (to have all my settings), which obviously led to an error because of the missing MinIO configuration. Then I had a second issue because I forgot to set the URL of the server here. Because of those two errors, I made two failed attempts at pushing a container. Could it be that the Dummy containers are created at this point, but not removed?
That’s what the tag suggests! :)
There is actually a task that handles cleaning these up; I didn't find a way to do it easily at the moment it happens: https://github.com/singularityhub/sregistry/blob/a6b0c0d9d9ac67d240ebc295ef610af96d3e84fc/shub/apps/api/management/commands/cleanup_dummy.py. You should be able to run this with manage.py too:
python manage.py cleanup_dummy
Does that resolve all the issues then? Ready for merge?
Yes, cleanup_dummy removed this. Thanks a lot! I think it is ready to merge!
This is an extension of #297 that includes multipart upload, based on the example here. I've successfully been able to:
However, there is some error triggered by the scs-library-client in this function, so that it calls the _multipart_abort endpoint and the request does not finish. I have a feeling it has something to do with a parameter missing from the request; for example, here are the differences between a working PUT request (non-multipart) and a not-working single part:
This URL works (a single PUT request for minio, generated with presigned_put_url):
And note that the only way I am able to do this is by instantiating two minio clients, one with an internal url (inside the container the server is minio:9000), and one with an external url (from the client the server is 127.0.0.1:9000). That was tricky but seemed to work (the work in #297).
This next URL is generated based on the example here and I again needed to create an s3 client to handle an internal request to start the multipart upload:
And then a separate client (s3_external) to generate URLs with the upload id generated from above:
The above generates a url that looks like this and does not work as the Singularity client aborts and calls _multiupload_abort:
I was just able to enable debug output for Singularity, and then use the mc client for minio to determine the next issue to address:
Likely I need to look into how to make sure the correct signature version / method is being used; I'm not sure what the default is, but I don't think it's the same one that minio expects.