bentoml / yatai-image-builder

🐳 Build OCI images for Bentos in k8s

Undocumented permissions needed for "ensure image exists" check #48

Open tmyhu opened 11 months ago

tmyhu commented 11 months ago

After installing yatai-image-builder 1.2.12 according to the documentation for ECR and creating a BentoRequest, I get this error in the logs:

yatai-image-builder-7967749646-zl6d8 manager 1.6993992561239429e+09 ERROR   Failed to reconcile BentoRequest.   {"controller": "bentorequest", "controllerGroup": "resources.yatai.ai", "controllerKind": "BentoRequest", "BentoRequest": {"name":"example","namespace":"yatai"}, "namespace": "yatai", "name": "example", "reconcileID": "67095ba4-ff81-4343-ae49-043877ef4172", "bentoRequest": "example", "bentoRequestNamespace": "yatai", "error": "ensure image exists: check image ACCOUNTID.dkr.ecr.us-west-2.amazonaws.com/yatai-bentos:yatai.example.zg42etd5xoau55v5 exists: list ECR images: NoCredentialProviders: no valid providers in chain.
(...)

The documentation only mentions permissions that need to be configured for the yatai-image-builder-pod service account in the yatai namespace. After adding the same permissions for the yatai-image-builder service account in the yatai-image-builder namespace, the error is resolved.

It would be great to have that documented with the minimum permissions needed for yatai-image-builder.

lurecas commented 9 months ago

Hey @tmyhu, I am also configuring yatai-image-builder and I am facing a similar issue, although I am using GCR instead of ECR. I haven't fully understood how you solved it. Can you share a bit about what you did?

I know this was months ago, but I wanted to take a shot; this is keeping me from fully testing Yatai. Thanks!

tmyhu commented 9 months ago

Hi @lurecas , for ECR I had to add the eks.amazonaws.com/role-arn annotation in the yatai-image-builder Helm values under serviceAccount.annotations. This gave the yatai-image-builder service account access to ECR.
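As a rough illustration of the fix described above, the following Python sketch generates a Helm values override that puts the eks.amazonaws.com/role-arn annotation under serviceAccount.annotations. The role name and output format are assumptions made for the example: the ARN is a placeholder, and the snippet prints JSON on the assumption that it is passed to Helm as a values file (JSON being a subset of YAML). It does not say which IAM permissions the role needs; the thread leaves that open.

```python
import json


def image_builder_values(role_arn: str) -> dict:
    """Build a Helm values override that annotates the service account.

    Only the key path serviceAccount.annotations and the annotation name
    eks.amazonaws.com/role-arn come from the thread; everything else here
    is illustrative.
    """
    return {
        "serviceAccount": {
            "annotations": {
                "eks.amazonaws.com/role-arn": role_arn,
            }
        }
    }


# Placeholder ARN: ACCOUNTID and the role name are hypothetical.
values = image_builder_values(
    "arn:aws:iam::ACCOUNTID:role/yatai-image-builder-ecr"
)

# Print the override so it can be saved to a values file.
print(json.dumps(values, indent=2))
```

The printed document could be saved to a file and supplied when installing or upgrading the yatai-image-builder chart, replacing the placeholder ARN with a role that has ECR access.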

I don't think that is applicable for you with GCR, though. Looking at the Yatai docs again, it seems that for GCR you need to fill out the Helm values under dockerRegistry with the credentials, e.g. dockerRegistry.username and dockerRegistry.password. If you are getting a login error to GCR, you may need to double-check those values.

P.S. Since you are at the testing stage, you may be interested to know that one of the people at Bento told me on Slack that they plan to deprecate and discontinue Yatai, so I've abandoned our evaluation.

lurecas commented 9 months ago

Thanks @tmyhu, after banging my head against the wall for several hours, I managed to fix it. The error was not very clear, but the root cause was the authentication with GCR. I had created the secret incorrectly: I encoded the service account file, but I did not need to.

> P.S. Since you are at the testing stage, you may be interested to know that one of the people at Bento told me on Slack that they plan to deprecate and discontinue Yatai, so I've abandoned our evaluation.

Ok, that's a little bit disheartening 😓. Was this posted in a public channel, or was it in a direct message? Thanks for sharing this info.

Let me ask you one more thing: have you evaluated any other replacement for BentoML/Yatai? We are currently looking for alternatives to our "custom" MLflow-based architecture, and we are unsure whether we should go for "full vendor lock-in" with something like Vertex AI or SageMaker or try to adopt BentoML (well, the Yatai integration was one of the main draws of Bento, so I don't know).

yetone commented 9 months ago

@lurecas

You can successfully complete this configuration by following the official Yatai documentation, which means first downloading the JSON key file from GCR and then using the content of the JSON file as the password for the Docker registry.

yetone commented 9 months ago

@lurecas

According to Google's documentation, the docker registry username is _json_key, and the password is the content of the JSON file.
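A minimal sketch of that rule in Python, assuming the credentials end up in the dockerRegistry.username and dockerRegistry.password Helm values mentioned earlier in the thread. The function name and the sample key are hypothetical. The check rejects input that is not a raw JSON object, which is what a base64-encoded key file looks like; that is the mistake described above.

```python
import json


def gcr_registry_values(key_json: str) -> dict:
    """Return Helm values for GCR auth using a raw JSON key as the password.

    key_json must be the unmodified text of the service account key file.
    """
    try:
        parsed = json.loads(key_json)
    except json.JSONDecodeError as exc:
        # A base64-encoded key file fails here because it is not JSON.
        raise ValueError("password must be the raw JSON key content") from exc
    if not isinstance(parsed, dict):
        raise ValueError("password must be a JSON object (the key file)")

    return {
        "dockerRegistry": {
            "username": "_json_key",  # fixed username for JSON key auth
            "password": key_json,     # raw file content, not encoded
        }
    }


# Hypothetical, heavily shortened key used only for illustration.
sample_key = '{"type": "service_account", "project_id": "example-project"}'
registry_values = gcr_registry_values(sample_key)
print(registry_values["dockerRegistry"]["username"])
```

Other dockerRegistry fields that the chart may require are left out here because the thread does not name them.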


lurecas commented 9 months ago

Thanks for jumping in, @yetone. I finally made it work. It was probably my fault; I had some problems fully understanding this authentication method.

Since Google is deprecating Container Registry in May, do you folks intend to support Artifact Registry instead? Thanks again.

yetone commented 9 months ago

@lurecas

The configuration methods for Artifact Registry and Container Registry are exactly the same; there is no difference at all, so please feel free to configure it.

tmyhu commented 9 months ago

@lurecas It was a private message, so I'm not sure about public comms.

> Let me ask you one more thing, have you evaluated another replacement for BentoML/Yatai?

Yes, we've evaluated KServe and Ray and will be going with Ray, though both seemed like very valid options for us.