Closed TheDarkTrumpet closed 1 year ago
Hi @TheDarkTrumpet - thank you so much for the feedback. We've been getting tons of feedback on the Yatai architecture lately, particularly on how it fits into existing Kubernetes infrastructure and how to easily configure Yatai to use existing cluster resources. That is what led to the constant changes in the main branch for both Yatai and Yatai helm chart in the past week or two. We are also in the middle of a big refactoring that will drastically simplify the installation process for users, both for prototype and for production use. We expect it to be released and become stable in the next 2-3 weeks.
cc @yubozhao @yetone could you help confirm the current stable version of Yatai and yatai helm chart?
Thanks @parano for the quick feedback on this. I'm also glad to see this is something that'll be simplified, and I found the tunnel change to be a welcome thing (vs editing /etc/hosts and using minikube tunnel). Also explains the number of ways I've seen it break over the past few days.
If there's a stable chart/version that I can use, I'd appreciate that. The presentation is on Wednesday at noon, and it'll be covered toward the end. I have a script I've been working on for the hands-on portion (after slides). https://github.com/TheDarkTrumpet/MLOpsTest/tree/master/bentoml/cat-toy -- it's the last bit, but if I can show parts of it that would be great, if not I can mention how it works more in theory and try to follow up with information after things stabilize a bit.
@TheDarkTrumpet thanks for the info
I would like to set up a zoom call and help you get things running before your demo on Wednesday.
You can reach me in the community Slack channel (join here), or you can email me at bo@bentoml.com.
@TheDarkTrumpet Thanks for your feedback. Yatai no longer needs minikube tunnel. Based on your feedback, I think there is a problem with your local DNS resolution. Can you tell me what the following command returns?
dig +short yatai-minio-yatai-infra.192.168.49.2.sslip.io
Thank you everyone for the replies. I really appreciate the support in all this.
@yetone - The dig didn't return anything. I assume it was supposed to resolve to 127.0.0.1, so I added it to my /etc/hosts.
I did run a full dig (without short), and got the following:
; <<>> DiG 9.10.6 <<>> yatai-minio-yatai-infra.192.168.49.2.sslip.io
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52789
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;yatai-minio-yatai-infra.192.168.49.2.sslip.io. IN A
;; Query time: 49 msec
;; SERVER: 10.0.3.20#53(10.0.3.20)
;; WHEN: Mon Aug 01 06:02:47 CDT 2022
;; MSG SIZE rcvd: 74
If this was meant to run in the container itself, I can exec -it it and rerun (assuming dig is on the nginx layer, which I think is the proxy/load-balancer for all this?)
The push exhibited the same error.
@yubozhao Thanks, I'll email you. I'm quite open in time, and if we can get it working that would be greatly appreciated.
@TheDarkTrumpet I think there is something wrong with your local DNS nameserver, 10.0.3.20. Can you fix it, or use a public DNS nameserver like 8.8.8.8?
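For context, sslip.io answers a query for a hostname that embeds an IPv4 address with that same address, so the expected dig result can be read straight off the name. The sketch below extracts the embedded address and compares it with what a chosen resolver returns. The function names are hypothetical helpers, not part of Yatai or sslip.io, and only the dotted form (192.168.49.2) is handled.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Print the dotted IPv4 address embedded in an sslip.io hostname,
# or nothing if the name does not contain one.
sslip_embedded_ip() {
  printf '%s\n' "$1" |
    sed -n -E 's/^(.*[.-])?([0-9]{1,3}(\.[0-9]{1,3}){3})\.sslip\.io\.?$/\2/p'
}

# Ask a specific resolver for the name and compare the first answer
# with the embedded address. Needs dig and network access, so it is
# defined here but not called.
check_sslip_resolution() {
  host="$1"
  resolver="$2"   # for example 8.8.8.8 or 10.0.3.20
  expected=$(sslip_embedded_ip "$host")
  actual=$(dig +short "@$resolver" "$host" | head -n 1)
  if [ -n "$expected" ] && [ "$actual" = "$expected" ]; then
    echo "ok: $resolver resolves $host to $actual"
  else
    echo "mismatch: $resolver returned '$actual', expected '$expected'"
    return 1
  fi
}

# Usage:
#   check_sslip_resolution yatai-minio-yatai-infra.192.168.49.2.sslip.io 8.8.8.8
```

Querying with @resolver leaves the system DNS settings untouched, which makes it easy to compare the local nameserver against a public one.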
Hi @yetone
I did as requested and used 8.8.8.8, and dig now works. The name does resolve (please see below), but that doesn't fix the push failure (also listed below). I ran docker inspect on the minikube container, and it does show the IP address 192.168.49.2. However, it's throwing a filter issue for both that IP and the gateway IP.
❯ bentoml models push cat_toy:latest
Failed to upload model "cat_toy:xfqtpfqrbw4a3lg6"
Failed pushing model "cat_toy:xfqtpfqrbw4a3lg6" : HTTPConnectionPool(host='yatai-minio-yatai-infra.192.168.49.2.sslip.io', port=80): Max retries exceeded with url: /yatai/models/default/cat_toy/xfqtpfqrbw4a3lg6.tar.gz?AWSAccessKeyId=cbk7sdr6ulbc73cdlab0&Expires=1659407643&Signature=SgQ9aPyKY…
Uploading model "cat_toy:xfqtpfqrbw4a3lg6" 0.0% • 0.0/28.0 kB • ? • -:--:--
❯ scutil --dns | grep 'nameserver\[[0-9]*\]'
zsh: command not found: scutil
❯ cat /etc/resolv.conf
#
# macOS Notice
#
# This file is not consulted for DNS hostname resolution, address
# resolution, or the DNS query routing mechanism used by most
# processes on this system.
#
# To view the DNS configuration used by this system, use:
# scutil --dns
#
# SEE ALSO
# dns-sd(1), scutil(8)
#
# This file is automatically generated.
#
search local
nameserver 8.8.8.8
❯ dig +short yatai-minio-yatai-infra.192.168.49.2.sslip.io
192.168.49.2
We took a look at it tonight, and I really wanted to thank everyone for their help this evening. I don't take credit for any of the below, and greatly appreciate the guidance in this and hope the documentation is useful. It appears that it could be an issue with minikube (1.26.0), although some more testing is necessary. Full script to get this working (start to finish) is below:
minikube delete
minikube start --cpus 4 --memory 6096
minikube addons enable ingress
helm install yatai yatai/yatai -n yatai-system --create-namespace
Wait a while. You can run kubectl -n yatai-components get ing and, once it returns something like the below, go on to the next step:
❯ kubectl -n yatai-components get ing
NAME          CLASS   HOSTS                                           ADDRESS        PORTS   AGE
yatai-minio   nginx   yatai-minio-yatai-infra.192.168.49.2.sslip.io   192.168.49.2   80      76s
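The "wait a while" step can be scripted as a small polling loop. This is a minimal sketch: wait_for_output is a hypothetical helper, and the jsonpath in the usage comment assumes the ADDRESS column is populated from the ingress status field status.loadBalancer.ingress[0].ip.

```shell
#!/bin/sh
# Sketch only: wait_for_output is a hypothetical helper name.

# Run a command repeatedly until it prints something, then echo that output.
# Arguments: <max tries> <seconds between tries> <command...>
wait_for_output() {
  tries="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$tries" ]; do
    # A failing command is treated the same as empty output.
    out=$("$@" 2>/dev/null) || out=""
    if [ -n "$out" ]; then
      printf '%s\n' "$out"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (needs a running cluster, so it is left as a comment):
#   wait_for_output 30 5 kubectl -n yatai-components get ing yatai-minio \
#     -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```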
Save the following YAML to a file in a convenient location:
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller-lb
  namespace: ingress-nginx
spec:
  ports:
  - appProtocol: http
    name: http
    port: 80
    protocol: TCP
    targetPort: http
  - appProtocol: https
    name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
  type: LoadBalancer
Then run minikube tunnel in one terminal, and the following in another:
kubectl -n ingress-nginx apply -f YOUR_FILE.yaml
Then type:
kubectl -n yatai-components edit ing yatai-minio
Scroll down to the host: line, and change it from yatai-minio-yatai-infra.192.168.49.2.sslip.io to yatai-minio-yatai-infra.127.0.0.1.sslip.io
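The same host change could probably be made without an interactive editor. The sketch below is an assumption-laden alternative to kubectl edit: sslip_swap_ip and patch_ingress_host are hypothetical helper names, and the JSON patch path /spec/rules/0/host assumes the host sits in the first rule of the yatai-minio ingress.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Replace the IPv4 address embedded in an sslip.io hostname with another one.
# Arguments: <hostname> <new IPv4 address>
sslip_swap_ip() {
  printf '%s\n' "$1" |
    sed -E 's/[0-9]{1,3}(\.[0-9]{1,3}){3}(\.sslip\.io)$/'"$2"'\2/'
}

# Patch the ingress host in place. Assumes the host is in the first rule.
# Needs kubectl and a running cluster, so it is defined but not called.
patch_ingress_host() {
  new_host=$(sslip_swap_ip "$1" 127.0.0.1)
  kubectl -n yatai-components patch ing yatai-minio --type=json \
    -p "[{\"op\":\"replace\",\"path\":\"/spec/rules/0/host\",\"value\":\"$new_host\"}]"
}

# Usage:
#   patch_ingress_host yatai-minio-yatai-infra.192.168.49.2.sslip.io
```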
After that, run the export/echo commands from the notes/readme:
export YATAI_INITIALIZATION_TOKEN=$(kubectl get secret yatai --namespace yatai-system -o jsonpath="{.data.initialization_token}" | base64 --decode)
echo "Create admin account at: http://127.0.0.1:8080/setup?token=$YATAI_INITIALIZATION_TOKEN" && kubectl --namespace yatai-system port-forward svc/yatai 8080:80
Important to note: there are two blocking terminal processes at this point, minikube tunnel and the kubectl port-forward command.
To serve models, some similar editing has to be done. First, edit:
kubectl -n yatai-system edit cm network
And replace domain-suffix: 192.168.49.2.sslip.io with domain-suffix: 127.0.0.1.sslip.io
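As with the ingress, this ConfigMap edit might be scriptable with a merge patch. This is only a sketch: domain_suffix_patch is a hypothetical helper, and it assumes domain-suffix is a plain key directly under data in the network ConfigMap, which should be checked with kubectl -n yatai-system get cm network -o yaml first.

```shell
#!/bin/sh
# Sketch only: domain_suffix_patch is a hypothetical helper name.

# Build a JSON merge patch that sets domain-suffix to <ip>.sslip.io.
# Assumes domain-suffix is a top-level key under data.
domain_suffix_patch() {
  printf '{"data":{"domain-suffix":"%s.sslip.io"}}\n' "$1"
}

# Usage (needs kubectl and a running cluster, so it is left as a comment):
#   kubectl -n yatai-system patch cm network --type=merge \
#     -p "$(domain_suffix_patch 127.0.0.1)"
```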
Thank you again for all your help this evening. I'm unsure how you want to approach this ticket. You could close it, leave it open to look into the minikube version as a potential issue, or, if it can't be reproduced, try to debug it further. I'm happy to help however I can.
@TheDarkTrumpet Thank you for working with us and discovering a potential issue with the minikube installation.
I would like to keep this issue open until we find the source of the problem. Your detailed guide will also help the rest of the community.
This is due to a network limitation of Docker under macOS, which has been pointed out in the README and in the docs, so this issue can be closed:
Start a minikube Kubernetes cluster: minikube start --cpus 4 --memory 4096. If you are using macOS, you should use the hyperkit driver to avoid the macOS Docker Desktop network limitation.
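The README note can be folded into the start command with minikube's --driver flag. The sketch below picks the driver from the OS name reported by uname -s. minikube_driver_args is a hypothetical helper, and it assumes the hyperkit driver is installed and usable on the Mac in question.

```shell
#!/bin/sh
# Sketch only: minikube_driver_args is a hypothetical helper name.

# Print the extra minikube argument for a given OS name (as from uname -s).
# On macOS this selects hyperkit; elsewhere minikube's default is kept.
minikube_driver_args() {
  case "$1" in
    Darwin) echo "--driver=hyperkit" ;;
    *) echo "" ;;
  esac
}

# Usage (starts a cluster, so it is left as a comment):
#   minikube start $(minikube_driver_args "$(uname -s)") --cpus 4 --memory 4096
```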
Hello,
I am a bit new to Helm, and even to Kubernetes. But I want to give a presentation in a few days on how to use BentoML in an MLOps scenario, and I want to include Yatai, as I think it's a fantastic project and want to increase awareness.
I've noticed that with helm deploys the chart is often changing, as if I'm sitting on main instead of stable. The base of my question is whether there's a 'stable' version that can be installed from helm.
The steps I'm doing are incredibly consistent:
I noticed with the newest update that the whole http://yatai.127.0.0.1.sslip.io/ is now gone, with a tunnel being done differently. The docker component is finally back (that's why I pulled tonight), but now it's broken in a new way:
I don't necessarily mind more bleeding edge, but for the demo I'm hoping I can figure out something more stable. Furthermore, when people are evaluating this tool from an architecture level (which is actually my goal in all this), having something stable - even if a bit older, would be of huge help in adoption of something like this.