run-ai / docs


(Chromium) Run:ai binary download documentation page misses the part that allows download #202

Open wvxvw opened 1 year ago

wvxvw commented 1 year ago

This page: https://docs.run.ai/admin/researcher-setup/cli-install/#install-runai-cli reads:

However, when pressing Researcher Command Line Interface, the user is simply redirected to the documentation page again.

Other users reported that when using Google Chrome or Brave they are actually able to reach the download form, so it may be a compatibility issue. The users who succeed in reaching the download page also show a screenshot in which the drop-down looks different (the Researcher Command Line Interface item doesn't have an "arrow poking out of the box" icon next to it).

wvxvw commented 1 year ago

NB. The same happens in Firefox.

```
❯ firefox --version
Mozilla Firefox 110.0
❯ chromium --version
Chromium 110.0.5481.100 Arch Linux
```

yarongol commented 1 year ago

Let me describe the expected behavior. It may not be ideal and may require some changes/explanation, but it may clarify the above:

So,

Let me know

wvxvw commented 1 year ago

> As such, the control plane needs a URL to the cluster to access a Kubernetes service

Thanks for quick reply! However, this part isn't clear to me. Why does it need this URL? Everything is happening inside the cluster, why does it need an external URL to the cluster?

Or do you mean the runai program isn't included in any of the images and needs to be downloaded separately from them? If so, why does it need its own URL rather than the URL of the site it needs to be downloaded from?

Or is this a problem with identifying which version of Run:ai is being used? Then why not include this information as some kind of Kubernetes object, like an annotation or a ConfigMap, so that no trip outside and back into the cluster's network would be necessary?

Or, is it expected that users (both admins and researchers?) will use the runai program from a network outside the cluster? I don't even know if in my case that would be possible... I'd have to check which parts of the Kubernetes API (if any at all) are exposed to the external world...

Finally, it would've been so much easier if it didn't require a URL with a domain name, but would accept an IP, as that information is a lot easier to manipulate and to keep consistent across the different network boundaries these requests are meant to cross. Is there a reason it has to be a domain name?

> You can see the URL by going to the Clusters view and seeing the URL under the cluster on the RHS.

This is what it looks like for me:

| Cluster Name | Status | Cluster UUID | Created | URL | Version |
|---|---|---|---|---|---|
| c1 | Disconnected | 56773d81-1237-4c9d-9edc-e7765d664f5c | 11/3/2022, 7:04 PM | | |
| run:ai cluster | Connected | 48022b64-73b9-462b-afdd-49e402017aa2 | 11/20/2022, 5:07 PM | https://ose-b92-u2004-02-23.tld | 2.8.8 |

I don't know what c1 or run:ai cluster are. I'd have to ask the other guy who wrote the deployment for this to better understand what happened here. When I switch to the run:ai cluster, then the steps described in documentation work.

> There is a way to circumvent the dialog box

Thanks for this. Much appreciated. This would seem to me like a much easier way of obtaining the program. Why not make this the default? Also, does it mean that my earlier question about the program being distributed with one of the images is answered in the affirmative?

I still need to understand who the target audience for this program is and how they are supposed to interact with the service, so I'm sorry if I'm missing something obvious.

yarongol commented 1 year ago

"Why does it need this URL?" - This is part of our current architecture. The URL needs to be accessed from outside the cluster, but not from outside the company, so it can (and should) be protected by a firewall. This access allows us to send commands from the UI to (for example) submit jobs and also to download the binary.

"Or do you mean runai program isn't included in any of the images" - It's not a separate image. As far as I know, it's embedded in the researcher-service image and accessed by running wget against the researcher-service Kubernetes service.

"it would've been so much easier if it didn't require a URL with domain name" - It needs to be a domain with a valid certificate in order for the UI to access it safely.

"I don't know what c1 or run:ai cluster" - these both look like cluster objects.

Finally, we are considering changing this architecture, which has not proved to be super successful, but it will take time.

There are two ways to submit jobs: the command-line interface, which is the subject of the above discussion, and the user interface. As we progressed, the UI became more prominent and the CLI is sort of taking a back seat.

If what I wrote is not clear enough, you may consider pinging us via support channels https://docs.run.ai/#how-to-get-support

wvxvw commented 1 year ago

Hi again. I think that, due to how we set up Kubernetes, and how, in general, we expect users to interact with it, both the documented way of obtaining the CLI client and the troubleshooting advice prove very difficult to make work (i.e. in our deployment we cannot do port forwarding in the way the troubleshooting advice suggests).

In the end, and for anyone who's interested, this is what I ended up doing:

```sh
#!/bin/sh

set -xe

# Digest of the researcher-service image manifest as stored by containerd.
SHA=$(ctr -n k8s.io image ls -q | grep researcher-service@sha | cut -d: -f2)
# Where containerd keeps content-addressed blobs (manifests, configs, layers).
BLOBS=/var/lib/containerd/io.containerd.content.v1.content/blobs/sha256
# Path of the CLI binary inside the image layer.
CLIENT=runai-cli/runai-cli-linux-amd64

# Walk the layer digests listed in the manifest (tail +2 skips the first
# match, which is the config digest), find the layer that contains the
# CLI binary, and extract it into the current directory.
for layer in $(grep -e digest "$BLOBS/$SHA" | tail +2 | cut -d: -f3 | tr -d '"') ; do
  if [ "$(tar -tvf "$BLOBS/$layer" | grep "$CLIENT")" != "" ] ; then
    tar --strip-components=1 -zxf "$BLOBS/$layer" "$CLIENT"
    break
  fi
done
```
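
To make the layer-selection step easier to follow, here is the same grep/cut/tr pipeline from the loop above run in isolation against a small mock manifest. The digests below are fabricated placeholders, and real containerd manifests on disk may be formatted differently; `tail -n +2` is the portable spelling of the script's `tail +2`.

```shell
# Mock of a pretty-printed image manifest; all digests are fabricated.
cat > /tmp/mock-manifest.json <<'EOF'
{
  "config": {
    "digest": "sha256:aaa111"
  },
  "layers": [
    {
      "digest": "sha256:bbb222"
    },
    {
      "digest": "sha256:ccc333"
    }
  ]
}
EOF

# Same pipeline as in the script: keep lines mentioning "digest", drop the
# first match (the config digest), take the field after the second colon
# (the hex part of each layer digest), and strip the quotes.
grep -e digest /tmp/mock-manifest.json | tail -n +2 | cut -d: -f3 | tr -d '"'
# prints the two layer digests, bbb222 and ccc333
```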

It will probably work in a similar way with docker instead of ctr if Docker is used as the Kubernetes container runtime, but you will need to find the location where images are stored.


I will also try to explain the source of my confusion, so that, hopefully, it gives the developers a better picture of how users might expect the product to work, while at the same time alerting users who come from the same perspective I do.

In our setup, we expect that when users run distributed workloads, they are working from the same network in which Kubernetes is deployed. The reason for this is that we expect users to have their data close to their execution environment (which wouldn't be possible if our users stored their data, e.g., on their laptops, but wanted to utilize it in a Kubernetes cluster deployed elsewhere in a datacenter). So, in our model, users only need VPN or SSH access to their resources in the datacenter. Consequently, we provide an HTTP interface to our users to, e.g., run Jupyter notebooks, but users aren't expected to go through the public Internet to reach that service (i.e. they would either use VPN or SSH + port forwarding). Some system administrators might set up a public interface to their Kubernetes cluster, but it is by no means a requirement.

This, in turn, means that something like "Cluster URL" doesn't always make sense in our deployment, as Kubernetes clusters aren't necessarily (and, typically, are not) accessible from the Internet.