dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0

Dagit doesn't work when hosted on a path #2073

Closed zzztimbo closed 4 years ago

zzztimbo commented 4 years ago

I've set up a k8s LoadBalancer service for my dagit deployment.

When I run curl https://k8s.foo.com/tim/dagster and curl http://<load-balancer-ip>/, I get the exact same content.

In the browser, however, I see a white page for the former URL and a working dagit for the latter.

alangenfeld commented 4 years ago

I wonder if we have problems with being hosted on urls with paths, is it possible for you to try dagit-test.foo.com?

zzztimbo commented 4 years ago

I added an entry in my /etc/hosts file for dagit-test.foo.com pointing to the load balancer IP, and it works fine.

alangenfeld commented 4 years ago

Thanks for the report!

bengotow commented 4 years ago

Hey folks! Yep this is correct—you cannot attach the Dagit web service to an arbitrary path prefix. The app doesn't have any way of knowing that the path prefix is meant to be ignored and just thinks that it does not match any path that is part of the application. (That blank screen is essentially a 404, though the web app doesn't render those errors currently!)

If you really want to run dagit from https://k8s.foo.com/tim/dagster, you need to configure Nginx (or whatever service is managing the routing in this case) using something like proxy_pass to forward requests to the Dagit service + port and re-write the requests so dagit believes the prefix was never present.
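As a rough illustration of the rewrite described above, the sketch below models what the proxy has to do to each request path before forwarding it. The helper name strip_prefix and the example paths are hypothetical; a real deployment would do this in the Nginx or ingress configuration, not in Python.

```python
def strip_prefix(path: str, prefix: str) -> str:
    """Return the path the backend should see once `prefix` is removed.

    Hypothetical helper: it only models the proxy's rewrite step.
    """
    prefix = prefix.rstrip("/")
    if path == prefix:
        # A request for the prefix itself maps to the application root.
        return "/"
    if path.startswith(prefix + "/"):
        return path[len(prefix):]
    # Paths outside the prefix are not meant for this backend.
    raise ValueError(f"{path!r} is not under {prefix!r}")


print(strip_prefix("/tim/dagster", "/tim/dagster"))           # /
print(strip_prefix("/tim/dagster/graphql", "/tim/dagster/"))  # /graphql
```

Note that this only covers incoming request paths. If the returned HTML references root-relative assets, the browser will still request them without the prefix, which would be consistent with a blank page.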

Hosting it on a subdomain is definitely easier if that's also alright 🙏

zzztimbo commented 4 years ago

Hey @bengotow

I believe I'm doing the correct thing with my k8s ingress:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: dagster-ingress
  namespace: tim
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/enable-cors: "true"
spec:
  rules:
  - http:
      paths:
      - path: /tim/dagster/
        backend:
          serviceName: dagster-lb-service
          servicePort: 80

I'm still seeing the blank page.

muthugit commented 4 years ago

Hi, is there any option to run the dagit UI with a custom base_url instead of just /?

I want to run dagit at a URL like http://localhost:3000/engine

bollwyvl commented 4 years ago

Would very much like to see this! Another use case is when proxied inside of a larger application, where part of the URL is used for another purpose, e.g. the username on JupyterHub.

bengotow commented 4 years ago

Hey folks! Just a heads up that I'm working on this today and it looks like it will be possible to mount Dagit on a path, we just need to pass configuration between the python process and the React web app. Will keep you posted 🙏

bengotow commented 4 years ago

Hey folks—I've landed support for this via a new --path-prefix=/dagster option you can pass to the dagit process and it'll ship in 0.8.7. Stay tuned!

schrockn commented 4 years ago

@bollwyvl would love to see the dagit in JupyterHub case if you are able to get to it. Sounds really awesome

bollwyvl commented 4 years ago

Welp, bit of a snag there... The recent hard grpcio pin is giving me some trouble. But as soon as that is cleared up! The first demo will be a quick jupyter-server-proxy, and an iframe: click launcher button, get dagit. Will probably look nice with the dask extension.

What I really want to demo is pieces of the dagit UI embedded properly inside jupyterlab, next to the code of interest... And doing all that right will be... Involved. But thank goodness for typescript. And in the backend, since flask isn't on the async loop, there will be... Complications. But if it works...

bollwyvl commented 4 years ago

Ha, that took me a sec! Congrats on 0.9.1: here's the first crack at getting everything running on binder (gist), as a reference example of running in JupyterHub, with everything (kernel, database, dask, dagit) all running on one node: a real hub deploy would move some of those into services, probably.

Sometimes the iframes do some frame-y stuff that doesn't respect the --path-prefix (sorry, don't remember what). It's been a while since i looked at all the config settings, so I'll have to relearn those. But things are looking very promising!

bollwyvl commented 3 years ago

Happy to report that everything looks swell with 0.11.1:

[Binder link]

I've yet to fully work up how to more cleanly integrate Dagster into a one-click service for JupyterHub, but it's looking quite promising.

RMHogervorst commented 2 years ago

I think there are still some issues: it seems the assets on the main page in dagit are on an absolute path.

jonasdebeukelaer commented 2 years ago

Hey also trying to host on a path here. I have nginx proxy_passing monitor.dev-1.company.tech/dagster/ to the dagit service.

I've deployed with helm, but the comments seem to suggest hosting on a path is still not possible?

values.yaml snippet:

dagit:
    # Ingress hostname for Dagit e.g. dagit.mycompany.com
    # This variable allows customizing the route pattern in the ingress. Some
    # ingress controllers may only support "/", whereas some may need "/*".
    # NOTE: Dagit doesn't yet support hosting on a path, e.g. mycompany.com/dagit.
    path: '/dagster/*'
    # NOTE: do NOT keep trailing slash. For root configuration, set as empty string
    # See: https://github.com/dagster-io/dagster/issues/2073
    host: 'monitor.dev-1.company.tech'

and when trying to load https://monitor.dev-1.company.tech/dagster/ I just get a blank page with a bunch of 404s and:

<noscript>You need to enable JavaScript to run this app.</noscript>

edit: nginx config snippet

upstream dagster {
    server dagster-dagit.dagster:80;
    keepalive 15;
}

server {
    listen 4181;
    absolute_redirect off;

    location /dagster/ {
        include snippets/proxy_auth.conf;
        proxy_pass http://dagster/;
    }
}

Is this setup right?

jonasfd commented 2 years ago

Hello, I'm trying to run dagit inside AWS SageMaker Studio, which is essentially an instance of JupyterLab hosted in AWS. To access the dagit server running inside the JupyterLab instance, I'm trying to use the same approach we use for other web-based tools such as TensorBoard. However, when I try to access the dagit web interface through the JupyterLab proxy server, I get the white page.

I tried three approaches to make it work:

  1. Simple execution (dagit -f cereal.py --log-level debug): when I go to <my-jupyerlab-base-address>/jupyter/default/proxy/3000/, I get the white page even though the log shows: INFO: 127.0.0.1:36836 - "GET / HTTP/1.1" 200 OK

  2. Using the prefix (dagit -f cereal.py --log-level debug --path-prefix /dagit): when I go to <my-jupyerlab-base-address>/jupyter/default/proxy/3000/dagit, the server redirects me to <my-jupyerlab-base-address>/dagit/ and I get an "Unsupported URL path" message on my screen. On the server side, the log has: INFO: 127.0.0.1:54204 - "GET /dagit HTTP/1.1" 307 Temporary Redirect

  3. Trying a kludge with the prefix (dagit -f cereal.py --log-level debug --path-prefix jupyter/default/proxy/3000/dagit): for this one, whatever I try to access, I get a Not Found message and logs like this:

    INFO:     127.0.0.1:55230 - "GET /dagit HTTP/1.1" 404 Not Found
    INFO:     127.0.0.1:46498 - "GET /dagit/workspace HTTP/1.1" 404 Not Found

    This is probably because of how the proxy is redirecting the requests to the server, which makes sense.

Has anyone ever succeeded in running dagit from inside JupyterLab or behind another similar proxy? Thanks!
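The redirect seen in approach 2 can be reproduced with plain URL resolution. This is only a sketch: the host name is a made-up stand-in for <my-jupyerlab-base-address>, and the root-relative Location value is an assumption inferred from where the browser ended up, not something read from the actual response headers.

```python
from urllib.parse import urljoin

# Hypothetical address standing in for <my-jupyerlab-base-address>.
BASE = "https://studio.example.com"

# The page the browser requested through the proxy in approach 2.
requested = BASE + "/jupyter/default/proxy/3000/dagit"

# Assumed Location header of the 307: a root-relative path that only
# contains the prefix dagit itself knows about.
location = "/dagit/"

# A root-relative Location is resolved against the host, so the proxy
# part of the path (/jupyter/default/proxy/3000) is dropped.
redirected = urljoin(requested, location)
print(redirected)  # https://studio.example.com/dagit/
```

Under that assumption, the redirect target sits outside the proxy route, which would explain the "Unsupported URL path" message.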

MarcSkovMadsen commented 1 year ago

I would also really like to be able to launch dagster from the command line in my JupyterHub with jupyter-server-proxy installed.

I have tried the below without success:

dagit -l /mt-fumo/user/masma/proxy/3000

When I navigate to https://xyz.mydomain.com/mt-fumo/user/masma/proxy/3000/, it redirects me to https://xyz.mydomain.com/mt-fumo/user/masma/proxy/3000 and I see

[screenshot not preserved]

dcarrillo-eog commented 1 year ago

Has anyone found a solution to implement this in the dagster helm chart?

airlangga-gunawan-faculty commented 10 months ago

Hi, is anyone working on implementing base path support? It's blocking us from using Dagster on SageMaker as well.

bvallier commented 10 months ago

Feel like this needs to be re-opened... the app is served on the path prefix, but the assets (js, css) are still pinned to the root of the app... @alangenfeld ?

alangenfeld commented 10 months ago

Running the latest version locally with dagster-webserver --path-prefix /foo works for me. I see JS/CSS files served from /foo/. Can you provide more context on what you are running and what you are observing?

For the jupyter-server-proxy related questions, I believe this setting is what's required to get things working: https://jupyter-server-proxy.readthedocs.io/en/latest/server-process.html#absolute-url
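A minimal sketch of what such a server-process entry might look like, assuming the absolute_url option from the linked docs is combined with --path-prefix. The helper name, the "dagster" key, and the example prefix are hypothetical, and this has not been tested against a real hub; the prefix must match the full path under which the proxy actually exposes the process.

```python
def dagster_server_process(path_prefix: str) -> dict:
    """Build a jupyter-server-proxy server-process entry (sketch).

    `path_prefix` is a placeholder for the full URL path under which the
    proxy exposes the process; it has to be filled in per deployment.
    """
    return {
        # {port} is substituted by jupyter-server-proxy at launch time.
        "command": [
            "dagster-webserver",
            "--port", "{port}",
            "--path-prefix", path_prefix,
        ],
        # Forward the request path unchanged instead of rewriting it to "/".
        "absolute_url": True,
    }


# In jupyter_server_config.py this would be assigned roughly as:
#   c.ServerProxy.servers = {"dagster": dagster_server_process("/user/alice/dagster")}
entry = dagster_server_process("/user/alice/dagster")
print(entry["command"])
```

The idea is that the webserver then receives the same prefixed paths the browser uses, so nothing needs to be stripped or re-added along the way.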

bvallier commented 10 months ago

Thanks. I'm running in prod and deploying to a shared path via ECR with the latest dagster-webserver package (1.5.8) and dagster libraries. Here's the page source when I visit; it doesn't seem like pathPrefix is being populated here?

<!doctype html>
<html lang="en">
    <head>
        <meta charset="utf-8"/>
        <meta name="viewport" content="width=device-width,initial-scale=1,shrink-to-fit=no"/>
        <meta name="theme-color" content="#000000"/>
        <script type="application/json" id="initialization-data">
            {
                "pathPrefix": "",
                "telemetryEnabled": false
            }</script>
        <script nonce="deed3c9c8ddc46db9e132619903bbaab">
            __webpack_nonce__ = "deed3c9c8ddc46db9e132619903bbaab"
        </script>
        <link rel="manifest" href="/manifest.json" crossorigin="use-credentials"/>
        <link rel="icon" type="image/png" href="/favicon.png"/>
        <link rel="icon" type="image/svg+xml" href="/favicon.svg"/>
        <title>Dagit</title>
        <script defer="defer" src="/static/js/main.a2ed00b7.js" nonce="deed3c9c8ddc46db9e132619903bbaab"></script>
        <link href="/static/css/main.cc1499ae.css" rel="stylesheet">
    </head>
    <body>
        <noscript>You need to enable JavaScript to run this app.</noscript>
        <div id="root"></div>
    </body>
</html>

I'm getting a "redirected too many times" notice. This is the CMD I'm issuing in the Dockerfile and Service Worker: CMD ["dagster-webserver", "-h", "0.0.0.0", "--path-prefix", "/prod/emea/data-dagster/app"]
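One way to check which prefix a given response was rendered with is to read the initialization-data script from the page source, as in the HTML above. The sketch below uses only the standard library; the class and function names are made up for illustration, and the sample string is a shortened copy of the relevant tag.

```python
import json
from html.parser import HTMLParser


class InitDataParser(HTMLParser):
    """Collect the text inside <script id="initialization-data">."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "initialization-data":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def served_path_prefix(page_source: str) -> str:
    """Return the pathPrefix value embedded in the page source."""
    parser = InitDataParser()
    parser.feed(page_source)
    return json.loads("".join(parser.chunks))["pathPrefix"]


sample = """<script type="application/json" id="initialization-data">
{"pathPrefix": "", "telemetryEnabled": false}</script>"""
print(repr(served_path_prefix(sample)))  # ''
```

An empty value here, while the container log reports a prefix, would suggest the response came from a process started without the flag.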

alangenfeld commented 10 months ago

How exactly is your container being run? Is it possible something is overriding CMD? Have you tried ENTRYPOINT? When I run dagster-webserver --path-prefix /path/prefix/example -h 0.0.0.0 I see

2023-11-15 15:58:22 -0600 - dagster-webserver - INFO - Serving dagster-webserver on http://0.0.0.0:3000/path/prefix/example in process 70482

in the logs.

What do you see in your container logs?

bvallier commented 10 months ago

We are deploying to a Kubernetes cluster.

So I'm getting this: 2023-11-15 21:36:49 +0000 - dagster-webserver - INFO - Serving dagster-webserver on http://0.0.0.0:3000/prod/emea/data-dagster/app in process 1 (same as you).

Can try ENTRYPOINT and see if that gets me somewhere...

alangenfeld commented 10 months ago

Given that the log output indicates it's receiving the path prefix argument, the next thing I would check is that you are routing to the expected container from the ingress and not hitting some other dagster-webserver process that was launched without the path prefix argument.

bvallier commented 10 months ago

thanks so much Alex - I feel like I'll need to talk to a network admin to get an ingress controller installed at the cluster level, because I can't seem to insert the prefix on those assets using just --path-prefix in the docker container. The endpoint is shared so it needs to be a predefined url path for my repo.

bvallier commented 10 months ago

I think it has something to do with how the traffic is routed once it hits the endpoint. The prefix isn't getting marshalled along...

This is without the prefix arg:

Host: apps.factory.xxx.aws-emea.xxx.com
Origin: https://apps.factory.xxx.aws-emea.xxx.com
Referer: https://apps.factory.xxx.aws-emea.xxx.com/prod/emea/data-dagster/app

bvallier commented 10 months ago

I have this working, by the way; it was a configuration error on my end. All is good here. Thanks for the support!

saybrian commented 5 months ago

for anyone else wanting to host at "/dagster" and deploying via the helm chart, these 3 properties all need to be set: