dart-lang / webdev

A CLI for Dart web development.
https://pub.dev/packages/webdev

Debugging scripts on HTML pages dynamically generated by another HTTP server #386

Open hoylen opened 5 years ago

hoylen commented 5 years ago

Please provide a way to debug client-side Dart scripts that are referenced by HTML pages produced by a separate HTTP server.

It seems webdev serve currently only supports debugging client-side Dart scripts referenced by static HTML files (which must also be served by the webdev serve process). That makes it suitable for single-page-applications (where the HTML comes from a static file), but not for dynamically generated HTML pages (where the HTML comes from a separately running HTTP server).

Possible options

There are a number of possible approaches. I've described three below, but there could be others.

In the following, "DHTML server" means the separately running HTTP server that dynamically generates the HTML pages.

Option 1: webdev serve proxies requests for the DHTML server

Add a feature that lets webdev serve forward every HTTP request it cannot handle to the DHTML server and return the DHTML server's response as its own. Any HTTP request it can handle (i.e. one corresponding to a file it finds in the directory, or one generated by the DDC) it serves itself, without involving the DHTML server.

In this approach, all requests from the browser are sent to webdev serve, and only some of them are forwarded to the DHTML server.

  1. The request for the HTML goes to webdev serve, which forwards the request to the DHTML server, the HTML is generated and returned to webdev serve, and webdev serve returns it to the browser.
  2. The browser parses the script tag in the HTML and finds a relative URL.
  3. The browser then sends a request for the script to webdev serve, which returns the DDC generated JavaScript.
  4. The browser also retrieves the associated resources from the webdev serve.
  5. The browser also retrieves static content from the webdev serve.

Status

This currently does not work, because webdev serve does not have a proxy feature.

This had been proposed in issue #61, but it should be reconsidered.

Pros and cons

The DHTML server does not need to change between debugging and production, which reduces the risk of introducing bugs and removes the need for it to detect the mode it is running in. The DHTML server is configured to always try to serve the files produced by webdev build, but when debugging (when those files don't exist) the browser's requests for them simply never reach the DHTML server.

Option 2: The DHTML server proxies requests for webdev serve

It must be possible for the DHTML server to identify requests that are intended for webdev serve, so it can forward them on. Meanwhile, all requests it handles itself are processed normally (including returning 404 Not Found for requests that neither it nor webdev serve handles).

In this approach, all requests from the browser are sent to the DHTML server, and only some of them are forwarded to webdev serve.

  1. The request for the HTML goes to the DHTML server, which generates and returns it.
  2. The browser parses the script tag in the HTML and finds a relative URL.
  3. The browser then sends a request for the script to the DHTML server, which forwards the request to webdev serve, which returns the DDC-generated JavaScript, and the DHTML server returns it to the browser.
  4. The browser also retrieves the associated resources from the DHTML server, which proxies the request through to webdev serve.
  5. Requests for the static content are also proxied through the DHTML server to webdev serve.

Status

This currently does not work, because the URLs for the associated resources are not documented. Implementers of the DHTML server cannot correctly implement a proxy that works with webdev serve.

I used to have this working with my DHTML server, for Dart 2.0 through to Dart 2.2. But changes with Dart 2.3 and/or webdev 2.0.5 broke it. Through trial and error, I've managed to proxy as many associated resources as I can. But there are still some URLs that don't work (e.g. the browser asks for /packages/build_web_compilers/src/dev_compiler/dart_sdk.js.map but webdev serve returns 404 Not Found), and there is some strange behaviour with paths (e.g. the Dart script is in "/scripts/foo.dart" and the browser requests "/foo.digests": that returns a 404 on webdev serve, but "/scripts/foo.digests" works).

This is being proposed in issue #223.

Pros and cons

Requires every DHTML server to implement proxying, as well as to change its behaviour between debugging and production. Extra work required for every DHTML server.

There might be issues integrating this with third-party tools (e.g. when webdev serve is launched by WebStorm).

Option 3: Support running webdev serve separately from the DHTML server

Ensure that all the content served by webdev serve works when it comes from a different source (i.e. different host and/or port) from where the HTML came from.

In this approach, there is no proxying at all.

  1. The browser requests the HTML page from the DHTML server.
  2. The browser parses the script tag and finds an absolute URL (i.e. one with an explicit host/port for webdev serve).
  3. The browser then fetches the script from webdev serve.
  4. The browser fetches the associated resources from webdev serve.
  5. The browser fetches the static content from the webdev serve.

Status

This currently does not work, because the DDC-generated JavaScript does not reference the associated content relative to where the DDC-generated JavaScript itself was loaded from. That is, the browser tries to load some of the associated content from the DHTML server rather than from webdev serve. It is probably trying to find it from the page's base URL rather than relative to where the JavaScript came from.
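
The suspected behaviour can be illustrated with Dart's own `Uri.resolve`, which applies the same RFC 3986 rules that browsers use for relative references. This is only a sketch: the helper name `resolveRelative` and the resource name `main.digests` are hypothetical, and the hosts and ports merely have the shape of the URLs discussed in this issue.

```dart
/// Resolves [reference] against [base] using the RFC 3986 rules that
/// browsers also apply to relative URLs.
/// (Hypothetical helper, for illustration only.)
Uri resolveRelative(String base, String reference) =>
    Uri.parse(base).resolve(reference);

// Where the HTML page came from (the DHTML server).
const pageUrl = 'http://localhost:8080/foo/bar.html';

// Where the generated JavaScript came from (webdev serve).
const scriptUrl = 'http://localhost:53322/scripts/main.dart.js';
```

Under these assumptions, `resolveRelative(pageUrl, 'main.digests')` yields `http://localhost:8080/foo/main.digests`, which points at the DHTML server, whereas `resolveRelative(scriptUrl, 'main.digests')` yields `http://localhost:53322/scripts/main.digests`, which points at webdev serve. A browser resolves relative URLs used by a script against the page (or its base tag), not against the script's own URL, which would explain the requests ending up at the DHTML server.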

Pros and cons

Requires the DHTML server to change its behaviour between debugging and production: the URLs for the scripts and static content must be changed.

Browser cross-site loading issues need to be addressed. This might mean the DHTML server needs to send a different set of HTTP headers when running in debugging vs production.

I prefer option 1.

jakemac53 commented 5 years ago

> It seems webdev serve currently only supports debugging client-side Dart scripts referenced by static HTML files (which must also be served by the webdev serve process). That makes it suitable for single-page-applications (where the HTML comes from a static file), but not for dynamically generated HTML pages (where the HTML comes from a separately running HTTP server).

I think something else is going on here; webdev doesn't care about HTML files explicitly at all. It injects its handlers for hot reload etc. directly into the generated JS files.

Option 1: webdev serve proxies requests for the DHTML server

I don't like the idea of adding this functionality to webdev. Maintaining some sort of proxy functionality would end up being a lot of work and a source of continual bugs and feature requests that are outside of the core competency of the package. It sounds easy at first but it gets complicated quickly.

> Need to ensure it is possible for the DHTML server to identify requests that are intended for webdev serve, so it can forward them on to it. Meanwhile, all requests itself handles are processed normally (including returning 404 Not Found for requests neither it nor webdev serve handles).

It doesn't have to have a hardcoded list of files to proxy, it could send everything to webdev and if it gets a 404 fall back on other servers. That isn't ideal from a latency perspective but it would work.

Or do the opposite and only fall back on webdev if you can't directly serve the requested resource, which might have better performance.
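
Both orderings are the same fallback chain with the backends swapped. A minimal sketch of the idea, assuming stand-in types: `Reply`, `Backend` and `firstNon404` are hypothetical names for illustration, not part of webdev or of any HTTP library.

```dart
/// Minimal stand-in for an HTTP response; not a real library type.
class Reply {
  final int status;
  final String body;
  const Reply(this.status, this.body);
}

/// A backend takes a request path and produces a reply.
typedef Backend = Future<Reply> Function(String path);

/// Tries each backend in order and returns the first reply that is not a 404.
/// If every backend answers 404, the last 404 is returned.
Future<Reply> firstNon404(String path, List<Backend> backends) async {
  var reply = const Reply(404, 'Not Found');
  for (final backend in backends) {
    reply = await backend(path);
    if (reply.status != 404) return reply;
  }
  return reply;
}
```

Passing the webdev backend first gives the "send everything to webdev, fall back on 404" variant; passing the application's own backend first gives the second variant, which only pays the extra round trip for resources the application cannot serve. For servers built on package:shelf, the `Cascade` class provides comparable behaviour.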

Option 3: Support running webdev serve separately from the DHTML server

This might work if you put a <base href="http://localhost:8080/"> tag in your html page, which could be swapped out for production to something else?

I believe we will see that and use it internally. We have some other logic around discovering the base url for the app that could potentially also be tweaked to make this work.

hoylen commented 5 years ago

> I think something else is going on here, webdev doesn't care about html files explicitly at all.

Yes, webdev doesn't care. But the browser does care where the different files come from, both when it determines the URL it uses to retrieve the related scripts/files and when it decides whether it is safe to load them.

Option 2: DHTML server proxies requests for webdev serve

> It doesn't have to have a hardcoded list of files to proxy, it could send everything to webdev and if it gets a 404 fall back on other servers

Yes, sending "all other URLs" to webdev serve avoids needing to know what to proxy. But for my particular DHTML server that is difficult to implement, because I already have a rule that handles "all other URLs".

Even if I modify it to somehow support two ways of trying to resolve "all other URLs", there still seem to be problems with webdev serve resolving dart_sdk.js.map and other URLs the browser asks for (see original comment for details). So I think there is some work still needed on the webdev side to make this option possible.

Option 3: Running webdev serve separately from the DHTML server

I did try adding a <base href="..."> to my HTML page, but it still didn't work properly.

The other problem with using a base tag is it affects everything else on the page (e.g. other links, CSS), so they cannot be relative URLs. Also, it means the HTML page cannot use base for its own purposes.

If some tweaks could be made (so changing the URL for the Dart-generated JavaScript will work) that sounds like a good solution. In theory, that is one URL to change, rather than every URL on the page.

However, I suspect cross-site scripting issues will make this harder to get working (e.g. the HTML page is coming from localhost:8080/foo/bar.html but the browser gets the scripts from a different "site" localhost:53322/scripts/main.dart.js). In debug mode, the cross-site scripting protections will need to be disabled.
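
What the browser compares here is the origin (scheme, host and port) of each URL. A small sketch using Dart's `Uri.origin`; the helper name `sameOrigin` is hypothetical, and the URLs are the ones from the example above.

```dart
/// Returns true when [a] and [b] share scheme, host and port, which is
/// how browsers define "same origin".
/// (Hypothetical helper; Uri.origin only supports http and https URLs.)
bool sameOrigin(Uri a, Uri b) => a.origin == b.origin;

final page = Uri.parse('http://localhost:8080/foo/bar.html');
final script = Uri.parse('http://localhost:53322/scripts/main.dart.js');
```

Here `sameOrigin(page, script)` is false, because the ports differ even though both hosts are `localhost`. Any request the script makes back to webdev serve is therefore cross-origin from the page's point of view.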

I think option 2 is better. If a mistake is made between switching from testing to production, the worst that could happen with option 2 is the client-side scripts can't run; but with option 3 the developer could unknowingly open up a cross-site scripting security hole. Both options require the DHTML server to behave differently between debugging and production.

Do you want me to write a simple DHTML server, so you can run it and see the problems first hand?

jakemac53 commented 5 years ago

> there still seems to be problems with webdev serve resolving dart_sdk.js.map and other URLs the browser asks for (see original comment for details)

The dart_sdk.js.map issue is because we intentionally don't include that file in the build, but it is listed as a source map in the JS since it was originally compiled with a source map (the source map for the sdk itself is not usable though today).

The 404 for the digest file sounds like a separate bug.

> I did try adding a <base href="..."> to my HTML page, but it still didn't work properly.

> The other problem with using a base tag is it affects everything else on the page (e.g. other links, CSS), so they cannot be relative URLs. Also, it means the HTML page cannot use base for its own purposes.

Ya I was mostly just wondering if that resolved your issues or not. I think that it is true the cross site scripting issues will end up causing a lot of pain anyways so it probably isn't really worth exploring that deeply, even though it would be the easiest to manage from a server perspective.

Another potential option though would be to serve all your application resources from a different, synthetic directory. For instance you could prefix your JS paths with /resources/ or /dart/ or some other path, and then use that initial uri segment as a signal that you need to proxy those requests to webdev. You would need to strip the first uri segment before forwarding it on, but that might work out.
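
A minimal sketch of that mapping, assuming a hypothetical `/dart/` prefix and a hypothetical helper name `toWebdevUri`; the webdev serve address is whatever host and port it was started on.

```dart
/// Maps a request URI under [prefix] to the matching URI on webdev serve,
/// or returns null when the request is not meant for webdev serve.
/// (Hypothetical helper, for illustration only.)
Uri? toWebdevUri(Uri request, Uri webdevBase, {String prefix = 'dart'}) {
  final segments = request.pathSegments;
  if (segments.isEmpty || segments.first != prefix) return null;
  return webdevBase.replace(
    // Strip the first URI segment before forwarding.
    pathSegments: segments.skip(1),
    query: request.hasQuery ? request.query : null,
  );
}
```

With `webdevBase` set to `http://localhost:53322/`, a request for `/dart/scripts/main.dart.js` maps to `http://localhost:53322/scripts/main.dart.js`, while a request for `/foo/bar.html` returns null and stays with the application server.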

zTrix commented 5 years ago

@jakemac53 actually this is a very common requirement in modern web development.

For web developers (especially those developing dynamic web apps that load data through a RESTful API), it's a very common scenario that a frontend developer uses webpack serve to get hot module reload and proxies all RESTful API requests to some sort of testing/staging API server (which is developed by another API developer).

Without an upstream proxy feature, the frontend developer will face the issue that the browser forbids XMLHttpRequest calls to another domain if the Access-Control-Allow-Origin header is not set.

I think this issue should be reconsidered carefully, because it can affect development efficiency very badly.

Examples:

    const proxy = require('http-proxy-middleware');

    module.exports = function(app) {
      // Forward every request under /api to the API server on port 5000.
      app.use(proxy('/api', { target: 'http://localhost:5000/' }));
    };

jakemac53 commented 5 years ago

Those are also good examples of exactly why I don't think we should add support into webdev though - there are a lot of random features included in that (proxying certain paths, rewriting urls, etc etc).

I think the right solution is some sort of mechanism for plugging in generic server middleware. Figuring out how to plug that in is complicated though.

hoylen commented 5 years ago

I've created an example in https://github.com/hoylen/webdev_proxy. It has a DHTML server that proxies to webdev serve all HTTP GET requests for resources it doesn't handle.

In production mode (serving files from the build directory created by webdev build) it works correctly. The one exception is that the client-side script's .js.map file does not exist, so the example responds with an HTTP 404 Not Found. Chrome ignores it, but other browsers log it in their console.

In debug mode (proxying HTTP GET requests to webdev serve) the client-side script produces an error, even though every HTTP GET request is proxied through. webdev serve returns HTTP 404 Not Found (which is passed back to the browser) for several *.js.map files and for _/examplescript.digests (even though the script is actually in a subdirectory, _/scripts/examplescript.dart). See Chrome's console for the exact error (which comes from the client.js file).

It is interesting to note that swapping the use of the DartDevCompiler around makes it work: serving files from a build directory created with "webdev build -r", and proxying requests to "webdev serve -r". But obviously running it like that is not useful.

Thiltal commented 5 years ago

Until the last update, I simply proxied with the Node.js http-proxy-middleware. That tool cannot handle SSE, so I tried to solve the issue with Dart's shelf_proxy and hit the same big problem: the main method does not run, because of hot reload. I would prefer a way to completely disable hot reload, so that it does not break the current environment.

    import 'package:shelf/shelf.dart' as shelf;
    import 'package:shelf/shelf_io.dart' as shelf_io;
    import 'package:shelf_proxy/shelf_proxy.dart';

    void main() {
      // The upstream server that all requests are forwarded to.
      final uri = Uri.parse('http://localhost:4201/');
      final shelf.Handler shelfProxyHandler = proxyHandler(uri);
      const pathToRewrite = 'game';
      shelf_io.serve((serverRequest) {
        final requestUrl = uri.resolve(serverRequest.url.toString());
        if (requestUrl.pathSegments.isNotEmpty) {
          if (requestUrl.pathSegments.contains('\$sseHandler')) {
            // We have really big issue here
          }
          if (requestUrl.pathSegments.first == pathToRewrite) {
            // Move "game" into the handler path, so the proxied URL no
            // longer contains that first segment.
            return shelfProxyHandler(serverRequest.change(path: pathToRewrite));
          }
        }
        return shelfProxyHandler(serverRequest);
      }, 'localhost', 8090).then((server) {
        print('Proxying at http://${server.address.host}:${server.port}');
      });
    }

jimmyshiau commented 5 years ago

@Thiltal My workaround is to override the content of client.js, replacing /$sseHandler? with http://localhost:$_webdevPort/$sseHandler?

This makes the browser send its Server-Sent Events requests directly to the DDC server.

    import 'dart:convert';
    import 'dart:io';

    Future _handleDDC(HttpConnect connect) async {
      // ....
      if (path.endsWith('webdev/src/serve/injected/client.js')
          && !path.endsWith('client.js.errors')) {
        final resp = connect.response;
        final request = await HttpClient()
            .getUrl(Uri.parse('http://localhost:$_webdevPort$path'));
        final response = await request.close();

        // Read the whole body first: replacing chunk by chunk would keep only
        // the last chunk and could miss a match split across two chunks.
        final body = await response.transform(utf8.decoder).join();

        // Override the sseHandler URL so Server-Sent Events go to the DDC server.
        final result = body.replaceAll(
            '/\$sseHandler?', 'http://localhost:$_webdevPort/\$sseHandler?');

        resp.headers.add(HttpHeaders.contentTypeHeader,
            'application/javascript;charset=utf-8');
        resp.write(result);
        return resp.close();
      }
      // ....
    }

jakemac53 commented 5 years ago

Interesting that the SSE requests aren't handled by the proxies - the main reason we use that over web sockets is because it tends to play nicer with proxy servers.

I wonder what is specifically blocking things from working there. It's essentially just normal GET/POST requests with the keep-alive header (maybe that isn't getting forwarded properly).

fsw commented 5 years ago

@jakemac53 in my case, my trivial proxy was indeed not supporting the keep-alive header but I think it is not just a matter of forwarding the header but actually keeping the connection alive?

Also, I might be missing something here, but it seems that the ajax requests made by client.js use relative paths, while the EventSource call uses an absolute path to "/$sseHandler", which is a bit inconsistent and makes it hard to work with HTML's base tag.

jakemac53 commented 5 years ago

> I think it is not just a matter of forwarding the header but actually keeping the connection alive?

As long as it also supports the text/event-stream content type, I think it will work? It does need to actually stream that data back to the client as it gets it, though.
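
A rough sketch of what streaming the data back could look like in a proxy written with dart:io. It is not taken from webdev: the function name `startStreamingProxy` is hypothetical, only GET requests are handled, and only the Accept and Content-Type headers are passed through.

```dart
import 'dart:io';

/// Starts a proxy that forwards GET requests to [upstream] and streams each
/// response back chunk by chunk, as Server-Sent Events require.
/// (Hypothetical sketch; a real proxy would also forward other methods,
/// request bodies and the remaining headers.)
Future<HttpServer> startStreamingProxy(Uri upstream, {int port = 0}) async {
  final client = HttpClient();
  final server = await HttpServer.bind(InternetAddress.loopbackIPv4, port);
  server.listen((HttpRequest request) async {
    final response = request.response;
    try {
      final target = upstream.resolve(request.uri.toString());
      final upstreamRequest = await client.getUrl(target);
      // Pass the Accept header through, so the upstream server can see that
      // the client asked for text/event-stream.
      final accept = request.headers.value(HttpHeaders.acceptHeader);
      if (accept != null) {
        upstreamRequest.headers.set(HttpHeaders.acceptHeader, accept);
      }
      final upstreamResponse = await upstreamRequest.close();

      response.statusCode = upstreamResponse.statusCode;
      final contentType = upstreamResponse.headers.contentType;
      if (contentType != null) response.headers.contentType = contentType;
      // Send every chunk as soon as it arrives instead of buffering it.
      response.bufferOutput = false;
      await upstreamResponse.pipe(response);
    } catch (_) {
      // Upstream unreachable, or a connection dropped mid-stream.
      try {
        // Throws if the headers have already been sent.
        response.statusCode = HttpStatus.badGateway;
        await response.close();
      } catch (_) {
        // Nothing more can be sent to this client.
      }
    }
  });
  return server;
}
```

The point of the sketch is the last two steps of the happy path: the proxy does not read the upstream body to completion before answering, it pipes the open response through with output buffering switched off, so each event reaches the browser when the upstream server emits it.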