CoolSpring8 / rwppa

Exposes Zhejiang University's web-based RVPN as a local HTTP proxy - (ZJU) RVPN Web Portal Proxy Adapter (CGI Proxy to HTTP Proxy)
GNU General Public License v3.0

problems about WebTextHandler #2

Open CoolSpring8 opened 4 years ago

CoolSpring8 commented 4 years ago

Currently we have this in internal/proxy/proxy.go:

// WebTextHandler fixes links in page and "src" issues in javascript files.
// This solution, however, may prevent content streaming. Fix it?
func WebTextHandler() goproxy.RespHandler {
    return goproxy_html.HandleString(
        func(s string, ctx *goproxy.ProxyCtx) string {
            c := RVPNURLMatcher.ReplaceAllString(s, "$1://")
            rawURLWithPort := ctx.UserData.(reqData).rawURLWithPort
            rawURLWithoutPort := ctx.UserData.(reqData).rawURLWithoutPort
            c = strings.ReplaceAll(c, rawURLWithPort[:strings.LastIndex(rawURLWithPort, "/")+1], "") // possible out of bounds?
            c = strings.ReplaceAll(c, rawURLWithoutPort[:strings.LastIndex(rawURLWithoutPort, "/")+1], "")
            return c
        })
}

As the comments indicate, there are several problems.

  1. The "src" issue in JavaScript files is not completely resolved. Scripts are still corrupted on some sites.
  2. A possible out-of-bounds access when slicing the string up to and including the last "/".
  3. Response streaming. For now, the whole body is apparently read with "ReadAll" first, then processed, and only then sent to the browser.
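A note on item 2, with a small sketch. `strings.LastIndex` returns -1 when the separator is absent, so the slice bound becomes 0 and the expression yields an empty string rather than panicking. The unchecked `ctx.UserData.(reqData)` assertion looks like the more likely place for a panic. The helpers below only make the no-slash case explicit; `dirPrefix` and `stripPrefix` are hypothetical names, not part of rwppa.

```go
package main

import "strings"

// dirPrefix returns rawURL up to and including its last "/".
// It returns "" when rawURL contains no "/" at all.
// (Hypothetical helper for illustration.)
func dirPrefix(rawURL string) string {
	i := strings.LastIndex(rawURL, "/")
	if i < 0 {
		return ""
	}
	return rawURL[:i+1]
}

// stripPrefix removes every occurrence of prefix from body.
// An empty prefix leaves body untouched.
func stripPrefix(body, prefix string) string {
	if prefix == "" {
		return body
	}
	return strings.ReplaceAll(body, prefix, "")
}
```

One edge case may be worth guarding against: for a URL without a path, such as the made-up `http://example.com`, the last "/" belongs to the scheme separator, so the prefix becomes `http://` and would be stripped from every absolute link in the page.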

Solutions and plans:

And one more small thing: CSS files can probably be excluded from this handler.
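Excluding CSS could come down to a check on the response's Content-Type before the handler runs. A minimal sketch using only the standard library follows; `shouldRewrite` is a hypothetical helper, and the list of media types worth rewriting is an assumption that would need checking against real sites.

```go
package main

import (
	"mime"
	"strings"
)

// shouldRewrite reports whether a response with the given Content-Type
// header value looks like HTML or JavaScript.
// (Hypothetical helper, not part of rwppa.)
func shouldRewrite(contentType string) bool {
	// ParseMediaType lowercases the media type and drops parameters
	// such as "; charset=utf-8".
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false // missing or malformed header: leave the body alone
	}
	switch {
	case mediaType == "text/html":
		return true
	case strings.HasSuffix(mediaType, "/javascript"):
		return true // text/javascript, application/javascript
	default:
		return false // text/css and everything else
	}
}
```

The handler would then be registered only for responses where this returns true, so stylesheets pass through unmodified.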

CoolSpring8 commented 4 years ago

For 3: Well, I'm not sure whether browsers really start rendering an element before it has finished downloading.

The current behavior is that "Waiting (TTFB)" takes a long time, and "Content Download" phase finishes like lightning (as rwppa runs locally).

Since introducing streaming text replacement would add overhead and require more work, it would be better to know first whether it has a perceivable impact on user experience.

An extra bonus would be better stability and lower memory usage.
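To get a feel for the extra work, here is a rough sketch of streaming replacement for a single literal string. It wraps an `io.Reader` and holds back the last `len(old)-1` bytes of each chunk, so that a match split across two reads is still found. All names (`replacingReader`, `newReplacingReader`, `rewriteChunks`) are hypothetical. The regular-expression step done by `RVPNURLMatcher` is not covered and would likely need a different approach.

```go
package main

import (
	"bytes"
	"io"
	"strings"
)

// replacingReader replaces every occurrence of old with repl while data
// flows through, without buffering the whole body. (Hypothetical type.)
type replacingReader struct {
	src     io.Reader
	old     []byte
	repl    []byte
	pending []byte // read from src, not yet released
	out     []byte // rewritten bytes ready for the caller
	err     error  // sticky error from src, io.EOF included
}

func newReplacingReader(src io.Reader, old, repl string) *replacingReader {
	return &replacingReader{src: src, old: []byte(old), repl: []byte(repl)}
}

func (r *replacingReader) Read(p []byte) (int, error) {
	// Pull chunks from src until there is output or src is done.
	for len(r.out) == 0 && r.err == nil {
		buf := make([]byte, 4096)
		n, err := r.src.Read(buf)
		r.pending = append(r.pending, buf[:n]...)
		r.err = err
		r.process(err != nil)
	}
	if len(r.out) == 0 {
		return 0, r.err
	}
	n := copy(p, r.out)
	r.out = r.out[n:]
	return n, nil
}

// process rewrites complete matches in pending and releases everything
// except a tail that could be the start of a match split across chunks.
func (r *replacingReader) process(final bool) {
	for len(r.old) > 0 {
		i := bytes.Index(r.pending, r.old)
		if i < 0 {
			break
		}
		r.out = append(r.out, r.pending[:i]...)
		r.out = append(r.out, r.repl...)
		r.pending = r.pending[i+len(r.old):]
	}
	keep := len(r.old) - 1
	if final || keep < 0 {
		keep = 0 // nothing more will arrive, or nothing to match
	}
	if keep > len(r.pending) {
		keep = len(r.pending)
	}
	cut := len(r.pending) - keep
	r.out = append(r.out, r.pending[:cut]...)
	r.pending = r.pending[cut:]
}

// rewriteChunks feeds chunks through a replacingReader one read at a time,
// imitating a body that arrives in several network reads.
func rewriteChunks(old, repl string, chunks ...string) (string, error) {
	readers := make([]io.Reader, len(chunks))
	for i, c := range chunks {
		readers[i] = strings.NewReader(c)
	}
	src := io.MultiReader(readers...)
	out, err := io.ReadAll(newReplacingReader(src, old, repl))
	return string(out), err
}
```

Memory use stays bounded by the chunk size plus the length of the search string, at the cost of extra copying on every read. That is the overhead to weigh against the TTFB improvement.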

So, if we decide to optimize this: