legumeinfo / gcv

Federating genomes with love (and synteny derived from functional annotations)
https://gcv.legumeinfo.org/
Apache License 2.0
40 stars 12 forks source link

use of nginx sub_filter for base href may not be equivalent to ng build --base-ref #377

Closed adf-ncgr closed 2 years ago

adf-ncgr commented 2 years ago

it seems that there are some resources that are being lazily loaded by the app that are not directly referenced in the index.html document and so not governed by its element. At least, this is my interpretation of the experiment of beating my head against the wall trying to get the advertised mechanism using the nginx conf file to work when everything is not being served from the server root. Using the ANGULAR_BUILD_OPTIONS in the Dockerfile in conjunction with that seemed to make it work, although it is indeed a bit of a drag to have to build the app just for this purpose. It's possible I was doing something stupid, but I did verify that the base href was indeed getting set appropriately and the scripts that are directly loaded from the index file were properly loaded.

alancleary commented 2 years ago

That's too bad. Although I'll be the first to admit that the ngx_http_sub_module solution was less than thoroughly tested.

There are other techniques for setting the base HREF without compiling but none of them seemed as slick. One final thought for using ngx_http_sub_module is that we could set base HREF to a placeholder when compiling the app and then update the filter in default.conf to replace all instances of the placeholder instead of just the base tag's HREF attribute.

Thoughts?

adf-ncgr commented 2 years ago

I think I understand the placeholder idea. From what I was reading I think by default sub_filter only applies to resources of MIME type text/html and would probably need to be config'd to sub in the javascript bundles. And we'd have to pick a placeholder with a vanishingly small probability of appearing elsewhere in minified js code (I guess verbose strings have their use- reminding me of a word I recently learned used to describe a book I was reading: logorrhea; feel free to appropriate this for pangenome compression purposes...)

anyway, I will try your idea once I have time and am brave enough to tinker with it again...

adf-ncgr commented 2 years ago

Having tinkered a bit more, I think my original assessment was not correct. It seems that the sub_filter trick can work but depends on setting the location in the nginx config to match the base href, e.g.:

    location /gcv-client/ {
        alias   /usr/share/nginx/html/;
        index  index.html;
        try_files $uri $uri/ /index.html;
        sub_filter '<base href="/">' '<base href="/gcv-client/">';
        sub_filter_once on;
    }

works without the placeholder ruse, although in all configurations I have tried so far, a request without the trailing-slash performs a weird redirection that I am at a loss to explain. It may have something to do with multiple layers of proxying going on in my setup in a VM on fisher. Anyway, there's probably room for some improvement but unless you think the above modification is wrong I can tweak the README to note this and call it good enough for now.

alancleary commented 2 years ago

Yes, if that seems to work then please update the README!

adf-ncgr commented 2 years ago

OK, I've cleared up some confusion about various things I was seeing in tests. Most of the confusion and also the fact that the example default.conf works at all seems to be attributable to the fact that a base href without a trailing slash in the angular context more or less just ignores the last component of the path; see: https://github.com/angular/angular/issues/21003 (although I won't claim to have followed everything said there, and may be oversimplifying; "just ignores" is my assessment from empirical tests)

So, the thing that seems to be most correct is indeed something like what I shared above, ie:

    location /gcv-client/ {
        alias   /usr/share/nginx/html/;
        index  index.html;
        try_files $uri $uri/ /index.html;
        sub_filter '<base href="/">' '<base href="/gcv-client/">';
        sub_filter_once on;
    }

HOWEVER, this is not quite good enough for now as I had previously opined, because if one tries to use this with the bookmarkable URLs generated in the course of using the tool then one gets an error like:

Error: Cannot match any routes. URL Segment: 'client/gcv/gene;lis=phavu.Phvul.010G000200'
    at $e.noMatchError (main-es2017.4f472aba31028665ce0e.js:1)

which I think we would all agree defeats the purpose of having the updated URLs in the first place. I am not sure but I suspect that this may lead us back to the earlier contention that is expressed in the title of the issue, but I will go back and test the PLACEHOLDER idea again using my new-found marginal understanding of some of the inner workings of this infernal machine and see if it leads to a viable workaround... stay tuned!

alancleary commented 2 years ago

The path in the error doesn't seem to match the location in the NGINX config. Was this from you experimenting or is the URL not being correctly set by GCV?

adf-ncgr commented 2 years ago

sorry, my bad for copying the previous config but having since moved to a different base href. "Experimenting".

adf-ncgr commented 2 years ago

thanks for re-visiting this issue. I think what I have ended up doing is not ideal but works. An example:

server {

    listen 80;
    root   /usr/share/nginx/html;

    # client
    location / {
        alias   /usr/share/nginx/html/;
        #index  index.html;
        sub_filter '<base href="/">' '<base href="/tools/gcv/">';
        sub_filter_once on;
        rewrite ^/tools/gcv/(config\/config.json)$ /$1 last;
        rewrite ^/tools/gcv/(.*\.(js|css|png|gif|svg|ttf|eot|woff|woff2))$ /$1 last;
        try_files $uri $uri/ /index.html;
    }

}

my recollection is that you had suggested the sub_filter approach, but that it seemed to be important to have location specified at the root for it to work. The thing that is obviously lame-o about this is the enumeration of static file extensions that need to get served without being treated as paths into the app. Also the fact that the url path gets repeated (I believe I found that nginx conf supports variables, but can't expand them in the context of rewrite rules). I suspect there is a more elegant solution that I was unable to find, probably you or @nathanweeks can show me how it should be done...

nathanweeks commented 2 years ago

@adf-ncgr it's been a long while since I last tried hosting gcv at a non-"/" URL path. Do you plan to relocate the gcv front-end to www.legumeinfo.org/tools/gcv/ ?

adf-ncgr commented 2 years ago

@nathanweeks the example I gave comes from https://medicago.legumeinfo.org/tools/gcv/ and although I don't have specific plans about the main legumeinfo.org hosting pattern, I want to retain the option so that we can potentially use the already-implemented BroadcastChannel features of GCV which require same origin with other tools communicating on the channel (we have done this with GCV and ZZBrowse, for example, although it looks like our same-origin GCV for ZZBrowse is currently AWOL).

alancleary commented 2 years ago

Also, MaizeGDB is working on standing up their own GCV, which is what finally convinced me to fix this. I'm not sure what path they're looking to deploy on but I don't want them to have to recompile GCV just to change the path.

adf-ncgr commented 2 years ago

I should have mentioned that the update I posted does not require recompilation of GCV, as your sub_filter trick worked just fine without it, as long as some other changes to your original solution were made (details now escaping me). It will be exciting to see another big plant database starting to use it!

alancleary commented 2 years ago

@adf-ncgr Was the non-build solution to configure sub_filter to also update JavaScript files? That's what I was going to try next.

adf-ncgr commented 2 years ago

I don't think it required any special handling of javascript files. Is the thing I posted as an example not working for you? I can check again to see if there was any additional tweaking.

alancleary commented 2 years ago

Whoops. For whatever reason I thought the last solution you posted didn't work. I agree that it's not ideal.

Last time we looked into this I came across a Stack Overflow solution that dynamically sets the base href within Angular by reading the path from the window. That solution is finicky, though; I couldn't get it to work reliably with GCV. However, I just revisited the solution and adapted it to read the actual base href tag in index.html (duh!), which makes the Angular app work correctly by just changing the base href tag in the index.html file. This, paired with the original sub_filter solution, appears to have the desired effect!

I can push the changes to a branch if you'd like to test it or I can YOLO it into main. Your call, @adf-ncgr.

adf-ncgr commented 2 years ago

I should probably test it, if only to make sure it works in reverse-proxied contexts such as tend to occur in our current VM-based deployments; but thanks in advance for finding what sounds like the winning approach!

alancleary commented 2 years ago

OK. I just created a base-href branch with a commit implementing the solution I described. Changing the sub_filter url in the NGINX config file should do the trick now.

adf-ncgr commented 2 years ago

I think this is working although it seems that it is important to specify a trailing slash in the url specified by the sub_filter. Without it, it seems that the last component of the specified path gets clipped off in requests made for scripts and such, e.g. if I specify sub_filter '<base href="/">' '<base href="/tools/gcv">'; then although the index page loads subsequent requests for scripts are like: http://dev.lis.ncgr.org:50015/tools/runtime-es2017.45cc648bbf40ffeec4ef.js instead of http://dev.lis.ncgr.org:50015/tools/gcv/runtime-es2017.45cc648bbf40ffeec4ef.js

I have a vague memory that I had discovered this quirk in previous attempts and determined that it is arguably correct behavior according to pundits on the angular github (or someplace similar). Possibly worth a note about this in the documentation, but otherwise I think it is working well. However, I also seem to have discovered that the change you made may be unnecessary, as a tweak I ended up making to my proxy config in order to get the branch to work also seems to be working just fine when applied to main. And actually, I just realized I was never testing the change you made because although I checked out the branch, I was using the compose.prod.yml file which specifies the tagged image. So hard to be a conscientious tester these days!!

For the record, the apache proxying setup that seems to be working with the specification of sub_filter directive as above is:

     ProxyPass /tools/gcv http://adf-gcvmicro:4200/
     ProxyPassReverse /tools/gcv http://adf-gcvmicro:4200/

previously I had been using

     ProxyPass /tools/gcv http://adf-gcvmicro:4200/tools/gcv
     ProxyPassReverse /tools/gcv http://adf-gcvmicro:4200/tools/gcv

which may have necessitated the rewrite directives in my suboptimal solution from earlier...

alancleary commented 2 years ago

Can you clarify which NGINX config file you're using? I can't get it to work with the nginx/default.conf file in the main branch of the GCV repo if I change the path to have more than one level. For example, /tools works but /tools/gcv does not. In this case the trailing forward-slash doesn't make a difference, i.e. /tools/gcv/ also doesn't work.

adf-ncgr commented 2 years ago

Mine looks like this:

server {

    listen 80;
    root   /usr/share/nginx/html;

    location / {
        alias   /usr/share/nginx/html/;
        index  index.html;
        try_files $uri $uri/ /index.html;
        sub_filter '<base href="/">' '<base href="/tools/gcv/">';
        sub_filter_once on;
    }

}

I haven't tried it in a proxyless environment, though; should I?

alancleary commented 2 years ago

OK. Yes, please try it in a proxyless environment. And if that doesn't work then can you try the base-href branch in both the proxy and proxyless environments?

adf-ncgr commented 2 years ago

Hmm, well I think I have verified that it is not working without the proxy, but have been unable to verify that your change does make it work in the proxyless setting because I can't seem to build it; docker-compose -f compose.dev.yml up eventually fails with:

=> ERROR [base 8/9] RUN npm ci                                                                                                    28.1s
------
 > [base 8/9] RUN npm ci:
#12 11.47 npm WARN deprecated clipboard-js@0.3.6: Please migrate to https://github.com/lgarron/clipboard-polyfill
#12 27.93 npm notice
#12 27.93 npm notice New minor version of npm available! 8.11.0 -> 8.12.2
#12 27.93 npm notice Changelog: <https://github.com/npm/cli/releases/tag/v8.12.2>
#12 27.93 npm notice Run `npm install -g npm@8.12.2` to update!
#12 27.93 npm notice
#12 27.93 npm ERR! code 1
#12 27.93 npm ERR! path /gcv/node_modules/grpc-tools
#12 27.93 npm ERR! command failed
#12 27.93 npm ERR! command sh -c node-pre-gyp install
#12 27.93 npm ERR! response status 404 Not Found on https://node-precompiled-binaries.grpc.io/grpc-tools/v1.11.2/linux-arm64.tar.gz
#12 27.93 npm ERR! node-pre-gyp info it worked if it ends with ok
#12 27.93 npm ERR! node-pre-gyp info using node-pre-gyp@1.0.5
#12 27.93 npm ERR! node-pre-gyp info using node@17.9.1 | linux | arm64
#12 27.93 npm ERR! node-pre-gyp info check checked for "/gcv/node_modules/grpc-tools/bin/grpc_tools.node" (not found)
#12 27.93 npm ERR! node-pre-gyp http GET https://node-precompiled-binaries.grpc.io/grpc-tools/v1.11.2/linux-arm64.tar.gz
#12 27.93 npm ERR! node-pre-gyp ERR! install response status 404 Not Found on https://node-precompiled-binaries.grpc.io/grpc-tools/v1.11.2/linux-arm64.tar.gz
#12 27.93 npm ERR! node-pre-gyp ERR! install error
#12 27.93 npm ERR! node-pre-gyp ERR! stack Error: response status 404 Not Found on https://node-precompiled-binaries.grpc.io/grpc-tools/v1.11.2/linux-arm64.tar.gz
#12 27.93 npm ERR! node-pre-gyp ERR! stack     at /gcv/node_modules/@mapbox/node-pre-gyp/lib/install.js:67:15
#12 27.93 npm ERR! node-pre-gyp ERR! stack     at processTicksAndRejections (node:internal/process/task_queues:96:5)
#12 27.93 npm ERR! node-pre-gyp ERR! System Linux 5.10.104-linuxkit
#12 27.93 npm ERR! node-pre-gyp ERR! command "/usr/local/bin/node" "/gcv/node_modules/.bin/node-pre-gyp" "install"
#12 27.93 npm ERR! node-pre-gyp ERR! cwd /gcv/node_modules/grpc-tools
#12 27.93 npm ERR! node-pre-gyp ERR! node -v v17.9.1
#12 27.93 npm ERR! node-pre-gyp ERR! node-pre-gyp -v v1.0.5
#12 27.93 npm ERR! node-pre-gyp ERR! not ok
#12 27.94
#12 27.94 npm ERR! A complete log of this run can be found in:
#12 27.94 npm ERR!     /root/.npm/_logs/2022-06-16T23_26_32_933Z-debug-0.log
------
executor failed running [/bin/sh -c npm ci]: exit code: 1
ERROR: Service 'gcv' failed to build

is this me forgetting how to actually build it or something weird in my set-up (I'll note that I had to get the Docker for Apple Silicon in order to even get this far, as this is the first time I've tried to use docker on the new laptop)

nathanweeks commented 2 years ago

:

12 27.93 npm ERR! response status 404 Not Found on https://node-precompiled-binaries.grpc.io/grpc-tools/v1.11.2/linux-arm64.tar.gz

@adf-ncgr: In my limited experience with Apple silicon, I've noticed a number of similar issues with missing docker base images, architecture-specific packages (Python, conda, and others), compiled-from-golang static binaries, etc.

adf-ncgr commented 2 years ago

lovely, well thanks @nathanweeks for letting me know it's not my senility at play (or at least not solely that). I had actually just come to the conclusion that it was probably not the case since I could build it fine on the Linux VM where the proxied instance was being hosted, but it's good to hear that I should be wary of Docker experience on the new chips, I was afraid there would be issues of this sort...

adf-ncgr commented 2 years ago

PS. having successfully built it on the VM, I can at least verify that Alan's change does work with the proxied approach. If we want to decide that it is likely necessary to work in non-proxied contexts and move on, that is fine with me!

alancleary commented 2 years ago

having successfully built it on the VM, I can at least verify that Alan's change does work with the proxied approach. If we want to decide that it is likely necessary to work in non-proxied contexts and move on, that is fine with me!

Perhaps @nathanweeks can verify it works in a non-proxy context.

@adf-ncgr I'll add your woes to my growing list of reasons why we should ditch gRPC. The working title for the list is "more trouble than worth."

adf-ncgr commented 2 years ago

didn't you (@alancleary) already verify the non-proxy context? "more trouble than worth" sound to me like words to live by, though don't abandon anything just for the sake of my easily-stubbed toes...

adf-ncgr commented 2 years ago

For the record, my build error appears to be linked to this issue with grpc: https://github.com/grpc/grpc-node/issues/1405 which they blame on: https://github.com/actions/virtual-environments/issues/2187 which may be something nearing resolution: https://github.com/github/roadmap/issues/528

I can't say I didn't have some feelings of foreboding about things like this when I heard the new laptop they ordered me was going to have a fancy new apple chip...

adf-ncgr commented 2 years ago

OK, looks like I am able to build on the mac using an option in the docker-compose: platform: linux/amd64 per https://docs.docker.com/desktop/mac/apple-silicon/#known-issues this causes it to use intel-based images running under emulation. Sadly, my newly built GCV using sub_filter '<base href="/">' '<base href="/tools/gcv/">'; doesn't seem to be working as expected. The index page loads, and requests scripts correctly as, e.g. http://localhost:4200/tools/gcv/polyfills.852c448391399db2.js but the response to that request is another copy of the index page. I'm pretty sure I experienced this same behavior when I was trying to make it work previously. And, I guess the reason it is working for me in the proxied setup is that with the change to the proxy config I am actually proxying it to basically not use the subpaths after the handoff.

It's entirely possible I've done something goofy, but let's leave this issue open until we're in agreement about status.

alancleary commented 2 years ago

That sounds similar to the behavior I was experiencing with the solution from Stack Overflow before I adapted it to read the base href tag. Perhaps we should get together and compare notes since mine seems to be working and your is not.

adf-ncgr commented 2 years ago

sounds good, maybe we could do this after (or during) today's stand-up meeting?

alancleary commented 2 years ago

No need to drag everyone else into the weeds with us. @ATintern and I are meeting after the stand-up. We could do it then or after that if that works for you.

alancleary commented 2 years ago

It looks like by previous comment didn't stick:

@adf-ncgr I created an "nginx-filter" branch last week that uses the working NGINX configuration you previously posted, except it it sets the path via the GCV_PATH environment variable. The Dockerfile sets this to gcv/ by default but it can be changed by setting the container's GCV_PATH variable in the compose.build-prod.yml file, i.e. you don't have to mount a custom NGINX config file in the container if you just want to change the path.

Let me know if this works for you and I'll merge into main.

adf-ncgr commented 2 years ago

seems to work for me, although I am a little puzzled by two behaviors:

anyway, I can live with these behaviors- I'll bet we are both a little sick of this issue!

alancleary commented 2 years ago

the GCV_PATH doesn't seem to get used if placed in .env

I didn't actually update the compose.build-prod.yml file to set the variable since I'll be deleting that file in preference for the gcv-docker-compose repository once @ATintern makes his PR. Adding the follow to the compose file should do the trick:

services:
  gcv:
    ...
    environment:
      - GCV_PATH=${GCV_PATH}
    ...

trying to make it so that the user doesn't have to know to put a trailing slash on their path by including such in nginx/templates/default.conf.template didn't seem to have the desired effect; ie: sub_filter '<base href="/">' '<base href="${GCV_PATH}/">'; seemed to behave similarly to sub_filter '<base href="/">' '<base href="${GCV_PATH}">';

I was just going to note the trailing slash requirement in the Docker Compose documentation...

adf-ncgr commented 2 years ago

Ah, I thought putting into .env was equivalent to the environment specification in the file; shows you how much I know... and noting the trailing slash in documentation seems like a better approach than beating our heads against it.

alancleary commented 2 years ago

The branch has been merged! Andrew's rewrite rule was added in commit 0621dd2df38228e484ef1f6190e4a0550a6c2788. Closing.