Open ghost opened 2 years ago
Hitting the same issue. Uploads into Galaxy work perfectly fine as a stand-alone system without Apache, but as soon as I configure my Galaxy instance to run behind Apache, Galaxy refuses to upload files, despite having "tusd: enable: false" in my $GALAXY_ROOT/config/galaxy.yml and $GALAXY_ROOT/database/gravity/configstate.yaml.
So, we're shipping Galaxy with a tus middleware. It works without any configuration and cannot be disabled; it is the only way to upload files via the frontend. You can enable an external tus server to speed this up; that is what you can set up in the gravity section of galaxy.yml.
I gave this a try and the missing piece is the ProxyPreserveHost directive, which needs to be set to on. https://github.com/mvdbeek/galaxy_doc_examples/commit/922a8b0ddd27e3ace2cb3da3638b8b16bda47484 is how I did this for the sample config, and I'll go ahead and change the docs as well.
OK, so to confirm I understand, tus middleware is deployed automatically with Galaxy. And unless you're using an external tus client, you don't need to make any changes to the tusd settings in $GALAXY_ROOT/config/galaxy.yml.
Unfortunately, I've added the ProxyPreserveHost on line to my Apache config file and it doesn't seem to make a difference. I have included my Apache config file (galaxy.conf.txt) as well. But I'm struggling to understand what changes in Galaxy when using Apache that causes the uploader and gunicorn to stop working.
The apache errors I'm getting are:
[Thu Apr 07 12:35:12.155723 2022] [proxy:error] [pid 4657] (111)Connection refused: AH00957: HTTP: attempt to connect to 127.0.0.1:4001 (127.0.0.1) failed
[Thu Apr 07 12:35:12.155769 2022] [proxy:error] [pid 4657] AH00959: ap_proxy_connect_backend disabling worker for (127.0.0.1) for 60s
[Thu Apr 07 12:35:12.155778 2022] [proxy_http:error] [pid 4657] [client 192.168.56.1:65004] AH01114: HTTP: failed to make connection to backend: 127.0.0.1
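The "Connection refused" lines above mean that, at that moment, nothing accepted a TCP connection on 127.0.0.1:4001 from Apache's point of view. A minimal sketch of the same check from the Apache host, using only the Python standard library (the function name is hypothetical, and this only tests TCP reachability, not whether Galaxy itself answers correctly):

```python
import socket


def backend_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening or the host is unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling backend_is_listening("127.0.0.1", 4001) as the user Apache runs as should return True if Gunicorn is bound to that TCP port. If Gunicorn is bound to a unix socket instead, this check is expected to fail even though Galaxy is running.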
On that, I was having a look in $GALAXY_ROOT/scripts/resumable_upload.py, and lines 6 and 7 are "from tusclient import client" and "from tusclient.storage import filestorage" respectively. However, I can't see tusclient in the list of Python modules installed in $GALAXY_ROOT/.venv. In fact, I can't see tusclient referenced anywhere in the Galaxy code base except in resumable_upload.py and within the test modules. What library/module is this calling?
OK, so to confirm I understand, tus middleware is deployed automatically with Galaxy.
Yes
And unless you're using an external tus client, you don't need to make any changes to the tusd settings in $GALAXY_ROOT/config/galaxy.yml.
I guess you meant an external tus server? That would be the only reason to change something there. You can use any tus client you like.
On that, I was having a look in $GALAXY_ROOT/scripts/resumable_upload.py, and lines 6 and 7 are "from tusclient import client" and "from tusclient.storage import filestorage" respectively. However, I can't see tusclient in the list of Python modules installed in $GALAXY_ROOT/.venv. In fact, I can't see tusclient referenced anywhere in the Galaxy code base except in resumable_upload.py and within the test modules. What library/module is this calling?
This is an example of another tus client, aimed at developers. It doesn't make sense to include tusclient in Galaxy's dependencies just for this script.
But I'm struggling to understand what changes in Galaxy when using Apache that cause uploader and gunicorn to stop working.
If you're not seeing Galaxy at all you have a different problem. Is your Galaxy instance being served at port 4001 via the http protocol? Note that the Apache documentation (https://docs.galaxyproject.org/en/master/admin/apache.html#prerequisites) builds on the setup in scaling and load-balancing (https://docs.galaxyproject.org/en/master/admin/scaling.html#listening-and-proxy-options).
No, I am seeing Galaxy. Using the galaxy.conf above, I can connect to galaxy on my VM (192.168.56.105) from my laptop (192.168.56.1), paste data and process jobs, so it's not the connecting to Galaxy itself that is the issue. Gunicorn is also listening on port 4001 and I can telnet using port 4001 on the VM without an issue.
The upload problem doesn't happen when I serve Galaxy directly. But once Apache is enabled, when I try to upload data, I get the error message both @hnhnyigh and I reported above:).
I have also tried bind: 0.0.0.0:4001 in case it was something odd relating to the VM localhost/laptop, but it doesn't seem to make a difference.
Everything else seems to serve fine using Apache and Gunicorn when binding to port 4001, except uploading files, which is kind of a dealbreaker :/
Can you check the JavaScript console of your browser for error messages? The UI error in your screenshot is what the apache directive fixes. Have you restarted apache after adding the directive? The apache logs seem to indicate that nothing is running on 127.0.0.1:4001, but then you wouldn't see the interface at all, so that is odd.
I guess the other thing to check is if you've enabled all the necessary modules and that apache is new enough (the docs say 2.3.3 for the ProxyPreserveHost directive). If I follow https://github.com/tus/tusd/blob/master/examples/apache2.conf#L1-L4 you can do this with sudo a2enmod ssl headers proxy proxy_http. ssl is probably not needed if you're not serving https.
The Apache version is 2.4.6. All required modules are present and we are serving HTTPS via custom domain (e.g. https://galaxy.example.com).
# httpd -v
Server version: Apache/2.4.6 (CentOS)
Server built: Aug 8 2019 11:41:18
# httpd -M | egrep "ssl_module|headers_module|proxy_module|proxy_http_module"
headers_module (shared)
proxy_module (shared)
proxy_http_module (shared)
ssl_module (shared)
The MS Edge dev console responds with:
HEAD http://localhost:4001/api/upload/resumable_upload/28766ff3091343ae9ba1237e10f58dee net::ERR_CONNECTION_REFUSED
Google Chrome:
Mixed Content: The page at 'https://galaxy.example.com/' was loaded over HTTPS, but requested an insecure XMLHttpRequest endpoint 'http://galaxy.example.com/api/upload/resumable_upload/134b49837a6a481d8edc02ede325268a'. This request has been blocked; the content must be served over HTTPS.
Could the docs please be updated to include advice on handling this scenario in Apache?
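Both browser errors above are consistent with the backend building the upload URL from the wrong host or scheme: http://localhost:4001 looks like the original Host header was not passed through, and http://galaxy.example.com looks like the host was preserved but the https scheme was not. The sketch below illustrates that reading with a hypothetical helper; it follows the common Host / X-Forwarded-Proto convention and is an assumption, not Galaxy's or Gunicorn's actual code.

```python
from urllib.parse import urlsplit


def external_base_url(headers: dict, trust_forwarded: bool = True,
                      default_scheme: str = "http") -> str:
    """Base URL a proxied backend would advertise (hypothetical helper).

    trust_forwarded models whether the app server accepts forwarded
    headers from the proxy at all (a simplification).
    """
    scheme = default_scheme
    if trust_forwarded:
        scheme = headers.get("X-Forwarded-Proto", default_scheme)
    host = headers.get("Host", "localhost")
    return f"{scheme}://{host}"


def is_mixed_content(page_url: str, request_url: str) -> bool:
    """True when an HTTPS page requests a plain-HTTP resource."""
    return (urlsplit(page_url).scheme == "https"
            and urlsplit(request_url).scheme == "http")


# Host header rewritten by the proxy (no ProxyPreserveHost):
print(external_base_url({"Host": "localhost:4001"}))
# Host preserved, but the forwarded scheme is missing or not trusted:
print(external_base_url({"Host": "galaxy.example.com"}))
# Host preserved and forwarded scheme accepted:
print(external_base_url({"Host": "galaxy.example.com",
                         "X-Forwarded-Proto": "https"}))
```

Under these assumptions only the third case yields a URL the browser will accept from an HTTPS page; the second one is what is_mixed_content flags.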
They have been updated; you can follow the docs verbatim and you'll have a working configuration. The doc example is worked out completely in mvdbeek/galaxy_doc_examples, and that example is serving https as well. If you use
gravity:
  app_server: gunicorn
  gunicorn:
    bind: 'unix:/tmp/gunicorn.sock'
    extra_args: '--forwarded-allow-ips="*"'
in your gravity config you can also try https://github.com/mvdbeek/galaxy_doc_examples/blob/main/apache/run_apache_proxy.sh, which I've used to write the docs (you'll need socat on path due to some quirks with OSX, should have no impact if you're on a linux machine).
Can you maybe post your apache config?
Here's a WIP of how you can set this up with an external tus server: https://github.com/mvdbeek/galaxy_doc_examples/compare/tusd_setup?expand=1
Not loving the extra base path change we have to make; that is much cleaner in nginx. There's probably a better way with a RewriteRule.
Added '--forwarded-allow-ips="*"' and switched to a unix socket and got a slightly different error:
Failed because: Error: tus: unexpected response while creating upload, originated from request (method: POST, url: /api/upload/resumable_upload/, response code: 400, response text: , request id: n/a)
Here is the apache config:
SSLCipherSuite ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA
SSLProtocol -all +TLSv1.2 +TLSv1.1
SSLHonorCipherOrder On
SSLSessionCache dbm:/usr/share/httpd/ssl/ssl_cache_shm
SSLSessionCacheTimeout 600
SSLRandomSeed startup file:/dev/urandom 2048
SSLRandomSeed connect file:/dev/urandom 2048
SSLVerifyClient none
SSLProxyEngine off
SSLCompression off
Header always set Strict-Transport-Security "max-age=63072000; includeSubDomains; preload"
Header always set X-Frame-Options SAMEORIGIN
Header always set X-Content-Type-Options nosniff
TraceEnable Off
<VirtualHost *:80>
    ServerName galaxy.example.org
    Redirect permanent / https://galaxy.example.org/
</VirtualHost>
<VirtualHost *:443>
    ServerName galaxy.example.org
    SSLEngine on
    SSLCertificateFile /etc/httpd/ssl/org/org.cer
    SSLCertificateKeyFile /etc/httpd/ssl/org/org.nopass.key
    SSLCertificateChainFile /etc/httpd/ssl/org/CACertificate.chain.cer
    # Custom error and access log
    ErrorLog /var/log/httpd/galaxy.error.log
    CustomLog /var/log/httpd/galaxy.access.log combined
    # use a variable for convenience
    Define galaxy_root /srv/galaxy
    # don't decode encoded slashes in path info
    AllowEncodedSlashes NoDecode
    # enable compression on all relevant types
    AddOutputFilterByType DEFLATE text/html text/plain text/xml
    AddOutputFilterByType DEFLATE text/css
    AddOutputFilterByType DEFLATE application/x-javascript application/javascript application/ecmascript
    AddOutputFilterByType DEFLATE application/rss+xml
    AddOutputFilterByType DEFLATE application/xml
    AddOutputFilterByType DEFLATE application/json
    # allow access to static content
    <Directory "${galaxy_root}/static">
        AllowOverride None
        Require all granted
    </Directory>
    # Galaxy needs to know that this is https for generating URLs
    RequestHeader set "X-Forwarded-Proto" "%{REQUEST_SCHEME}e"
    # allow up to 3 minutes for Galaxy to respond to slow requests before timing out
    ProxyTimeout 180
    # proxy all requests not matching other locations to Gunicorn
    ProxyPass / "unix:${galaxy_root}/gunicorn.sock|http://localhost/"
    ProxyPreserveHost on
    # serve framework static content
    RewriteEngine On
    RewriteRule ^/static/(.*) ${galaxy_root}/static/$1 [L]
    RewriteRule ^/favicon.ico ${galaxy_root}/static/favicon.ico [L]
    RewriteRule ^/robots.txt ${galaxy_root}/static/robots.txt [L]
    # enable caching on static content
    <Location "/static">
        ExpiresActive On
        ExpiresDefault "access plus 24 hours"
    </Location>
    # enable apache controlled data transfers
    <Location "/">
        XSendFile on
        XSendFilePath /
    </Location>
    # serve visualization and interactive environment plugin static content
    <Directory "${galaxy_root}/config/plugins/(.+)/(.+)/static">
        AllowOverride None
        Require all granted
    </Directory>
    RewriteRule ^/plugins/(.+)/(.+)/static/(.*)$ ${galaxy_root}/config/plugins/$1/$2/static/$3 [L]
</VirtualHost>
Here is the gravity config:
gravity:
  app_server: gunicorn
  gunicorn:
    bind: 'unix:/srv/galaxy/gunicorn.sock'
    extra_args: '--forwarded-allow-ips="*"'
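For anyone debugging the 400 on POST /api/upload/resumable_upload/ reported above, it may help to know what a tus 1.0.0 creation request is supposed to carry, so it can be compared with what actually reaches Gunicorn behind the proxy. The sketch below builds those headers as described by the tus protocol's creation extension. The function name and the metadata key are hypothetical; the thread does not say which metadata Galaxy's frontend sends.

```python
import base64

TUS_VERSION = "1.0.0"


def tus_creation_headers(upload_length: int, metadata: dict) -> dict:
    """Headers for a tus 1.0.0 creation request (sent with POST)."""
    if upload_length < 0:
        raise ValueError("upload_length must be non-negative")
    pairs = []
    for key, value in metadata.items():
        # Metadata keys must not contain spaces or commas.
        if not key or " " in key or "," in key:
            raise ValueError(f"invalid metadata key: {key!r}")
        # Metadata values are Base64 encoded.
        encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
        pairs.append(f"{key} {encoded}")
    headers = {
        "Tus-Resumable": TUS_VERSION,
        "Upload-Length": str(upload_length),
    }
    if pairs:
        headers["Upload-Metadata"] = ",".join(pairs)
    return headers


# Hypothetical example: an 11-byte file named reads.fastq.
print(tus_creation_headers(11, {"filename": "reads.fastq"}))
```

If one of these headers is missing or altered by the time the request arrives at the backend, that would be one possible cause of a 400 to rule out; the thread itself does not establish the actual cause.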
Hi, I got the same issue that hnhnyigh reported. I am using Galaxy v22.01, Apache 2.4.6, and CentOS 7.9. The Apache configuration was identical to the docs from Marius (with the Galaxy root and Galaxy web address modified). The JavaScript error message was identical to the one in hnhnyigh's post:
"Mixed Content: The page at 'https://galaxy.example.com/' was loaded over HTTPS, but requested an insecure XMLHttpRequest endpoint 'http://galaxy.example.com/api/upload/resumable_upload/134b49837a6a481d8edc02ede325268a'. This request has been blocked; the content must be served over HTTPS."
Uploading a file via "Paste/Fetch data" worked well. Loading applications through shed tools and the data manager is a terrific convenience compared to older versions of Galaxy. Any advice on how we can further debug the "Choose local file" upload issue? Thanks!
@wyan999 I ended up working around the issue by switching to Nginx. See below:
# HTTP Server redirect to HTTPS
server {
    listen 80;
    server_name galaxy.example.org;
    return 301 https://$host;
}
# HTTPS Server
server {
    listen 443 ssl http2;
    server_name galaxy.example.org;
    # use a variable for convenience
    set $galaxy_root /srv/galaxy;
    access_log /var/log/nginx/access.log;
    error_log /var/log/nginx/error.log;
    # compress responses whenever possible
    gzip on;
    gzip_http_version 1.1;
    gzip_vary on;
    gzip_comp_level 6;
    gzip_proxied any;
    gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
    gzip_buffers 16 8k;
    # allow up to 3 minutes for Galaxy to respond to slow requests before timing out
    proxy_read_timeout 180;
    # maximum file upload size
    client_max_body_size 10g;
    # allowable SSL protocols
    ssl_protocols TLSv1.2 TLSv1.3;
    # use secure ciphers
    ssl_ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384:!aNULL:!eNULL:!EXPORT:!CAMELLIA:!DES:!MD5:!PSK:!RC4;
    ssl_dhparam /etc/ssl/nginx/dhparam.pem;
    ssl_ecdh_curve secp384r1;
    ssl_prefer_server_ciphers on;
    # enable session reuse
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 4h;
    ssl_session_tickets on;
    # cert/key
    ssl_certificate /etc/ssl/org.cer;
    ssl_certificate_key /etc/ssl/org.nopass.key;
    # OCSP stapling
    ssl_stapling on;
    ssl_stapling_verify on;
    #ssl_trusted_certificate /etc/nginx/ssl/ca.crt;
    # Enable HSTS
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload" always;
    add_header X-Frame-Options "SAMEORIGIN";
    add_header X-Content-Type-Options nosniff;
    add_header X-XSS-Protection "1; mode=block";
    # proxy all requests not matching other locations to Gunicorn
    location / {
        proxy_set_header Host $http_host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Upgrade $http_upgrade;
        #proxy_pass http://unix:$galaxy_root/gunicorn.sock;
        proxy_pass http://127.0.0.1:4001/;
    }
    # serve framework static content
    location /static {
        alias $galaxy_root/static;
        expires 24h;
    }
    location /robots.txt {
        alias $galaxy_root/static/robots.txt;
        expires 24h;
    }
    location /favicon.ico {
        alias $galaxy_root/static/favicon.ico;
        expires 24h;
    }
    # serve visualization and interactive environment plugin static content
    location ~ ^/plugins/(?<plug_type>.+?)/(?<vis_name>.+?)/static/(?<static_file>.*?)$ {
        alias $galaxy_root/config/plugins/$plug_type/$vis_name/static/$static_file;
        expires 24h;
    }
}
So what is the official line here? Are we expected to set up an external tus server now? Apache being a proxy is extremely common, I would think...
Well, it's all working for me (nginx and apache, the built-in middleware and the external tus server), so at this point I really don't know where our configs are diverging. I even created https://github.com/mvdbeek/galaxy_doc_examples/ with an example apache config and a script you can just plug in; you only need to provide the gunicorn socket at /tmp/gunicorn.sock or adjust the script. https://github.com/galaxyproject/galaxy/pull/13860 is probably a good idea and maybe that's the missing piece?
The recommended thing, of course, is to not use the tus middleware and to run one of the tus servers instead; for large, simultaneous file uploads this is much more resource-efficient. For apache the changed config would be
ProxyPreserveHost on
# proxy tus requests to external tus server
# only enable this if you have configured an external server like tusd
ProxyPass "/api/upload/resumable_upload" "http://localhost:1080/api/upload/resumable_upload"
ProxyPassReverse "/api/upload/resumable_upload" "http://localhost:1080/api/upload/resumable_upload"
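One thing to keep in mind when combining this with the catch-all ProxyPass / rule: Apache checks ProxyPass rules in configuration order and the first match wins, so the more specific upload path generally needs to appear before the / rule. The sketch below models that first-match behaviour in a simplified way (plain string prefixes only; the function name is hypothetical, and the 127.0.0.1:4001 backend for / is an assumption taken from the TCP setup mentioned earlier in the thread):

```python
from typing import List, Optional, Tuple


def pick_backend(path: str, rules: List[Tuple[str, str]]) -> Optional[str]:
    """Return the proxied URL for the first rule whose prefix matches."""
    for prefix, backend in rules:
        if path.startswith(prefix):
            # Replace the matched prefix with the backend URL.
            return backend + path[len(prefix):]
    return None


upload = "/api/upload/resumable_upload"
specific_first = [
    (upload, "http://localhost:1080" + upload),  # external tus server
    ("/", "http://127.0.0.1:4001/"),             # Galaxy (assumed address)
]
catch_all_first = list(reversed(specific_first))

# With the specific rule first, uploads reach the tus server on port 1080.
print(pick_backend(upload + "/abc", specific_first))
# With the catch-all first, the same request goes to Galaxy instead.
print(pick_backend(upload + "/abc", catch_all_first))
```

Real ProxyPass matching has more nuances (trailing slashes, Location blocks, ProxyPassMatch), so treat this only as an illustration of the ordering rule.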
All that said, there's much more experience with nginx on the Galaxy side: there are extensions like mod-zip that we're using (https://docs.galaxyproject.org/en/master/admin/nginx.html#creating-archives-with-mod-zip), and all the public instances run nginx. So if nginx is an option, that's probably a good idea.
@hnhnyigh Thank you for the input! I finally got it to work after upgrading Apache. The proxy module in Apache 2.4.6 may not work well, even though the proxy-related patches were applied and updated. If you want to switch back to the tus middleware that ships with Galaxy, you can try upgrading Apache to its newest version and applying the Apache configuration from @mvdbeek; uploading a local file should then work. There is no need to change the gunicorn bind to a socket (though that will work too). @mvdbeek Thank you for all your help!
@wyan999: Which version of Apache did you upgrade to? We've had the tusd issues present in Apache 2.4.6-90.el7.centos on CentOS 7.7.1908 and Apache 2.4.6-97.el7.centos.5 on CentOS 7.9.2009.
And what version of the proxy modules did you install/upgrade?
Thanks @wyan999 :)
@quacksawbones I used Apache 2.4.53. Apache 2.4.6-90 contains proxy patches, but it did not work as well as the proxy module that comes with the newer version.
Uploading data from disk results in the following error:
Galaxy Version: 22.01
Commit: d4001d18511fa9aba30da88fce5ee98f1b1a11bb
OS: CentOS 7.7.1908
To Reproduce: Steps to reproduce the behavior:
Expected behavior: Uploads work without tus.
Additional context: Copying the tusd binary into .venv/bin and configuring tusd + Apache does not resolve the issue. In fact, configstate.yaml does not even reflect the contents of the updated Galaxy config (altering configstate.yaml makes no difference either). The tusd service is disabled in the Galaxy config by default but Galaxy's behavior does not reflect this.