HumanBrainProject / neuroglancer-scripts

Conversion of neuroimaging data for display in Neuroglancer
MIT License

Apache config #21

Closed manoaman closed 1 year ago

manoaman commented 2 years ago

Hi,

I am trying to set up Apache (Apache/2.4.37, CentOS) for file serving, but somehow I cannot get the .gz rewriting to work for the precomputed dataset. Chrome Dev Tools shows 404 for URLs without the .gz extension. Could I be missing something here? I am looking at the following example: https://github.com/google/neuroglancer/issues/357

Thank you, -m

https://neuroglancer-scripts.readthedocs.io/en/latest/serving-data.html

# If you get a 403 Forbidden error, try to comment out the Options directives
# below (they may be disallowed by your server's AllowOverride setting).

<IfModule headers_module>
    # Needed to use the data from a Neuroglancer instance served from a
    # different server (see http://enable-cors.org/server_apache.html).
    Header set Access-Control-Allow-Origin "*"
</IfModule>

# Data chunks are stored in sub-directories, in order to avoid having
# directories with millions of entries. Therefore we need to rewrite URLs
# because Neuroglancer expects a flat layout.
Options FollowSymLinks
RewriteEngine On
RewriteRule "^(.*)/([0-9]+-[0-9]+)_([0-9]+-[0-9]+)_([0-9]+-[0-9]+)$" "$1/$2/$3/$4"

# Microsoft filesystems do not support colons in file names, but pre-computed
# meshes use a colon in the URI (e.g. 100:0). As :0 is the most common (only?)
# suffix in use, we will serve a file that has this suffix stripped.
RewriteCond "%{REQUEST_FILENAME}" !-f
RewriteRule "^(.*):0$" "$1"

<IfModule mime_module>
    # Allow serving pre-compressed files, which can save a lot of space for raw
    # chunks, compressed segmentation chunks, and mesh chunks.
    #
    # The AddType directive should in theory be replaced by a "RemoveType .gz"
    # directive, but with that configuration Apache fails to serve the
    # pre-compressed chunks (confirmed with Debian version 2.2.22-13+deb7u6).
    # Fixes welcome.
    Options Multiviews
    AddEncoding x-gzip .gz
    AddType application/octet-stream .gz
</IfModule>
xgui3783 commented 2 years ago

hmm, I just tried it with httpd:2.4 and was able to reproduce the issue (along with a number of other issues).

can you try the following configuration and let me know how it goes?

# If you get a 403 Forbidden error, try to comment out the Options directives
# below (they may be disallowed by your server's AllowOverride setting).

<IfModule headers_module>
    # Needed to use the data from a Neuroglancer instance served from a
    # different server (see http://enable-cors.org/server_apache.html).
    Header set Access-Control-Allow-Origin "*"
</IfModule>

# Data chunks are stored in sub-directories, in order to avoid having
# directories with millions of entries. Therefore we need to rewrite URLs
# because Neuroglancer expects a flat layout.

Options +FollowSymLinks
RewriteEngine On
RewriteRule "^(.*)/([0-9]+-[0-9]+)_([0-9]+-[0-9]+)_([0-9]+-[0-9]+)$" "$1/$2/$3/$4.gz"

# Microsoft filesystems do not support colons in file names, but pre-computed
# meshes use a colon in the URI (e.g. 100:0). As :0 is the most common (only?)
# suffix in use, we will serve a file that has this suffix stripped.

RewriteCond "%{REQUEST_FILENAME}" !-f
RewriteRule "^(.*):0$" "$1"

<IfModule mime_module>
    # Allow serving pre-compressed files, which can save a lot of space for raw
    # chunks, compressed segmentation chunks, and mesh chunks.
    #
    # The AddType directive should in theory be replaced by a "RemoveType .gz"
    # directive, but with that configuration Apache fails to serve the
    # pre-compressed chunks (confirmed with Debian version 2.2.22-13+deb7u6).
    # Fixes welcome.
    Options +Multiviews
    AddEncoding x-gzip .gz
    AddType application/octet-stream .gz
</IfModule>
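To make the rewrite above concrete, here is the same substitution expressed as a Python regex (just an illustration of what mod_rewrite does with a chunk URL; the path is a hypothetical example):

```python
import re

# Same pattern as the RewriteRule in the config above: a flat chunk name
# like "0-512_0-512_0-16" at the end of the URL is split into three
# nested path components, with ".gz" appended to hit the on-disk file.
rule = re.compile(r"^(.*)/([0-9]+-[0-9]+)_([0-9]+-[0-9]+)_([0-9]+-[0-9]+)$")

def rewrite_nested(url: str) -> str:
    """Map a flat Neuroglancer chunk URL to the nested on-disk .gz path."""
    return rule.sub(r"\1/\2/\3/\4.gz", url)

print(rewrite_nested("precomputed/test/10000_10000_10000/0-512_0-512_0-16"))
# -> precomputed/test/10000_10000_10000/0-512/0-512/0-16.gz
```

URLs that do not end in a chunk name (e.g. the "info" file) do not match the pattern and pass through unchanged.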
manoaman commented 2 years ago

@xgui3783 A couple of questions. 1) Will this work in either /usr/local/apache2/conf/httpd.conf or .htaccess? 2) Do the precomputed files and their containing folders need to be owned by www-data or by the user running Apache2? 3) Can the file permissions be 755 if they are not owned by www-data or by the user running Apache2?

The error I am observing now is 403 Forbidden. I think the new rewrite rule fixed the 404 error, but I am still not sure whether the files themselves need to be changed to be served by Apache2. Any thoughts?

[authz_core:error] [pid 8:tid 139996399806208] [client xx.xx.xx.xx:61237] AH01630: client denied by server configuration: /precomputed/test/10000_10000_10000/0-512
xx.xx.xx.xxx - - [08/Dec/2021:17:32:33 +0000] "GET /precomputed/test/10000_10000_10000/0-512_0-512_0-16 HTTP/1.1" 403 199
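As an aside for readers hitting the same log line: AH01630 ("client denied by server configuration") on Apache 2.4 is commonly caused by a <Directory> block that never grants access, or by AllowOverride None suppressing the .htaccess. A hedged sketch of what the httpd.conf might need (the directory path here is a hypothetical example):

```apache
<Directory "/usr/local/apache2/htdocs/precomputed">
    # Apache 2.4 access syntax; without some "Require" grant, all
    # requests under this directory are answered with 403/AH01630.
    Require all granted
    # Needed so the RewriteRule in .htaccess is honoured at all.
    AllowOverride All
</Directory>
```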

Thanks,

manoaman commented 2 years ago

Update: the 403 error seems to go away when checking from an incognito window in Chrome; the ".gz" file was downloadable. However, I still see CORS errors on the Neuroglancer side.

xgui3783 commented 2 years ago

I must admit I am not much of an expert with httpd (we usually use nginx to serve our chunks), but I can provide a bit more context:

1) Will this work in either /usr/local/apache2/conf/httpd.conf and .htaccess

I used httpd:2 from Docker Hub, with a few modifications to /usr/local/apache2/conf/httpd.conf (via volume mapping).

Then I placed the above conf (from my previous comment) in a .htaccess file in the root directory of the precomputed chunks.

The command I am running is:

docker run -dit \
    -p 8888:80 \
    --name httpd \
    -v ${PWD}/data/:/usr/local/apache2/htdocs/ \
    -v ${PWD}/httpd.conf:/usr/local/apache2/conf/httpd.conf \
    httpd:2.4

2) Do the precomputed files and accommodating folders need to be owned by www-data or by the Apache2 running user?

I just checked: because the volume is mounted from the host, the folder is owned by 1000:1000, with permissions drwxrwxr-x

xgui3783 commented 2 years ago

Update: 403 error seems to go away after checking from my incognito browsing mode on Chrome. ".gz" file was downloadable. Although, I still see CORS errors on the Neuroglancer side.

My guess is that you may need to enable headers_module? (Speaking as someone who is not very well versed in httpd, I can only offer limited educated guesses.)
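For context: because the CORS header in the config above sits inside <IfModule headers_module>, it is silently skipped when mod_headers is not loaded, and no error appears anywhere. A sketch of the httpd.conf line to check (the module path varies by build/distribution):

```apache
# Must be present and uncommented in httpd.conf for "Header set ..."
# directives to take effect; verify loaded modules with `apachectl -M`.
LoadModule headers_module modules/mod_headers.so
```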

manoaman commented 2 years ago

Yes, headers_module is enabled, and I also tried AllowOverride All for the document root. Still, I'm seeing CORS errors in the Chrome Dev Tools. Very strange.

xgui3783 commented 2 years ago

I am guessing if you do curl -v http://{HOSTNAME}:{PORT}/precomputed/test/10000_10000_10000/0-512_0-512_0-16 you get the chunk, but no Access-Control-Allow-Origin: * as one of the response headers?

manoaman commented 2 years ago

I tried both curl -v http://{HOSTNAME}:{PORT}/precomputed/test/10000_10000_10000/0-512_0-512_0-16 and curl -v http://{HOSTNAME}:{PORT}/precomputed/test/10000_10000_10000/0-512_0-512_0-16.gz.

One without the extension still gives me 403 Forbidden. And with the .gz extension, I get 200 OK.

xgui3783 commented 2 years ago

hmm, is it possible that the rewrite from .htaccess did not take effect...

can you post the output of tree precomputed/test/10000_10000_10000/? Assuming it's not too large.

xgui3783 commented 2 years ago

also, it might sound silly, but can you double-check that you have

RewriteRule "^(.*)/([0-9]+-[0-9]+)_([0-9]+-[0-9]+)_([0-9]+-[0-9]+)$" "$1/$2/$3/$4.gz"

in your .htaccess? (Note the final .gz, which differs from the original doc.)

manoaman commented 2 years ago

Attached are my httpd.conf and tree output: httpd.conf.txt tree.txt

I also tried a different scenario with .htaccess, where I copied the precomputed dataset into the doc root. That only resulted in a 404 Not Found. _.htaccess.txt

manoaman commented 2 years ago

Yes, I noticed the final .gz and made that change when I tested.

xgui3783 commented 2 years ago

Hmm, did you use neuroglancer-scripts to convert your volume? If you did, did you use the --flat option?

manoaman commented 2 years ago

Nope, the files were generated with CloudVolume and/or Igneous.

manoaman commented 2 years ago

I tried with nginx and it worked beautifully, so I have to assume the files are fine to begin with. A sample config is attached. I have no idea what Apache2 is doing to prevent the files from loading. nginx.conf.txt
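For readers following along, one simple nginx approach for pre-compressed flat-layout chunks is a sketch like the following (this is not the attached nginx.conf.txt, and it assumes nginx was built with ngx_http_gzip_static_module):

```nginx
location /precomputed/ {
    # CORS header so a Neuroglancer instance on another origin can fetch.
    add_header Access-Control-Allow-Origin *;
    # With gzip_static, a request for ".../0-512_0-512_0-16" is served
    # from the on-disk ".../0-512_0-512_0-16.gz" with Content-Encoding:
    # gzip; "always" sends it regardless of the Accept-Encoding header.
    gzip_static always;
}
```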

xgui3783 commented 2 years ago

I see.

The nginx / apache configuration on https://neuroglancer-scripts.readthedocs.io/en/latest/serving-data.html was really meant for chunks converted by neuroglancer-scripts. neuroglancer-scripts specifically puts x, y, z in nested directories (in contrast to the flat layout Neuroglancer expects). This is done so that, for large volumes, we do not have to work with millions of files in a single directory.

As a result, https://neuroglancer-scripts.readthedocs.io/en/latest/serving-data.html is an example of how to serve the static data from those nested directories in a format that Neuroglancer understands (hence all the rewrites).

In your case (or whenever volumes are converted by neuroglancer-scripts with the --flat flag), that configuration will not work (though arguably a config that does work should be much easier to write).

Can you try either:

An actionable item for this issue may be to clarify that the nginx/apache config is supposed to work only with the default layout, and not with the flat layout.

manoaman commented 2 years ago

Okay, so the httpd.conf-only solution is still giving me a 403 Forbidden error, so there is still something I am missing. However, the good news is that I was able to get the .htaccess version working with the following RewriteRule change.

RewriteRule "^(.*)/([0-9]+-[0-9]+)_([0-9]+-[0-9]+)_([0-9]+-[0-9]+)$" "$1/$2_$3_$4.gz"

use this httpd.conf: httpd.conf.htaccess.txt and put this in the parent directory of the precomputed chunks backup.htaccess.txt

I wasn't aware of the flat and non-flat formats. I suppose the current version of the supported format is here.
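To illustrate the difference from the nested-layout rule, the working rule above keeps the flat directory layout and only appends .gz. The same substitution as a Python regex (an illustration only; the path is a hypothetical example):

```python
import re

# Same pattern as before, but the substitution keeps the chunk name flat
# ("\2_\3_\4") instead of nesting it, and just appends ".gz" to reach
# the pre-compressed file produced by CloudVolume/Igneous.
rule = re.compile(r"^(.*)/([0-9]+-[0-9]+)_([0-9]+-[0-9]+)_([0-9]+-[0-9]+)$")

def rewrite_flat(url: str) -> str:
    """Map a flat Neuroglancer chunk URL to the flat on-disk .gz path."""
    return rule.sub(r"\1/\2_\3_\4.gz", url)

print(rewrite_flat("precomputed/test/10000_10000_10000/0-512_0-512_0-16"))
# -> precomputed/test/10000_10000_10000/0-512_0-512_0-16.gz
```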

Thanks for your help @xgui3783 !