jupyter-on-openshift / jupyter-notebooks

OpenShift compatible S2I builder for basic notebook images.
Apache License 2.0

WebDAV does not seem to work #9

Closed juliusvonkohout closed 5 years ago

juliusvonkohout commented 5 years ago

I do not know how to access WebDAV, and there is no documentation.

sh-4.2$ curl --digest http://jupyter:julius@localhost:8080/webdav/
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /webdav/ was not found on this server.</p>
</body></html>
sh-4.2$

sh-4.2$ curl --digest http://jupyter:julius@localhost:8081
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL / was not found on this server.</p>
</body></html>
sh-4.2$

GrahamDumpleton commented 5 years ago

How are you deploying the notebook instance? Yourself with a template of your own, using JupyterHub, or using the workspace template in this repo? Also, which version of the image (GitHub tag) are you using? The very latest?

GrahamDumpleton commented 5 years ago

BTW, even when things are working correctly, you will see this. That is because WebDAV does not generate a directory index for a plain GET request when the target URL maps to a directory. You would need to request a file which actually exists in the target.

It is better to use an actual WebDAV client when first testing, as it will use other means to obtain a directory listing etc.
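
As an illustration of the "other means" mentioned above: WebDAV clients typically list a collection with a PROPFIND request carrying a `Depth: 1` header, and the server answers with a 207 Multi-Status XML body. The following is a minimal Python sketch of that exchange, not the way cadaver or this image implements it. The function names `parse_multistatus` and `list_collection` are made up for this example, and the URL and credentials you pass in are your own.

```python
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import unquote

DAV_NS = "{DAV:}"  # XML namespace used by WebDAV elements

# Minimal PROPFIND body: only ask whether each entry is a collection.
PROPFIND_BODY = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b'<D:propfind xmlns:D="DAV:"><D:prop><D:resourcetype/></D:prop></D:propfind>'
)


def parse_multistatus(xml_data):
    """Return (path, is_collection) pairs from a 207 Multi-Status body."""
    root = ET.fromstring(xml_data)
    entries = []
    for response in root.iter(DAV_NS + "response"):
        href = response.findtext(DAV_NS + "href", default="")
        is_collection = response.find(".//" + DAV_NS + "collection") is not None
        entries.append((unquote(href), is_collection))
    return entries


def list_collection(url, username, password):
    """Hypothetical helper: PROPFIND with Depth: 1 over digest auth."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, url, username, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPDigestAuthHandler(password_mgr)
    )
    request = urllib.request.Request(
        url,
        data=PROPFIND_BODY,
        method="PROPFIND",
        headers={"Depth": "1", "Content-Type": "application/xml"},
    )
    with opener.open(request) as reply:
        return parse_multistatus(reply.read())
```

Calling `list_collection("http://localhost:8080/webdav/", "jupyter", "<password>")` from inside the pod would be the rough equivalent of `ls` in a WebDAV client, assuming the server is listening at that address.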

juliusvonkohout commented 5 years ago

Yes, it is working when using cadaver as the client. But that is missing from the documentation. I have created a WebDAV server myself which does create directory listings and works with most clients. We should at least write in the documentation how to access WebDAV, e.g.

[julius@localhost ~]$ cadaver https://.../webdav
WARNING: Untrusted server certificate presented for ...
This connection could have been intercepted.
...
Do you wish to accept the certificate? (y/n) y
Authentication required for jupyter-on-openshift/jupyter-notebooks on server `...':
Username: jupyter
Password: 
dav:/webdav/> ls
Listing collection `/webdav/': succeeded.
Coll:   .autovizwidget                         0  Mai 16 21:36
Coll:   .cache                                 0  Mai 18 03:11
Coll:   .config                                0  Mai 18 01:42
Coll:   .ipynb_checkpoints                     0  Mai 18 01:30
Coll:   .ipython                               0  Mai 16 21:05
Coll:   .jupyter                               0  Mai 18 03:02
...
dav:/webdav/> mkdir test
Creating `test': succeeded.
dav:/webdav/> rm test
Deleting `test': redirect to http://.....:8081/webdav/test/

juliusvonkohout commented 5 years ago

I have developed a WebDAV server that we now have in production. In minimal-notebook/httpd-webdav.conf you could replace

<Location ${WEBDAV_PREFIX}/webdav/>
    DAV on

    AuthType Digest
    AuthName ${WEBDAV_REALM}
    AuthDigestDomain ${WEBDAV_PREFIX}/webdav/
    AuthDigestProvider file
    AuthUserFile ${WEBDAV_USERFILE}

    Require valid-user
</Location>

with

<Location ${WEBDAV_PREFIX}/webdav/>
    DAV on
    DAVDepthInfinity On
    Options +Indexes
    IndexOptions FancyIndexing NameWidth=* Charset=UTF-8
    #workaround for buggy webdav implementations https://gitlab.gnome.org/GNOME/gvfs/issues/339 
    DirectorySlash Off

    AuthType Digest
    AuthName ${WEBDAV_REALM}
    AuthDigestDomain ${WEBDAV_PREFIX}/webdav/
    AuthDigestProvider file
    AuthUserFile ${WEBDAV_USERFILE}

    Require valid-user

</Location>

# DirectoryIndex: sets the file that Apache will serve if a directory is requested.
<IfModule dir_module>
    DirectoryIndex
</IfModule>
EnableSendfile on
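
For context on the `AuthDigestProvider file` and `AuthUserFile` lines in both variants: Apache's digest user file (the format written by the `htdigest` tool) holds one `user:realm:hash` entry per line, where the hash is the MD5 of `user:realm:password` and the realm has to match `AuthName`. Below is a minimal Python sketch of producing such a line. The function name `htdigest_line` is made up for this example, and how this image actually populates `${WEBDAV_USERFILE}` is not shown in this thread.

```python
import hashlib


def htdigest_line(username, realm, password):
    """Build one entry in the user:realm:MD5(user:realm:password) format."""
    secret = f"{username}:{realm}:{password}".encode("utf-8")
    return f"{username}:{realm}:{hashlib.md5(secret).hexdigest()}"


if __name__ == "__main__":
    # Example values only: the realm must equal the configured AuthName.
    print(htdigest_line("jupyter", "jupyter-on-openshift/jupyter-notebooks", "secret"))
```

If the realm in the file differs from `AuthName`, digest authentication fails even with the correct password, which is worth checking when a client keeps prompting for credentials.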

GrahamDumpleton commented 5 years ago

Enabling sendfile() support is problematic if there is no way to disable it. This is because sendfile() doesn't always work properly, especially on network-mounted file systems (as you would have via a PVC). Apache httpd therefore switched from having it enabled by default to having it disabled, and recommends enabling it only when the file system and platform are known to support it.

The ability to browse directories is only useful when accessing the WebDAV server read-only via a normal web browser. It isn't needed for a proper WebDAV client to work.

Support for both could be added, but I believe they should be optionally enabled.

juliusvonkohout commented 5 years ago

I recommend giving read-only access via web browser by default, so that as a user you can easily see that it is working by accessing notebook/webdav. Furthermore, I would add to the documentation that you can use cadaver on the same URL.

Furthermore, I am curious whether there is any chance that this will work via JupyterHub OAuth?

This is the configuration for web browser access.

<IfModule !dav_module>
LoadModule dav_module '${MOD_WSGI_MODULES_DIRECTORY}/mod_dav.so'
</IfModule>

<IfModule !dav_fs_module>
LoadModule dav_fs_module '${MOD_WSGI_MODULES_DIRECTORY}/mod_dav_fs.so'
</IfModule>

<IfModule !auth_digest_module>
LoadModule auth_digest_module '${MOD_WSGI_MODULES_DIRECTORY}/mod_auth_digest.so'
</IfModule>

<IfModule !authn_file_module>
LoadModule authn_file_module '${MOD_WSGI_MODULES_DIRECTORY}/mod_authn_file.so'
</IfModule>

<IfModule !authz_user_module>
LoadModule authz_user_module '${MOD_WSGI_MODULES_DIRECTORY}/mod_authz_user.so'
</IfModule>

<IfModule !autoindex_module>
LoadModule autoindex_module '${MOD_WSGI_MODULES_DIRECTORY}/mod_autoindex.so'
</IfModule>

AddDefaultCharset utf-8

DavLockDB /opt/app-root/DavLock

Alias /webdav/ /opt/app-root/src/

<Location /webdav/>
    DAV on
    DAVDepthInfinity On
    Options +Indexes
    IndexOptions FancyIndexing NameWidth=* Charset=UTF-8
    #workaround for buggy webdav implementations https://gitlab.gnome.org/GNOME/gvfs/issues/339
    DirectorySlash Off

    AuthType Digest
    AuthName ${WEBDAV_REALM}
    AuthDigestDomain /webdav/
    AuthDigestProvider file
    AuthUserFile ${WEBDAV_USERFILE}

    Require valid-user
</Location>

<IfModule dir_module>
    DirectoryIndex
</IfModule>

GrahamDumpleton commented 5 years ago

I presume you mean you recommend enabling directory browsing by default. It doesn't make sense to say give read-only access via web browser by default, because a normal HTTP web browser only ever has read-only access. Write access requires the client to understand the WebDAV protocol extensions over HTTP.
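
To make the distinction concrete: a browser navigating to a URL sends GET, whereas a WebDAV client uses additional methods that WebDAV defines on top of HTTP, such as PROPFIND and MKCOL. A minimal Python sketch follows; the helper names are made up for this example, and the requests are only constructed here, not sent.

```python
import urllib.request

# Methods that WebDAV (RFC 4918) adds on top of plain HTTP.
WEBDAV_METHODS = {"PROPFIND", "PROPPATCH", "MKCOL", "COPY", "MOVE", "LOCK", "UNLOCK"}


def is_webdav_extension(method):
    """True if the method is one of the WebDAV additions listed above."""
    return method.upper() in WEBDAV_METHODS


def build_request(method, url, data=None):
    """Build (but do not send) a request using an arbitrary HTTP method."""
    return urllib.request.Request(url, data=data, method=method)


# Roughly what a client's "mkdir test" and a delete of it would send.
make_dir = build_request("MKCOL", "http://localhost:8080/webdav/test/")
remove_dir = build_request("DELETE", "http://localhost:8080/webdav/test/")
```

Sending these would still require an opener that handles digest authentication, since the configuration in this thread uses `Require valid-user`.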

Anyway, am leaning towards having directory browsing on. As general best security practice you wouldn't, but after thinking about it some more, in this particular usage I can't see any harm in enabling it.

GrahamDumpleton commented 5 years ago

I have enabled directory browsing. I have not added a switch to enable sendfile() at this point, as I first need to add an easier way to enable it in mod_wsgi-express, which is being used to spin up Apache for WebDAV; that flag should be able to be enabled for WSGI applications as well. I have added the sendfile() flag to a separate TODO list.

The version of the image with WebDAV and directory browsing is 2.2.1. You can find WebDAV access covered in the workshop on deploying a Jupyter Notebook workspace to OpenShift at:

I am going to close this issue at this point, as I believe the main issues have been addressed.