cs3org / OCM-API

OpenCloudMesh API

Document current translation that happens for webdav #70

Closed michielbdejong closed 1 year ago

michielbdejong commented 1 year ago

I'm a bit embarrassed that I don't know this by heart but I'm just experimentally observing how to access a webdav share from say nc1.docker. I see my nc2.docker make requests like:

PROPFIND https://nc1.docker/remote.php/dav/files/einstein/asdf/qwer/asdf

with a 'requesttoken' => '2oCY+AwkE9BKYBynQflAS+G4RnQcU3yyHgkewQblqNw=:q6v2ikhhSZR+I1aTcL0kGNjgCxUrfBSHcH5Q8ESPza8=' header. I'm still figuring out whether this is what carries the sharedSecret and, if so, how it ends up as what looks like two base64-encoded strings.

I see that https://nc1.docker/ocm-provider advertises https://nc1.docker/remote.php/webdav/ as the webdav root, so I think that was then redirected to https://nc1.docker/remote.php/dav/.

I'll get to the bottom of this so we can add it to the spec.

michielbdejong commented 1 year ago

It happens in https://github.com/nextcloud/server/blob/2eab2ffa22451851601d68bd73e0285a8803990e/apps/files_sharing/lib/External/Storage.php#L88 with, for instance:

array (
  'secure' => true,
  'host' => 'nc1.docker',
  'root' => '/public.php/webdav/',
  'user' => 'AHTFwTVBsbQ69vd',
  'password' => '',
)

So the URL is https://nc1.docker/public.php/webdav/, with the token as the user and an empty password.

The /public.php/webdav/ comes from https://nc1.docker/ocm-provider ["resourceTypes"][0]["protocols"]["webdav"]. Or at least that's what I think /should/ happen for OCM; looking at the code I think it's actually querying https://nc1.docker/ocs-provider and then taking ["services"]["FEDERATED_SHARING"]["endpoints"]["webdav"].
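
To make the "token as the user, with empty password" observation concrete, here is a minimal Python sketch of how such a request could be prepared with HTTP Basic authentication. The helper names (`basic_auth_header`, `build_propfind`) are hypothetical and only illustrate the idea; this is not the Nextcloud code, and the request is built but never sent.

```python
import base64
import urllib.request


def basic_auth_header(token: str, password: str = "") -> str:
    """Build a Basic auth header value with the share token as the user name."""
    credentials = f"{token}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def build_propfind(host: str, webdav_root: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a PROPFIND request for the share root."""
    request = urllib.request.Request(f"https://{host}{webdav_root}", method="PROPFIND")
    request.add_header("Authorization", basic_auth_header(token))
    return request


# Values taken from the array quoted in this comment.
request = build_propfind("nc1.docker", "/public.php/webdav/", "AHTFwTVBsbQ69vd")
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen`, assuming the host is reachable and its certificate is trusted.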

michielbdejong commented 1 year ago

From nc2.docker:


curl -i -X PROPFIND https://AHTFwTVBsbQ69vd:@nc1.docker/public.php/webdav/
HTTP/1.1 207 Multi-Status
Date: Mon, 08 May 2023 15:10:11 GMT
Server: Apache/2.4.52 (Ubuntu)
Set-Cookie: oc85jp4ztn2f=06aglmmfudk1qpq8odocon7q8j; path=/; secure; HttpOnly; SameSite=Lax
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate
Pragma: no-cache
Set-Cookie: oc_sessionPassphrase=p%2F%2Bgvd4KcZT4ZY1k9jRjX3VKZhvPindy5Xtzxm5Hb6J10B6ccXxQLrPbR3hNOBr8wvIpPjRNuEttnsTo7ON8A%2FOC%2F%2Fw8qbVHNr9uHx%2BXB5bReEhmCu%2FvQqG8%2BQY2EknZ; path=/; secure; HttpOnly; SameSite=Lax
Set-Cookie: oc85jp4ztn2f=pjppc4pmsme68fdhotlnm0jv6t; path=/; secure; HttpOnly; SameSite=Lax
Set-Cookie: oc85jp4ztn2f=pjppc4pmsme68fdhotlnm0jv6t; path=/; secure; HttpOnly; SameSite=Lax
Content-Security-Policy: default-src 'self'; script-src 'self' 'nonce-S1Azd0x6YVF1TFhBMVYrMWpjRUNIdW5MLytSYVJBbTZOaC9hSTlXNFJWWT06ZjVQQ2ZHVDQrZmlLdVJQQXlvVkdMWjJNMEtVUGZFajllRSswYmY3N05Ebz0='; style-src 'self' 'unsafe-inline'; frame-src *; img-src * data: blob:; font-src 'self' data:; media-src *; connect-src *; object-src 'none'; base-uri 'self';
Referrer-Policy: no-referrer
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Permitted-Cross-Domain-Policies: none
X-Robots-Tag: noindex, nofollow
X-XSS-Protection: 1; mode=block
Set-Cookie: __Host-nc_sameSiteCookielax=true; path=/; httponly;secure; expires=Fri, 31-Dec-2100 23:59:59 GMT; SameSite=lax
Set-Cookie: __Host-nc_sameSiteCookiestrict=true; path=/; httponly;secure; expires=Fri, 31-Dec-2100 23:59:59 GMT; SameSite=strict
Set-Cookie: oc85jp4ztn2f=pjppc4pmsme68fdhotlnm0jv6t; path=/; secure; HttpOnly; SameSite=Lax
Vary: Brief,Prefer
DAV: 1, 3, extended-mkcol, nextcloud-checksum-update
X-Request-Id: ZDN0DNrNYHGhXUqnTors
Content-Length: 569
Content-Type: application/xml; charset=utf-8

<?xml version="1.0"?>
<d:multistatus xmlns:d="DAV:" xmlns:s="http://sabredav.org/ns" xmlns:oc="http://owncloud.org/ns" xmlns:nc="http://nextcloud.org/ns"><d:response><d:href>/public.php/webdav/</d:href><d:propstat><d:prop><d:getlastmodified>Mon, 08 May 2023 14:16:34 GMT</d:getlastmodified><d:resourcetype><d:collection/></d:resourcetype><d:quota-used-bytes>0</d:quota-used-bytes><d:quota-available-bytes>-3</d:quota-available-bytes><d:getetag>&quot;6459044291249&quot;</d:getetag></d:prop><d:status>HTTP/1.1 200 OK</d:status></d:propstat></d:response></d:multistatus>
michielbdejong commented 1 year ago

So it's:

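export HOST=nc1.docker
# Named WEBDAV_PATH rather than PATH: overwriting the shell's PATH would stop curl from being found.
export WEBDAV_PATH=$(curl -s "https://$HOST/ocm-provider/" | jq -r '.resourceTypes[0].protocols.webdav')
export TOKEN=AHTFwTVBsbQ69vd
# The share token is the Basic auth user; the password is empty.
curl -i -X PROPFIND "https://$TOKEN:@$HOST$WEBDAV_PATH"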
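
The same lookup can be sketched in Python, for readers who want to script it without jq. The sample document below is a hypothetical, trimmed-down /ocm-provider response used only for illustration, and the function names are made up for this sketch.

```python
import json


def webdav_root_from_ocm_provider(document: dict) -> str:
    """Return resourceTypes[0].protocols.webdav, like the jq filter above."""
    return document["resourceTypes"][0]["protocols"]["webdav"]


def share_url(host: str, webdav_root: str) -> str:
    """Join host and advertised path; the path is expected to start with '/'."""
    return f"https://{host}{webdav_root}"


# Hypothetical sample, not a captured response.
sample = json.loads(
    '{"resourceTypes": [{"name": "file", "protocols": {"webdav": "/public.php/webdav/"}}]}'
)
root = webdav_root_from_ocm_provider(sample)
print(share_url("nc1.docker", root))
```
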
glpatcern commented 1 year ago

Thanks Michiel for this input. A few more things to clarify:

1) This bash-implemented sequence is the "algorithm" used by Nextcloud to access a remote share (right?). Is ownCloud using the same algorithm? Or in other words if Reva serves shares using those paths then we expect to be able to serve shares to the whole ScienceMesh?

2) What about sharing a folder and accessing a file in the subtree of the share? From the path above, it seems you get only the top level, and nowhere do I see a filename or a file path.

3) To access a Nextcloud server, I see that an access to /ocs-provider might be made, as Nextcloud defined those "OCS" services, but in your last example you did access /ocm-provider. Do you think we can safely ignore the /ocs-provider part?

michielbdejong commented 1 year ago
  1. Yes
  2. Yeah, I think you could then drill down into the share by appending a relative path at the end; let me test that
  3. I think so, but let me test that too to be sure
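
For point 2, a small sketch of what "appending a relative path" could look like. This assumes the drill-down works the way suggested above, which had not yet been tested at this point in the thread; `resource_url` is a hypothetical helper, not part of any OCM implementation.

```python
from urllib.parse import quote


def resource_url(host: str, webdav_root: str, relative_path: str = "") -> str:
    """Append a path inside the share to the advertised webdav root."""
    relative = relative_path.strip("/")
    segments = [part for part in (webdav_root.strip("/"), relative) if part]
    path = "/" + "/".join(segments)
    if not relative:
        # Keep the trailing slash when addressing the share root itself.
        path = path.rstrip("/") + "/"
    # quote() leaves '/' intact and percent-encodes characters such as spaces.
    return f"https://{host}{quote(path)}"


print(resource_url("nc1.docker", "/public.php/webdav/", "qwer/my file.txt"))
```
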
glpatcern commented 1 year ago

In the end, we learned "the hard way" that both OC and NC do query the /ocs-provider endpoint, on top of the /ocm-provider one.

OCS stands for Open Collaboration Services, a standard that predates OCM.

> looking at the code I think it's actually querying https://nc1.docker/ocs-provider and then taking ["services"]["FEDERATED_SHARING"]["endpoints"]["webdav"]

An example of a full payload from NC:

{
  "version": 2,
  "services": {
    "PRIVATE_DATA": {
      "version": 1,
      "endpoints": {
        "store": "/ocs/v2.php/privatedata/setattribute",
        "read": "/ocs/v2.php/privatedata/getattribute",
        "delete": "/ocs/v2.php/privatedata/deleteattribute"
      }
    },
    "SHARING": {
      "version": 1,
      "endpoints": {
        "share": "/ocs/v2.php/apps/files_sharing/api/v1/shares"
      }
    },
    "FEDERATED_SHARING": {
      "version": 1,
      "endpoints": {
        "share": "/ocs/v2.php/cloud/shares",
        "webdav": "/public.php/webdav/",
        "shared-secret": "/ocs/v2.php/cloud/shared-secret",
        "system-address-book": "/remote.php/dav/addressbooks/system/system/system",
        "carddav-user": "system"
      }
    },
    "PROVISIONING": {
      "version": 1,
      "endpoints": {
        "user": "/ocs/v2.php/cloud/users",
        "groups": "/ocs/v2.php/cloud/groups",
        "apps": "/ocs/v2.php/cloud/apps"
      }
    }
  }
}
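
As an illustration of the lookup quoted above, the following Python sketch pulls the webdav endpoint out of an /ocs-provider payload shaped like this one. The function name is hypothetical and the payload is a trimmed copy of the example, kept to the FEDERATED_SHARING service.

```python
from typing import Optional


def federated_webdav_endpoint(ocs_provider: dict, default: Optional[str] = None) -> Optional[str]:
    """Return services.FEDERATED_SHARING.endpoints.webdav, or the default if absent."""
    services = ocs_provider.get("services", {})
    endpoints = services.get("FEDERATED_SHARING", {}).get("endpoints", {})
    return endpoints.get("webdav", default)


# Trimmed copy of the payload shown above.
payload = {
    "version": 2,
    "services": {
        "FEDERATED_SHARING": {
            "version": 1,
            "endpoints": {
                "share": "/ocs/v2.php/cloud/shares",
                "webdav": "/public.php/webdav/",
                "shared-secret": "/ocs/v2.php/cloud/shared-secret",
            },
        }
    },
}
print(federated_webdav_endpoint(payload))
```
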
glpatcern commented 1 year ago

Keeping the issue open for now until we have incorporated those findings

glpatcern commented 1 year ago

An update following our latest tests and findings from @mirekys.

It appears that neither OC10 nor NC respects the above algorithm; both include bits of hardcoded paths, e.g. here for OC and here for NC. Edit: the NC link is the code behind NC's own /ocm-provider and has nothing to do with how NC accesses remote shares.

Indeed, a Nextcloud server appears to make the following calls (where /public.php is hardcoded and does not come from the /ocm-provider response):

172.18.0.9 - - [08/Jun/2023:09:36:07 +0000] "GET /ocm-provider/ HTTP/1.1" 200 2815 "-" "Nextcloud Server Crawler"
172.18.0.9 - - [08/Jun/2023:09:36:07 +0000] "POST /index.php/apps/federatedfilesharing/notifications HTTP/1.1" 201 2629 "-" "Nextcloud Server Crawler"
172.18.0.9 - - [08/Jun/2023:09:36:07 +0000] "GET /ocs-provider/ HTTP/1.1" 200 3193 "-" "Nextcloud Server Crawler"
172.18.0.9 - - [08/Jun/2023:09:36:07 +0000] "PROPFIND /public.php/webdav/ HTTP/1.1" 401 2973 "-" "sabre-dav/4.4.0 (http://sabre.io/)"
172.18.0.9 - <sharedSecret> [08/Jun/2023:09:36:07 +0000] "PROPFIND /public.php/webdav/ HTTP/1.1" 207 1829 "-" "sabre-dav/4.4.0 (http://sabre.io/)"
172.18.0.9 - - [08/Jun/2023:09:36:54 +0000] "GET /ocs-provider/ HTTP/1.1" 200 3189 "-" "Nextcloud Server Crawler"

An ownCloud 10 server appears to make the following calls:

172.18.0.12 - - [08/Jun/2023:09:33:16 +0000] "GET /ocm-provider/ HTTP/1.1" 200 3445 "-" "ownCloud Server Crawler"
172.18.0.12 - <sharedSecret> [08/Jun/2023:09:33:17 +0000] "PROPFIND /public.php/webdav/ HTTP/1.1" 207 4086 "-" "sabre-dav/4.4.0 (http://sabre.io/)"
172.18.0.12 - - [08/Jun/2023:09:33:17 +0000] "GET /ocm-provider/ HTTP/1.1" 200 3451 "-" "ownCloud Server Crawler"
172.18.0.12 - <sharedSecret> [08/Jun/2023:09:33:17 +0000] "PROPFIND /public.php/webdav/ HTTP/1.1" 207 4088 "-" "sabre-dav/4.4.0 (http://sabre.io/)"
172.18.0.12 - - [08/Jun/2023:09:33:17 +0000] "GET /ocm-provider/ HTTP/1.1" 200 3453 "-" "ownCloud Server Crawler"
172.18.0.12 - - [08/Jun/2023:09:33:17 +0000] "POST /index.php/ocm/notifications HTTP/1.1" 201 3174 "-" "ownCloud Server Crawler"
mirekys commented 1 year ago

For OC, I've tried to track down webdav endpoint discovery paths here: https://codimd.web.cern.ch/tAkryD79RYiq4QS3-y_Rtw.

Those hardcoded webdav protocol paths are basically used in OC/NC's responses to the /ocm-provider/ discovery request.

The other way around, when a remote federated share is added, at least for OC, it queries the share's remote host for the webdav endpoint path and stores the result in ownCloud's in-memory cache (if enabled and configured).

That flow also uses /public.php/webdav as a hardcoded default, but that should really be overridden by whatever comes from $remote's /ocm-provider/ response under:

$endpointUrl = response['services']['FEDERATED_SHARING']['endpoints']['webdav'];

(unless 'PHPUNIT_RUN' env variable is set)

Is there anything else specific that we should try to track down and focus on? The whole federatedfilesharing app seems to be a lot to make sense of.

glpatcern commented 1 year ago

> that should really be overridden by whatever comes from $remote's /ocm-provider/ response under:

Just to be sure: did you intend /ocs-provider or really /ocm-provider (s vs m)?

In both cases, we tried to expose the path correctly but OC10 used a malformed URL:

138.68.152.182 - - [07/Jun/2023:17:11:25 +0200] "GET /ocm-provider/ HTTP/1.1" 200 481 "-" "ownCloud Server Crawler" "-"
138.68.152.182 - <sharedSecret> [07/Jun/2023:17:11:25 +0200] "PROPFIND /https://sm1.cernbox.cern.ch/remote.php/dav/ocm/ HTTP/2.0" 405 157 "-" "sabre-dav/4.4.0 (http://sabre.io/)" "-"

So at least OC10 is not hardcoding /public.php/webdav (at variance with NC), yet it fails on that /https://....

> Is there anything else specific that we should try to track down and focus on? The whole federatedfilesharing app seems to be a lot to make sense of.

I guess @michielbdejong and/or Mahdi are looking at a full reverse engineering of the code path used to access a remote share. This is possibly easier with OC10 (where we need to understand where that extra / comes from) than with NC (where there's more hardcoded stuff and the /index.php/apps/federatedfilesharing/notifications endpoint is also involved).

mirekys commented 1 year ago

@glpatcern

> Just to be sure: did you intend /ocs-provider or really /ocm-provider (s vs m)?

I meant that it first tries /ocm-provider/ and, if nothing is found there, it goes for /ocs-provider (see here).
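
The lookup order described here can be modelled roughly as follows. This is a sketch under assumptions, not the ownCloud code: it assumes the OCM document is read at `resourceTypes[0].protocols.webdav`, the OCS document at `services.FEDERATED_SHARING.endpoints.webdav`, and that /public.php/webdav is used when neither yields a value. All function names are hypothetical.

```python
from typing import Optional

DEFAULT_WEBDAV_PATH = "/public.php/webdav"  # the hardcoded default mentioned in the thread


def from_ocm(document: Optional[dict]) -> Optional[str]:
    """Assumed OCM lookup: resourceTypes[0].protocols.webdav."""
    try:
        return document["resourceTypes"][0]["protocols"]["webdav"]
    except (KeyError, IndexError, TypeError):
        return None


def from_ocs(document: Optional[dict]) -> Optional[str]:
    """Assumed OCS lookup: services.FEDERATED_SHARING.endpoints.webdav."""
    try:
        return document["services"]["FEDERATED_SHARING"]["endpoints"]["webdav"]
    except (KeyError, TypeError):
        return None


def discover_webdav_path(ocm_doc: Optional[dict], ocs_doc: Optional[dict]) -> str:
    """Try /ocm-provider/ first, then /ocs-provider, then the default."""
    return from_ocm(ocm_doc) or from_ocs(ocs_doc) or DEFAULT_WEBDAV_PATH


ocm = {"resourceTypes": [{"protocols": {"webdav": "/remote.php/dav/ocm"}}]}
ocs = {"services": {"FEDERATED_SHARING": {"endpoints": {"webdav": "/public.php/webdav/"}}}}
print(discover_webdav_path(ocm, ocs))
```

Passing `None` for a document stands in for a discovery request that failed or returned nothing usable.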

> In both cases, we tried to expose the path correctly but OC10 used a malformed URL:
>
> 138.68.152.182 - - [07/Jun/2023:17:11:25 +0200] "GET /ocm-provider/ HTTP/1.1" 200 481 "-" "ownCloud Server Crawler" "-"
> 138.68.152.182 - <sharedSecret> [07/Jun/2023:17:11:25 +0200] "PROPFIND /https://sm1.cernbox.cern.ch/remote.php/dav/ocm/ HTTP/2.0" 405 157 "-" "sabre-dav/4.4.0 (http://sabre.io/)" "-"
>
> So at least OC10 is not hardcoding /public.php/webdav (at variance with NC), yet it fails on that /https://....

I think I can see what the problem is here: it expects the webdav protocol field to contain just a path, not a full URL. In your case, you should try exposing it like:

{
   "enabled": true,
   "apiVersion": "1.1.0",
   "endPoint": "https://sm1.cernbox.cern.ch/ocm",
   "provider": "CERNBox",
   "resourceTypes": [
      {
         "name": "file",
         "shareTypes": [
            "user"
         ],
         "protocols": {
            "webapp": "/external/sciencemesh",
            "webdav": "/remote.php/dav/ocm"
         }
      }
   ],
   "capabilities": [
      "/invite-accepted"
   ]
}

The additional / comes from here: that code canonicalizes the discovered webdav path into an absolute path that always starts and ends with /.
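
A rough model of that canonicalization shows why a full URL in the webdav field ends up as the /https://... request seen in the log. This is an illustrative Python sketch of the behaviour as described, not the actual ownCloud code, and `canonicalize_webdav_path` is a hypothetical name.

```python
def canonicalize_webdav_path(discovered: str) -> str:
    """Force the discovered value into a path that starts and ends with '/'."""
    stripped = discovered.strip("/")
    return f"/{stripped}/" if stripped else "/"


# A bare path is normalized as intended.
print(canonicalize_webdav_path("/remote.php/dav/ocm"))
# A full URL is treated as a path too, which yields the malformed request target.
print(canonicalize_webdav_path("https://sm1.cernbox.cern.ch/remote.php/dav/ocm/"))
```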

glpatcern commented 1 year ago

> I think I can see what the problem is here: it expects the webdav protocol field to contain just a path, not a full URL.

That's what we thought too, and I did change /ocm-provider to expose just the path (by hacking the nginx in front of Reva). I will change Reva for good, but the malformed request remains unexplained.

glpatcern commented 1 year ago

Update: the nginx hack obviously confused OC. Now, with the patched /ocm-provider and /ocs-provider, we do receive proper calls from OC and NC, with no hardcoded public.php path.