nextcloud / server

☁️ Nextcloud server, a safe home for all your data
https://nextcloud.com
GNU Affero General Public License v3.0
27.36k stars 4.06k forks source link

[Bug]: accessing OCM federated shares does not seem to be compliant to OCM specs #39001

Closed glpatcern closed 1 year ago

glpatcern commented 1 year ago

⚠️ This issue respects the following points: ⚠️

Bug description

Hello there,

This is to report our current findings as we're trying to serve OCM federated shares to NC 26. Most likely because of legacy implementations and technical debt, what we see is that NC offers shares to external parties following the spec, but fails to do so when accessing third party shares, as e.g. it pretends the share to be accepted (but in OCM, notifications are optional) and it checks to some extent the /ocs-provider endpoint on top of the expected /ocm-provider one.

A more detailed description is available at https://github.com/cs3org/OCM-API/pull/76#issuecomment-1596852793, and the full debugging sessions with network traces are available at https://github.com/pondersource/nc-sciencemesh/issues/373

To a minimum, it would be great to have some documentation about how the access to an OCM share is implemented, rather than keeping reverse engineering the code. In particular, what is the role of /ocs-provider? And in which case does Nextcloud fall back to querying the remote end for share info by issuing a POST /index.php/apps/files_sharing/shareinfo, which is a Nextcloud-specific endpoint and not an OCM one?

Of course, having an idea when the OCM implementation can be looked at would help: I'm happy to see some recent interest by @provokateurin to look into OCM 1.1 (#38886).

Also, feel free to contact me for further discussions around OCM and its evolution: @schiessle and lately @smesterheide have historically been the NC point of contact, but this does not seem to apply any longer.

cc @karlitschek for identifying relevant contacts within Nextcloud.

Steps to reproduce

  1. Create a federated (OCM) share with OC 10
  2. Trace the HTTP access requests on the OC10 side with Nextcloud Server Crawler as user-agent, and attempt to accept and access the share

Expected behavior

The expected behavior to access an OCM share has been (recently) documented in https://github.com/cs3org/OCM-API#share-access

This flow did not change since OCM was established and v1.0 was tagged by @schiessle.

Installation method

Community Docker image

Nextcloud Server version

26

Operating system

Debian/Ubuntu

PHP engine version

PHP 8.2

Web server

Apache (supported)

Database engine version

MySQL

Is this bug present after an update or on a fresh install?

Fresh Nextcloud Server install

Are you using the Nextcloud Server Encryption module?

None

What user-backends are you using?

Configuration report

{
    "system": {
        "allow_local_remote_servers": true,
        "passwordsalt": "***REMOVED SENSITIVE VALUE***",
        "secret": "***REMOVED SENSITIVE VALUE***",
        "trusted_domains": [
            "localhost",
            "nc1.docker",
            "nc2.docker",
            "cloud.pondersource.org"
        ],
        "datadirectory": "***REMOVED SENSITIVE VALUE***",
        "dbtype": "mysql",
        "version": "26.0.1.1",
        "overwrite.cli.url": "http:\/\/localhost",
        "dbname": "***REMOVED SENSITIVE VALUE***",
        "dbhost": "***REMOVED SENSITIVE VALUE***",
        "dbport": "",
        "dbtableprefix": "oc_",
        "mysql.utf8mb4": true,
        "dbuser": "***REMOVED SENSITIVE VALUE***",
        "dbpassword": "***REMOVED SENSITIVE VALUE***",
        "updater.release.channel": "git",
        "installed": true,
        "instanceid": "***REMOVED SENSITIVE VALUE***"
    }
}

List of activated Apps

Enabled:
  - cloud_federation_api: 1.9.0
  - comments: 1.16.0
  - contactsinteraction: 1.7.0
  - dashboard: 7.6.0
  - dav: 1.25.0
  - federatedfilesharing: 1.16.0
  - federation: 1.16.0
  - files: 1.21.1
  - files_sharing: 1.18.0
  - files_trashbin: 1.16.0
  - files_versions: 1.19.1
  - lookup_server_connector: 1.14.0
  - oauth2: 1.14.0
  - provisioning_api: 1.16.0
  - sciencemesh: 0.4.2
  - settings: 1.8.0
  - sharebymail: 1.16.0
  - systemtags: 1.16.0
  - theming: 2.1.1
  - twofactor_backupcodes: 1.15.0
  - updatenotification: 1.16.0
  - user_status: 1.6.0
  - weather_status: 1.6.0
  - workflowengine: 2.8.0
Disabled:
  - admin_audit: 1.16.0
  - encryption: 2.14.0
  - files_external: 1.18.0
  - testing: 1.16.0
  - user_ldap: 1.16.0

Nextcloud Signing status

N/A

Nextcloud Logs

See description

Additional info

No response

provokateurin commented 1 year ago

Hi, I've been documenting our OCM implementation in https://github.com/nextcloud/server/blob/master/apps/cloud_federation_api/openapi.json (ignore the request parameters, they work in the json body as well). I'm pretty sure it is relatively compliant, but there are definitely some quirks.

As you noticed I already created an issue for implementing the latest OCM version. We deemed the 1.1 version not that relevant right now, but I can surely raise this issue to my manager. I could fix the non-compliant parts of the 1.0 implementation and add the 1.1 implementation on top.

glpatcern commented 1 year ago

Great, thanks for the super fast reaction!

Indeed, for now the interest is to leverage OCM 1.0 in the most compliant possible form. The extra features of v1.1 are a second priority, as we have an implementation in https://github.com/cs3org/reva.

glpatcern commented 1 year ago

Looking more in details, https://github.com/nextcloud/server/blob/master/apps/cloud_federation_api/openapi.json describes the behavior of NC as an OCM "provider" or server, and it's pretty much OK.

The issue here is the behavior of NC as client of a remote OCM share. This is not documented there, and it's where the system was found to be not compliant. @provokateurin any further documentation we could look at?

glpatcern commented 1 year ago

Just to focus, I would need at least a reply to those two questions reported above:

provokateurin commented 1 year ago

I don't have the answers for your questions, but I can investigate the matter after discussion with my manager. I only sent the spec to let you know that I've been working on this and know a bit about it (not much about the history of our implementation though).

glpatcern commented 1 year ago

Hi there, gentle ping on this matter. Also, please get in contact as we are running (with @smesterheide and others) some regular telcos about OCM and its evolution, bi-weekly on Thursdays at 2pm - next is on July 6th. It would be good to have you @provokateurin as well.

provokateurin commented 1 year ago

Hello @glpatcern, thank you so much for outlining these issues and for the research you have put into this. Would you be up for doing some PR's on this? We would very much welcome PR's on this topic and I will review it for you. I noticed you are also a customer and you can also consider to discuss it with your account manager if you want this to be considered for our teams roadmap.

glpatcern commented 1 year ago

Hello @provokateurin, I'm not sure how you determined that we are also a customer. At CERN, we develop and run CERNBox, which in a way is a variant of ownCloud but based on a completely different backend since several years. As such, I don't have the required knowledge of the Nextcloud backend nor can I allocate time to it, to be able to identify and fix the OCM compatibility issues when Nextcloud is client to other EFSSs.

The interest in having OCM federated sharing working - and in a fully compliant way - across different EFSSs arises because of the scientific collaborations we have with other institutions, some of them running Nextcloud. And it turns out that they have an issue when accessing remote shares, except if the remote share is offered by ownCloud 10 (which hints at some remnant compatibility code that makes OC and NC apparently interoperable, despite they're not following the OCM standard).

Once again I invite you and/or @karlitschek to identify a contact in Nextcloud that is interested in following up with the OCM specs, and contact me privately (my email is linked to my GitHub profile, anyway it's lopresti@cern.ch) so I can send you the details of the regular telcos we are running.

schiessle commented 1 year ago

@glpatcern quick answer to the ocm-provider end-point. This was discussed and agreed on when the first version of the API was drafted. I did a quick search in the OCM github tracker and could only find this one:

https://github.com/cs3org/OCM-API/issues/18

Probably the rest of the discussion happened elsewhere (mails, in person,...)

The reasons for this endpoint is that you can't force one URL to work for everyone. For example the URLs from Nextcloud depending on the individual setup contains a "remote.php" or not. If I remember correctly, other solutions which planed to implement OCM back than had similar concerns. That's why we agreed to a discovery end-point, whis is "ocm-providers"

This was tested and confirmed across all implementation during the release phase of OCM 1.0. So for me it looks like it was broken by someone else who removed the end-point and/or the request to it?

EDIT:

I just see that the discovery end-point is also documented here: https://cs3org.github.io/OCM-API/docs.html?branch=develop&repo=OCM-API&user=cs3org#/paths/~1ocm-provider/get

and explained in the README: https://github.com/cs3org/OCM-API#discovery

schiessle commented 1 year ago

I just see that the confusion comes from the "ocs-provider", right? There I have to hand over again to my colleagues from Engineering about the differences, if this end-point is needed for federated shares and if/how they could me merged.

karlitschek commented 1 year ago

So OCM APIs and the ocm-provider should focus only on the federated sharing usecase. We should follow the official spec 100% here to be 100% compatible with other implementations. OCS and ocs-provider are completely different APIs for different usecases that have nothing to do with federated sharing. It would be a bug if OCM calls somehow call or interact with OCS or the other way around. As of now OCS are only available at Nextcloud. We will look into that.

glpatcern commented 1 year ago

@schiessle that's correct, the issue is with /ocs-provider, not the /ocm-provider that now made it to the standard - you found out the latest refinement. It seems that when accessing a remote share, a NC instance also relies on some responses from /ocs-provider at the remote site, which should not happen as @karlitschek pointed out too. The details are at https://github.com/cs3org/OCM-API/pull/76#issuecomment-1596852793

BTW you might have notifications turned off on the OCM-API repo, all conversations happened there and I have tagged you multiple times (lastly in https://github.com/cs3org/OCM-API/pull/71) following the CS3 conference in Barcelona...

AndyScherzinger commented 1 year ago

Thanks for the detailed description @glpatcern - @ArtificialOwl is taking care of fixing the matter.

ArtificialOwl commented 1 year ago

@glpatcern @AndyScherzinger @karlitschek

I had a first look at it, can you confirm the heading on this ?

Is there any tools, or known products using OCM v1.1, to test the API ?

Please note that the support of OCM v1.1 might require to be on both side to have 2 clouds communicating without problems.

glpatcern commented 1 year ago

Of course it's up to you to decide how far to go with OCM v1.1, but the scope of this issue is well within v1.0. The access protocol was documented following existing (Nextcloud's and ownCloud's) v1.0 implementations, and it is supposed to use /ocm-provider only as discovery method (not /ocs-provider nor /status.php).

Then, OCM v1.1 adds optional features and is intended as a backwards compatible version via capabilities (hence the unchanged major version). https://github.com/cs3org/reva is a complete OCM v1.1 implementation (but currently not fully backwards compatible, as it does not implement caps discovery).

A test suite has been developed for basic (1.0) interoperability tests: https://github.com/cs3org/ocm-test-suite

AndyScherzinger commented 1 year ago

@ArtificialOwl for a short reply to your question, the short term goal is to fix the issue reported here to remove the incompatible behavior regarding v1.0, nothing else.

Anything else is not in scope for this issue and would run through our planning process first.

ArtificialOwl commented 1 year ago

https://github.com/nextcloud/server/pull/39574