Open labkode opened 2 years ago
CERN and ownCloud jointly agreed on the following scheme:
It was returning an absolute path in reva master. This is changed in the edge branch. A stat, like on a filesystem, has no knowledge of the path, only of the file basename. We know and return the parent ID. The full path can be queried with GetPath().
Request:                                Response
res_id = nil                            res_id = {storage_id, opaque_id_c}
path = /top/a/b/c                       path = /top/a/b/c
                                        parent_id = {storage_id, opaque_id_b}
# path without a res_id cannot work
eos file info /a/b/c
------------------------------------------------------------------------------------
Request:                                Response
res_id = {storage_id, opaque_id_c}      res_id = {storage_id, opaque_id_c}
path = nil                              path = /top/a/b/c # edge is returning /c, needs to be fixed
                                        parent_id = {storage_id, opaque_id_b}
# this request is valid, it behaves like a filesystem stat. `/c` is only the basename of the res_id. We cannot expect a path to the parent in this case. To get the path, call `GetPath`
eos file info inode:opaque_id_c
------------------------------------------------------------------------------------
Request:                                Response
res_id = {storage_id, opaque_id_a}      res_id = {storage_id, opaque_id_c}
path = ./b/c                            path = /top/a/b/c # edge is returning c, needs to be fixed
                                        parent_id = {storage_id, opaque_id_b}
# same as above, c is the basename, we cannot expect that the path to the root is returned
path_a = eos file info inode:opaque_id_a
eos file info path_a/b/c
------------------------------------------------------------------------------------
Request:                                Response
res_id = {storage_id, opaque_id_c}      res_id = {storage_id, opaque_id_c}
path = .                                path = /top/a/b/c # edge is returning c, needs to be fixed
                                        parent_id = {storage_id, opaque_id_b}
# same as above
eos file info inode:opaque_id_c
------------------------------------------------------------------------------------
Request:                                Response # eosfs will always return HTTP 404 here
res_id = {storage_id, storage_id}       res_id = {storage_id, storage_id}
path = nil                              path = nil # the "name" of the space root node is not exposed
                                        parent_id = nil?
# path is not set when it is the root folder.
--
In the Request, nil, "." and "" paths SHALL be considered synonyms.
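The synonym rule for request paths could be applied once, at the edge of a handler, so the rest of the code only ever sees one spelling. A minimal sketch, assuming "." is chosen as the canonical form (the rule only says the three spellings are equivalent; the function name is hypothetical):

```python
from typing import Optional


def normalize_request_path(path: Optional[str]) -> str:
    """Collapse the synonymous 'no path' spellings (nil, "", ".") to "."."""
    # Assumption: "." is used as the canonical spelling; any of the three
    # would do, as long as the choice is applied consistently.
    if path is None or path in ("", "."):
        return "."
    return path


print(normalize_request_path(None), normalize_request_path(""), normalize_request_path("./b/c"))
```

Any other path, such as ./b/c, passes through unchanged.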
Firstly, examples A and G show a defective implementation that doesn't honour the CS3APIs. Paths are first-class citizens of the API and are expected to work.
We thought that we had a mutual agreement that spaces are first-class citizens and that all paths are relative to a space.
For example, given the tree /a/b/c/d, a client that works with paths can query /a, /a/b or /a/b/c.
A client that works with ids and that knows about d will need to query the parent of d, which is c, then the parent of c which is b and then the parent of b which is a. Performance will be a function of tree depth. A workaround you may suggest is to "remember" node information in the client, however this pushes complexity to any client using the API (desktop sync, mobile) and basically anyone willing to integrate with OCIS in an optimal way.
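The cost argument above can be made concrete with a toy model. This is only a sketch: the `Node` type, the `CountingClient` class and the opaque IDs are hypothetical stand-ins for a storage provider, not part of reva or the CS3APIs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    parent_id: Optional[str]  # None marks the top of the tree


# Hypothetical in-memory stand-in for a provider holding the tree /a/b/c/d.
NODES: Dict[str, Node] = {
    "id_a": Node("a", None),
    "id_b": Node("b", "id_a"),
    "id_c": Node("c", "id_b"),
    "id_d": Node("d", "id_c"),
}


class CountingClient:
    """Counts how many stat requests an ID-only client has to issue."""

    def __init__(self, nodes: Dict[str, Node]) -> None:
        self.nodes = nodes
        self.stat_calls = 0

    def stat(self, node_id: str) -> Node:
        self.stat_calls += 1
        return self.nodes[node_id]


def ancestors_by_id(client: CountingClient, node_id: str) -> List[str]:
    """Walk parent IDs upwards, one stat per level, nearest ancestor first."""
    names: List[str] = []
    current = client.stat(node_id)
    while current.parent_id is not None:
        current = client.stat(current.parent_id)
        names.append(current.name)
    return names


client = CountingClient(NODES)
print(ancestors_by_id(client, "id_d"), client.stat_calls)
```

In this model the number of stat calls equals the depth of the starting node, which is the "function of tree depth" behaviour described above.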
To illustrate this behaviour in a UNIX/POSIX environment, suppose we are inside the folder /var/tmp/data/log/2022/05/03/proxy.
Let us elaborate on this.
You know the ID of /var/tmp/data/log/2022/05/03/proxy.
You do a stat() with the ID; the response will give you the parentID and the stat information.
If you need the path, you do a getPath(), which will give you the root of the space and e.g. ./data/log/2022/05/03/proxy. So in this case /var/tmp/ is the space mountpoint.
Now, after two requests, you are able to do all kinds of operations within the whole depth of this tree. We never encounter cases where clients need to do the following:
cd ..
cd ..
cd ..
cd ..
cd ..
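The two-request flow described above (stat by ID, then getPath) can be sketched as follows. The helper names are hypothetical, and the relative path and mountpoint are the ones from the example; the point is only that, once the relative path is known, every ancestor can be addressed relative to the space root without further parent lookups.

```python
import posixpath
from typing import List

# Values taken from the example above; in a real client they would come
# from a getPath() response and from the client's own mount configuration.
SPACE_MOUNTPOINT = "/var/tmp"
RELATIVE_PATH = "./data/log/2022/05/03/proxy"


def local_path(mountpoint: str, relative: str) -> str:
    """Join the client-side mountpoint with a space-relative path."""
    return posixpath.normpath(posixpath.join(mountpoint, relative))


def ancestor_paths(relative: str) -> List[str]:
    """List the space-relative paths of all ancestors, nearest first."""
    parts = [p for p in relative.split("/") if p not in ("", ".")]
    result = []
    for i in range(len(parts) - 1, -1, -1):
        result.append("./" + "/".join(parts[:i]) if i else ".")
    return result


print(local_path(SPACE_MOUNTPOINT, RELATIVE_PATH))
print(ancestor_paths(RELATIVE_PATH))
```

Each entry returned by `ancestor_paths` could be sent as the path of a request relative to the space root, so no chain of `cd ..` style lookups is needed.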
@dragotin Please also add your thoughts, because you were always insisting on that level of indirection with the clients in mind.
I agree with what is written in the summary above. That is IMHO a good summary of what spaces are about. It is important to note that, from a client's POV, it is always necessary to query the list of available spaces first, to know the space IDs. That is needed because clients should be able to organize the spaces according to the abilities of the client's platform. For example, a desktop client will put the different types of spaces in different places than the web client.
With the spaces, there is no absolute file tree of all files any more by default.
@labkode what is the actual problem that you need to solve? Can we look into it together?
IMO there is no way around making clients smarter, aka teaching them how to discover spaces so they can work with a truly federated storage API. For that, the cleanest approach is to always require a relative CS3 Reference in requests. That being said, we obviously need backwards compatibility:
The ocdav service currently implements the space discovery. This should move to the sdk so other clients can reuse that logic easily and benefit from distributed spaces.
pathstorageprovider that does the space discovery and then acts as a client.
We could define levels for CS3 clients to indicate that a client
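A relative reference, as proposed above, pairs a resource ID with a path that is interpreted relative to that resource. The sketch below uses hypothetical `ResourceId` and `Reference` dataclasses that only mirror the shape of the request examples earlier in this thread; they are not the generated CS3 types, and the rule that relative paths start with "./" is an assumption taken from those examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceId:
    storage_id: str
    opaque_id: str


@dataclass(frozen=True)
class Reference:
    resource_id: ResourceId
    path: str


def relative_reference(root: ResourceId, path: str) -> Reference:
    """Build a reference whose path is relative to the given resource."""
    if path in ("", "."):
        return Reference(root, ".")
    if path.startswith("/"):
        # Assumption: absolute paths are rejected when a resource ID is set.
        raise ValueError("absolute path not allowed in a relative reference")
    if not path.startswith("./"):
        path = "./" + path
    return Reference(root, path)


space_root = ResourceId("storage_id", "opaque_id_a")
print(relative_reference(space_root, "b/c"))
```

A legacy, path-only client could be served by a layer that first discovers the space and then issues references of this form on the client's behalf.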
@dragotin
The problem we're trying to solve is to implement the expected behaviour of spaces on the edge branch for EOS, and I have to admit that we're struggling. We find contradicting behaviours, and the only up-to-date source of knowledge appears to be the codebase, where we have to reverse-engineer the ocisfs implementation and navigate through the code.
Would it be possible to express the semantics of how spaces should behave in the CS3APIs documentation? Currently the information is spread across the CS3APIs, the codebase, GH issues and ADRs.
@micbar Thank you for the explanation, that is indeed better than our initial assumptions, however we still have some doubts.
The first one is the mount point: there is no mention of it in the CS3APIs [1]. Is the mount point the "name" field in the space info?
The second one is the GetPath(SpaceID) operation: you can get the same path for different inputs, i.e., I can have a /photos folder under space A and also under space B, so from the response alone you don't know which space the path belongs to, and it is left to the client to link the path to the input provided.
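One way a client could cope with the ambiguity described above is to never store a returned path on its own, but always together with the space ID it asked about. A minimal sketch with hypothetical space and opaque IDs:

```python
from typing import Dict, Tuple

# A path is only meaningful together with the space it was requested for.
SpacePath = Tuple[str, str]


def qualify(space_id: str, path: str) -> SpacePath:
    """Pair a path from a GetPath-style response with the requested space."""
    return (space_id, path)


# Hypothetical client-side index: (space ID, path) -> opaque resource ID.
index: Dict[SpacePath, str] = {}
index[qualify("space_A", "/photos")] = "opaque_id_1"
index[qualify("space_B", "/photos")] = "opaque_id_2"

print(len(index))
```

Both /photos entries survive because the key includes the space, which is exactly the linking work the response leaves to the client.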
The third one is how can you go from a leaf node to the space root? How is this implemented in ocisfs, do you traverse all the parents until reaching a root node (means not having a parent)?
The more I think about it, the more I think that having a path field in responses is misleading for Spaces.
What do we break if we change the path field to basename and we enforce that it doesn't contain slashes (/)?
Is there any codebase that uses path internally? Virtual share folders?
A path would still be needed in requests to specify a relative path to the root space resource id, but not needed at all in the responses.
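The basename proposal above amounts to a simple invariant on responses. The sketch below shows what enforcing it could look like; the function names are hypothetical, and treating an empty name as valid (for the space root, whose name is not exposed) is an assumption.

```python
import posixpath


def validate_basename(name: str) -> str:
    """Reject any response name that contains a slash."""
    # Assumption: an empty name is allowed for the space root.
    if "/" in name:
        raise ValueError(f"basename must not contain '/': {name!r}")
    return name


def basename_from_legacy_path(path: str) -> str:
    """Derive the proposed basename from a full path as returned today."""
    return validate_basename(posixpath.basename(path))


print(basename_from_legacy_path("/top/a/b/c"))
```

Requests would be unaffected by such a check, since they would still carry a path relative to the root space resource ID.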
[1] https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.StorageSpace
@labkode thanks for the answer!
The problem we're trying to solve is to implement the expected behaviour of spaces on the edge branch for EOS, and I have to admit that we're struggling. We find contradicting behaviours, and the only up-to-date source of knowledge appears to be the codebase, where we have to reverse-engineer the ocisfs implementation and navigate through the code.
That is unfortunate; I offer any help possible. After the first beta next week, our resources will not be so stretched anymore.
Would it be possible to express the semantics of how spaces should behave in the CS3APIs documentation? Currently the information is spread across the CS3APIs, the codebase, GH issues and ADRs.
Yes! But especially after the discussion in this ticket, I have more and more the feeling that we are already 90% aligned and need to overcome the last 10% to get cernbox running and performant on the edge branch. That is still one of our most important missions.
To make things better, we will provide a proposal for the CS3 API changes in the form of a PR next week.
Another project from @wkloucek and me is the https://github.com/owncloud/cs3api-validator/ project, which is now also available as a docker image. We have it running on reva and ocis CI already. The API spec, documentation and this test suite can provide a good start for a better developer experience.
The more I think about it the more I think that having a path field in responses is misleading for Spaces. What do we break if we change the path field to basename and we enforce that it doesn't contain slashes (/)? Is there any codebase that uses path internally? virtual share folders? A path would still be needed in requests to specify a relative path to the root space resource id, but not needed at all in the responses.
Yes let us follow up on that. It is already on our agenda during the beta phase.
Needs documentation and tests about final outcome.
This writeup comes from a deep investigation performed by @ishank011 @gmgigi96 @glpatcern as part of our activity to run OCIS edge branch at CERN.
Examples on edge branch running ocisfs
4c510ada-c86b-4815-8820-42cdf82c3d51 -> / is the home space ID
254d0b60-20e4-4340-9eae-6fa9103ae7d7 -> /app-try
7321538e-15da-4352-8dd7-d59b3319e7ef -> /app-try/app-new-try.txt
"id": { "opaqueId": "4c510ada-c86b-4815-8820-42cdf82c3d51", "storageId": "4c510ada-c86b-4815-8820-42cdf82c3d51" }, "path": "/",
cs3.gateway.v1beta1.GatewayAPI@localhost:19000> call Stat