Open labkode opened 2 years ago
@dragotin @michaelstingl @TheOneRing I think this is pretty much it, please let me know any comments. Note that we need to define how the spaces will be reflected in the client configuration. I imagine there will be more details to store than just a sync folder pair (like space id, space type, etc.) and that needs to be added to the payload request (PAYREQ).
I'm against jsonifying the local settings. The local path could contain sensitive information the user does not want to share with a server, and I don't think any decisions should be made based on the local path. Same for the other settings: why are those needed in addition to the targetPath?
@TheOneRing can you propose a request payload and expected response with the required attributes that you think will be needed? Thanks!
I think all we need is something like this:
Request: POST /space-migration?username=gonzalhu
Body:
{
"version": "2.11",
"folders": [
"/",
"/Documents",
"/eos/a/Alice",
"/Shares/"
]
}
Response: 400
{
"error": "Migration failed due to"
}
Response: 200
{
"folders": {
"/": [
{
"space_id": "PERSONAL SPACE ID OF gonzalhu",
"path": "/"
}
],
"/Documents": [
{
"space_id": "PERSONAL SPACE ID OF gonzalhu",
"path": "/Documents"
}
],
"/eos/a/Alice": [
{
"space_id": "PERSONAL SPACE ID OF Alice",
"path": "/"
}
],
"/Share": [
{
"error": "Shares can't be migrated"
}
]
}
}
The client will then map the folder sync pairs to the spaces using the space id and the new relative paths.
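The mapping step could be sketched roughly as follows. This is a hypothetical Python illustration: the SyncPair shape and the apply_migration helper are assumptions, and only the folders / space_id / path / error field names come from the example response above.

```python
from dataclasses import dataclass

@dataclass
class SyncPair:
    # Hypothetical, simplified model of a client sync folder pair.
    local_path: str
    remote_path: str          # legacy server-relative path
    space_id: str = ""        # filled in by the migration
    space_path: str = ""      # new path relative to the space root

def apply_migration(pairs, response):
    """Rewrite each sync pair using the folders map from the server response."""
    migrated, failed = [], []
    for pair in pairs:
        entries = response["folders"].get(pair.remote_path, [])
        entry = entries[0] if entries else {"error": "no mapping returned"}
        if "error" in entry:
            # Entries carrying "error" (e.g. shares) cannot be migrated.
            failed.append((pair, entry["error"]))
        else:
            pair.space_id = entry["space_id"]
            pair.space_path = entry["path"]
            migrated.append(pair)
    return migrated, failed
```

A pair whose remote path maps to an error entry would be reported rather than rewritten, matching the "/Shares/" case in the example.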
@labkode @TheOneRing
@TheOneRing asked me for feedback, so, with the iOS client's internals in mind, here's - in broad strokes - what I think should work to convert an OC10 (spaces-incapable) account to a spaces-backed account:
1) new spaces-migration-status capability, returning migration state:
- unavailable: the account can't be migrated because it is already spaces-backed
- forbidden: the account may not be migrated
- possible: the account can be migrated to become spaces-backed
- in-progress: the migration of the account is in progress
- completed: the account has been migrated to spaces
2) new migration endpoints:
- /migration/spaces/initiate-migration
- /migration/spaces/status (in that case, capabilities would not need to be changed/extended - and the existing capability indicating drive support would be the signal for clients to check the migration status for accounts that weren't drives-enabled before)
- /migration/spaces/map that the client can use to restructure/migrate its local data. The map would map the legacy path to drive-id + path (essentially pretty much what @TheOneRing already suggested, minus the error message):
{
"folders": {
"/": [
{
"space_id": "PERSONAL SPACE ID OF gonzalhu",
"path": "/"
}
],
"/Documents": [
{
"space_id": "PERSONAL SPACE ID OF gonzalhu",
"path": "/Documents"
}
],
"/eos/a/Alice": [
{
"space_id": "PERSONAL SPACE ID OF Alice",
"path": "/"
}
],
"/vanished/share": [
{
"error": "This share could not be migrated and has been removed.",
"removed": true
}
]
}
}
I omitted the error message when replicating @TheOneRing's example because the endpoint would not take any parameters or configuration. Instead, it would only return the map to use when mapping legacy paths to their migrated driveID + path pairs.
And each client would then leverage that map to translate/migrate its data set and settings.
Regarding shares: the map would also include the legacy root paths of all shares, mapped to the drive ID + path pairs they have been migrated to. Where shares can't be migrated (and would need to be removed), the share's respective root path would appear in the map with an error and an indication that it has been removed.
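Consuming such a map could look roughly like this. This is a hypothetical Python sketch; only the space_id, path, error and removed keys come from the example above, the partition_map helper is an assumption.

```python
def partition_map(folders):
    """Split the legacy-path map into migrated targets and removed shares.

    Entries carrying "removed": true are shares that no longer exist and
    must be dropped locally; everything else maps a legacy path to a
    (space_id, path) pair.
    """
    mappings, removed = {}, []
    for legacy_path, entries in folders.items():
        for entry in entries:
            if entry.get("removed"):
                removed.append((legacy_path, entry.get("error", "")))
            elif "space_id" in entry:
                mappings[legacy_path] = (entry["space_id"], entry["path"])
    return mappings, removed
```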
@felix-schwarz:
We discussed that the sync client should be as dumb as possible and not perform any complex logic; therefore we push it to the server, and hence we need this API. The sync client needs to send its configuration for the server to understand the relationship between paths and spaces.
We want migration to be progressive: per user, per account. There is no other way to make a production migration transparent. We only allow our users to have one account configured at a time, so migration could even be per user; that depends on your requirements.
Our sync clients query the following endpoint: /cernbox/desktop/ocs/v1.php/cloud/capabilities?format=json
This endpoint is public and does not contain any user-specific behaviour, so I'm against adding capabilities based on username here. The previous approach is needed: using a static non-user dependant capability to trigger the migration logic on the client.
So, I'm pretty much in favour of taking @TheOneRing's proposal to keep it simple and just have this static capability.
@labkode Thanks for sharing the context and thoughts behind this.
There are still a few things that aren't clear to me, however:
1) The iOS client builds and maintains a database of the whole account, not just specific shares or folders. Assuming it would only send /
as folder path then, would it only get back info on the user's personal space in return - or also for shares located below it?
I.e. would it have to also identify all share roots in the account's folder tree and also send those shares along to get info on them?
2) How does migration work for the 2nd, 3rd, etc. client of the same user? Especially if the client software has a different configuration / structure?
3) If a client sends its configuration to the server to get a mapping table from old path
to new path + drive ID
back, what's the benefit (or technical/server-side requirement/background) that makes this preferable to the server returning the full mapping table for the account - and the client simply picking from it what applies to it?
What additional information besides the user name and the old dav url is needed? Why do you need the trusted certificates, whether vfs is used, or the window geometry on the server?
I've already spent days (felt like years) trying to figure out how customers managed to break the owncloud.cfg by applying clever deployment tricks. The owncloud.cfg is not to be touched by any external process.
@felix-schwarz
- The iOS client builds and maintains a database of the whole account, not just specific shares or folders. Assuming it would only send / as folder path then, would it only get back info on the user's personal space in return - or also for shares located below it?
Shares will be exposed under a new endpoint outside of the current personal folder, i.e. they won't be mounted inside a personal home space anymore. The current remote path for default installations is / and that will map to a personal space. Once the migration happens for the sync client's remote folders, the sync client can then query the space discovery endpoint to discover other spaces that were not available before, like shares.
How does migration work for the 2nd, 3rd, etc. client of the same user? Especially if the client software has a different configuration / structure?
Right. We cannot have state on the server per user, and having state per user-client is even more difficult. I think we need to assume that when the static capability is enabled, all clients will try to perform a migration if their local state has not yet been migrated. Once the sysadmin decides that the migration is over, the capability is retired and sync clients will no longer need to perform the migration logic. Clients that missed the update will simply stop working (the sysadmin will disable old webdav paths, for example).
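The per-client decision described above can be reduced to a tiny rule: with no per-user or per-client state on the server, each client migrates itself exactly once, keyed off the static capability and a local "already migrated" flag. A hypothetical sketch:

```python
def should_migrate(capability_enabled: bool, local_state_migrated: bool) -> bool:
    """Decide whether this particular client should run the migration now."""
    # Capability retired by the sysadmin -> no client attempts migration.
    if not capability_enabled:
        return False
    # Local state already migrated -> nothing to do for this client.
    return not local_state_migrated
```

This makes the 2nd, 3rd, etc. client of the same user unproblematic: each one carries its own local flag and migrates independently while the capability is on.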
If a client sends its configuration to the server to get a mapping table from old path to new path + drive ID back, what's the benefit (or technical/server-side requirement/background) that makes this preferable to the server returning the full mapping table for the account - and the client simply picking from it what applies to it?
Because the server does not know what remote folders the user is querying.
In our deployment, a user will usually connect to a remote named /home, but can also connect to a remote named /home/MySubFolder. The server cannot simply create a map of arbitrary remote sync folder pairs. However, the server can understand the remote folder configured in the client and return the appropriate space id.
@TheOneRing @felix-schwarz any news on this?
Ok, let me try to summarize this, and make it actionable:
The client sends parts of the old configuration to an endpoint on the server side to get knowledge about the space ID and a path component, if applicable. If that call succeeds, the client will be able to compute whether it can re-use already synced folders by looking up the content of the me/drives/ endpoint and comparing the space ID and path.
For site administrators, it is a way to stay in control of how many migrations happen and whether they happened.
The migration step is a one time activity. If the client has once successfully received the information, it does not try to call the migration endpoint again.
There is a capability that indicates whether the migration endpoint should be called at all. Capabilities are not user specific, so this is the general switch for all users.
The client sends a JSON document of the following format to a specific migration endpoint /migration/spaces:
{
"version": "3.0.0",
"remotefolders": [
"/",
"/Documents",
"/eos/a/Alice",
"/Shares/"
]
}
Response: 200
{
"folders": {
"/": [
{
"space_id": "PERSONAL SPACE ID OF gonzalhu",
"path": "/"
}
],
"/Documents": [
{
"space_id": "PERSONAL SPACE ID OF gonzalhu",
"path": "/Documents"
}
],
"/eos/a/Alice": [
{
"space_id": "PERSONAL SPACE ID OF Alice",
"path": "/"
}
],
"/Share": [
{
"space_id": "VIRTUAL SHARE SPACE ID of Alice",
"path": "/"
}
]
}
}
In case the client should not yet be migrated, the server responds with 204 (No Content). In that case, the client continues to use the existing configuration.
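Putting the summarized flow together, a client-side sketch might look like the following. This is a hypothetical Python illustration: post_json stands in for the client's HTTP layer, and the endpoint, payload shape and status codes follow the summary above.

```python
def migrate(remote_folders, post_json, client_version="3.0.0"):
    """POST the configured remote folders and return the new mapping.

    Returns None when the server answers 204 (account not yet enabled
    for migration), otherwise a dict of legacy path -> (space_id, path).
    """
    payload = {"version": client_version, "remotefolders": remote_folders}
    status, body = post_json("/migration/spaces", payload)
    if status == 204:
        return None  # not yet enabled for this account: keep old config
    if status != 200:
        raise RuntimeError("migration request failed with status %d" % status)
    # Map each configured remote folder to its (space_id, path) target.
    return {
        legacy: (entry["space_id"], entry["path"])
        for legacy, entries in body["folders"].items()
        for entry in entries
        if "space_id" in entry
    }
```

The client would then compare the returned (space_id, path) pairs against the me/drives/ listing to decide which already synced folders it can re-use.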
Note: The local client paths are useless for the migration routine on the server because local paths are completely under the control of the different clients. If one user has two desktop clients, for example, the local paths can be different on each. The migration needs to work for both, however.
I'd suggest not migrating the Shares at all, to reduce complexity in the client implementations.
If a legacy user syncs the entire cloud with only one sync connection, the flow would look like:
{
"version": "3.0.0",
"remotefolders": [
"/",
]
}
Response: 200
{
"folders": {
"/": [
{
"space_id": "PERSONAL SPACE ID OF gonzalhu",
"path": "/"
}
]
}
}
and the sync client would only migrate the Personal space. In the first sync run, the /Shares directory would be removed on the client side.
The user would be forced to re-sync the shares using the Add-Shares Wizard and place the shares and spaces as desired.
@labkode How would that work with the CERN projects that you have in the legacy system?
@dragotin that is what Hannah proposed and I think it can work. The only part that is missing is that we won't enable this migration for all users at the same time, for obvious reasons, so we need to keep control over when the sync client triggers the migration. For that I proposed to have a different status code:
204: account not enabled to be migrated, nothing to do
Agreed, that is what @TheOneRing suggested, I just wanted to summarize the facts again so that we're all on the same page. I added your 204 suggestion to my summary above, thanks.
@dragotin any update on the implementation?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.
@fmoc @TheOneRing can you give an update?
Can be tested with the 3.0-pre-release builds. A branded build was sent to @labkode
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.
This issue describes a possible API to be implemented in the server that the sync client can use to translate existing sync folder pairs to spaces endpoints.
Follow-up of #3528
How to trigger space migration?
The sync client will read the usual capabilities endpoint:
If the migration > space_migration > enabled capability equals true, then the following logic is performed.
Configuration
Let's take the following Mac OS Desktop Sync client configuration as an example:
The sync client needs to extract the relevant information in a parseable common format; I use JSON as it is widespread. I suggest the sync clients send ALL the information available but redact or omit secrets. Sending all the configuration information is a safeguard against the case where we miss some field and then need another version of the sync client to handle it (and another round of desktop sync client updates).
Payload Request (PAYREQ)
This is an example of the desktop client whose config has not been migrated, i.e. all the sync folder pairs are in the old format.
Payload Response (PAYRES)
API
204: account not enabled to be migrated, nothing to do
200: configuration for account already migrated, nothing to do
201: client applies configuration to migrate to spaces
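A client could dispatch on these codes like this. This is a hypothetical Python helper; only the three status codes and their meanings come from the list above.

```python
def handle_status(status: int) -> str:
    """Translate the space-migration status code into a client action."""
    actions = {
        204: "skip",   # account not enabled to be migrated, nothing to do
        200: "skip",   # configuration for account already migrated, nothing to do
        201: "apply",  # client applies configuration to migrate to spaces
    }
    if status not in actions:
        raise ValueError("unexpected space-migration status: %d" % status)
    return actions[status]
```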
Request
curl -X POST remote.server/space-migration --data-binary @/tmp/request-payload.json
The workflow will be as follows for old and new clients, so that the migration does not break existing clients.
Old clients that do not know about new capability
New clients that know how to handle the new capability
FAQ
The logs from a user account gonzalhu will look like this:
Why send the account name as a query parameter?
Why send the verify parameter?
The verify is sent as a way to perform a double-commit on the sync client and to differentiate from the 200 response without the verify. It also helps the operator understand what is going on. For example, if the verify is not seen, that means the sync client crashed or quit and the migration couldn't be completed in a safe way. The last thing we want is to leave a sync client broken and have to perform manual investigations on the user's computer to fix it.
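The double-commit described above boils down to a three-step sequence whose ordering is the whole point: if the confirmation never arrives, the operator knows the client died between applying and confirming. A hypothetical sketch, with fetch / apply_locally / confirm standing in for the real client operations:

```python
def migrate_with_verify(fetch, apply_locally, confirm):
    """Run the migration as a double-commit.

    fetch         -> obtain the migrated configuration from the server
    apply_locally -> rewrite the local sync folder pairs
    confirm       -> send the verify request; its absence in the server
                     logs signals a crash mid-migration to the operator
    """
    config = fetch()
    apply_locally(config)
    confirm()
    return True
```

Only after the local state has been rewritten does the client confirm, so a 200-without-verify in the logs always means an incomplete migration.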