WICG / file-system-access

Expose the file system on the user’s device, so Web apps can interoperate with the user’s native applications.
https://wicg.github.io/file-system-access/

Add cloud file handling proposal #411

Closed alex292 closed 1 year ago

alex292 commented 1 year ago

This proposal adds a new getCloudHandles() method to FileSystemHandle, which allows retrieving cloud handles for a file or directory. A cloud handle consists of a vendor identifier (e.g. "drive.google.com") and a file identifier. With these, the web app can talk to the cloud storage provider directly through its APIs to retrieve or modify the file. This lets web apps figure out whether a file is already backed by cloud storage, and it makes transfer across machines easier, as only the identifier needs to be transferred instead of the entire file contents.
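To make the description above concrete, here is a minimal sketch of how a web app might consume the proposed method. It is an illustration only: getCloudHandles() is the method proposed in this thread, not a shipped API, the { providerName, id } shape is borrowed from the snippet quoted later in this discussion, and describeCloudBacking plus the two mock handles are hypothetical names.

```javascript
// Sketch only: getCloudHandles() is proposed in this thread and may not exist.
// The { providerName, id } shape is an assumption taken from a later snippet.
async function describeCloudBacking(handle) {
  // Feature-detect, because the method may be missing on a given handle.
  if (typeof handle.getCloudHandles !== 'function') {
    return { name: handle.name, cloudBacked: false, references: [] };
  }
  const cloudHandles = await handle.getCloudHandles();
  // Keep only plain, serializable fields so the result can be sent to a
  // server in place of the file contents.
  const references = cloudHandles.map(({ providerName, id }) => ({ providerName, id }));
  return { name: handle.name, cloudBacked: references.length > 0, references };
}

// Hypothetical stand-ins for real FileSystemFileHandle objects.
const localOnlyHandle = { name: 'notes.txt' };
const syncedHandle = {
  name: 'report.docx',
  getCloudHandles: async () => [{ providerName: 'drive.google.com', id: 'abc123' }],
};

describeCloudBacking(syncedHandle).then((info) => console.log(JSON.stringify(info)));
```

The feature check matters because a handle that is not cloud-backed, or a browser without the proposal, would otherwise throw on the call.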

josephrocca commented 1 year ago

Prior discussions that are related:

jaime-rivas commented 1 year ago

Hi @alex292,

I'm wondering if you'd be open to extending the proposal to include the non-goals as goals, specifically providing a standardized way for CSPs’ web APIs to interact with remote files.

There is another issue opened here, #358, and one in Chromium, where we briefly talk about those goals. It seems like a great opportunity to finally integrate cloud storage into browsers, giving web apps standardized APIs to interact with remote files and removing the dependency on the underlying operating system. Thanks in advance!

alex292 commented 1 year ago

That is currently not part of our considerations as that would not solve any of the listed use cases, so the answer is sadly "no". I would also argue that unifying the CSP APIs should not be done via a web API in a browser. The same way you would need to include some form of SDK from each specific CSP vendor to interact with their APIs, there might be one SDK that solves this use case by combining these into a common interface. The browser would not be involved in any of this and therefore it should also not be a web API IMHO.
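The "one SDK that combines these into a common interface" idea could look roughly like the sketch below. Everything in it is hypothetical: the CloudStorageRegistry class, the adapter shape, the provider names, and the in-memory adapters that stand in for real vendor SDK calls.

```javascript
// Sketch of a wrapper that hides several vendor SDKs behind one interface.
// All names here are made up for illustration; no real vendor SDK is called.
class CloudStorageRegistry {
  constructor() {
    this.adapters = new Map();
  }

  // An adapter is any object with an async read(fileId) method.
  register(providerName, adapter) {
    this.adapters.set(providerName, adapter);
  }

  async read(providerName, fileId) {
    const adapter = this.adapters.get(providerName);
    if (!adapter) {
      throw new Error(`No adapter registered for ${providerName}`);
    }
    return adapter.read(fileId);
  }
}

// In-memory adapter standing in for a vendor-specific SDK.
function makeInMemoryAdapter(files) {
  return {
    read: async (fileId) => {
      if (!Object.prototype.hasOwnProperty.call(files, fileId)) {
        throw new Error(`Unknown file ${fileId}`);
      }
      return files[fileId];
    },
  };
}

const registry = new CloudStorageRegistry();
registry.register('drive.example', makeInMemoryAdapter({ 'doc-1': 'hello from drive' }));
registry.register('box.example', makeInMemoryAdapter({ 'doc-9': 'hello from box' }));

registry.read('drive.example', 'doc-1').then((text) => console.log(text));
```

The browser plays no role in this design, which is the point being argued: the unification lives in a library the app ships.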

josephrocca commented 1 year ago

I guess the ideal user experience here might be that the OS/user-agent stores details of the user's cloud storage accounts, and then simply presents those folders as options in the showOpenFilePicker dialogue.

That way it's completely up to the user and the user agent which filesystem is being interacted with (regardless of whether it's remote or local, or remote-but-locally-cached) - and the developer doesn't need to care about it at all. They just treat their file/directory handles like they normally would.

There are likely a bunch of use cases that this approach would preclude, but the many-service-SDK wrapper might be able to cover those? At least for HTTPS-based services.
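A minimal sketch of what "the developer doesn't need to care" could mean in code: the function below uses only the existing getFile() method of a file handle. The two handles are hypothetical mocks, one standing in for a local file and one for a file a user agent might serve from a cloud account.

```javascript
// Sketch: code written against the standard handle interface only.
// getFile() is the existing FileSystemFileHandle method; the handles below
// are hypothetical mocks, not real picker results.
async function readTextFromHandle(fileHandle) {
  const file = await fileHandle.getFile(); // a File (Blob subclass) in browsers
  return file.text();
}

const localMock = {
  kind: 'file',
  name: 'local.txt',
  getFile: async () => new Blob(['stored on disk']),
};

const remoteMock = {
  kind: 'file',
  name: 'remote.txt',
  // A user agent could fetch these bytes from a cloud account instead;
  // the calling code would look exactly the same.
  getFile: async () => new Blob(['stored in the cloud']),
};

Promise.all([localMock, remoteMock].map(readTextFromHandle))
  .then((texts) => console.log(texts));
```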

jimmywarting commented 1 year ago

Hmm, honestly this is not what i had in mind when i created #358. i have multiple cloud providers, like google drive and dropbox, but i don't have any cloud provider syncing the data with any of my devices.

i only keep them in the cloud ☁️ (Only exception is for my android phone that uploads my photos to google photo to have backup)

So when i asked for #358, then this:

const [fileHandle] = await window.showOpenFilePicker(pickerOpts);
const cloudIdentifiers = await fileHandle.getCloudIdentifiers();

...was not what i had in mind.

I kind of just wished that there were some magical way a website that you have visited before could just register itself as a cloud provider (Maybe with a manifest.json + service worker?)

And the next time you called showOpenFilePicker() then you would have the option to choose to pick files directly from google drive.

my idea was for the developer not to be able to know whether it's a local file handle, a remote-but-locally-cached one, or a fully-remote file handle

the idea was to never have to install any desktop/phone application and just straight up being able to talk to services that you are logged in to in your browser.

the idea was to make a killer replacement for https://www.filestack.com (previously called filepicker.io) and how photopea did it. but in a more native browser solution that talked just simply over http calls

alex292 commented 1 year ago

This proposal was not meant to solve https://github.com/WICG/file-system-access/issues/358, I have actually never seen that issue before.

With this new web API, we are trying to solve the use cases discussed in the "use cases" section. I have been working with asully@chromium.org on this so far.

jimmywarting commented 1 year ago

I'm still confused as to why we even need this getCloudHandles() method... I don't think i would ever use it... I rather wished i got a normal looking FileSystemHandle and that i had no idea if i were writing things directly to the cloud, as if i had been granted access to ftp://usr:psw@drive.google.com/pic/me.png, or just to a local file on the disk.

My intention with #358 was that you should never have to do any code specific things more dedicated towards google drive, dropbox, or OneDrive.

The whole point of #358 was to avoid having to learn how to use a Google Drive specific API and doing things like

  if (cloudIdentifier.providerName === 'drive.google.com') {
    // retrieve/modify the file from Google Drive API using cloudIdentifier.id
  }

then you must register a developer account at Google, get some kind of API key / token or something like that, learn how google's api / SDK works, ask the user to grant a google app permission to read/write the file, ask that the user is logged in to xyz so it could use the client's credentials, etc. if you wish to do anything remotely useful with this file handle.

The other point of #358 was that you don't have to create a virtual network drive on your own computer and sync each and everything. it was so that you can still pick remote files from google drive, dropbox etc without having to install any desktop application

And the likelihood that you get a cloud identifier that your specific web app could understand and talk directly to is low. there exist hundreds of different cloud providers, so every site would have to learn how to use each and every possible cloud api, whereas #358 was trying to normalize the way of talking directly to a cloud provider without having to deal with lots of different code paths.

Now if the use case is to be able to easily share something that's already uploaded to a cloud provider, wouldn't it then be easier to just use the Web Share API or something like that instead? allow the Web Share API to accept a FileSystemHandle? maybe it would then share a link rather than uploading the file.


learning how to use git, ftp, sftp, WebDav, or smb is way easier, and that knowledge applies to multiple servers. it is way easier than having to dig deep into cloud specific docs/apis such as how the OneDrive or Dropbox API works.

it's google drive, dropbox and others who should have to learn how to hook in their own cloud system into the OS (a.k.a file system access) as a network drive, it should not be the other way around.

and as such i think i'm -1 on this whole getCloudHandles() idea

i think remote files should work almost like how some site can register itself as a payment provider using the web payment api

alex292 commented 1 year ago

Our primary motivation for this new web API is the remote file handling scenario. I.e. device A runs a web app that opens a file from cloud storage (primarily GoogleDrive / OneDrive) and then passes that file handle to a remote server, where the file is fetched directly from GoogleDrive/OneDrive without having to transfer the file between these two endpoints. To do so, we need to share some identifier for the file across devices, which would be the identifier we get from this new web API. Using the Web Share API or simply transferring files to the remote machine over a websocket already works today, but requires a file transfer. Imagine the file/directory you want to share being 5 GB large: you likely don't want to upload that from your device, but instead quickly pull it to the remote machine from Google/Microsoft servers directly.
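The trade-off in this scenario can be sketched as a small decision function. It is a sketch under assumptions: planTransfer and the plan object are hypothetical, and the { providerName, id } reference shape is borrowed from the snippet quoted earlier in the thread rather than from a specification.

```javascript
// Sketch: decide what the device has to send to the remote machine.
// The reference shape and the plan object are assumptions for illustration.
function planTransfer(fileSizeBytes, cloudReferences) {
  if (cloudReferences.length === 0) {
    // No known cloud backing: the bytes must be uploaded from the device.
    return { mode: 'upload', bytesFromDevice: fileSizeBytes };
  }
  // Cloud-backed: send only the identifier; the server pulls the file itself.
  const payload = JSON.stringify({ reference: cloudReferences[0] });
  const payloadBytes = new TextEncoder().encode(payload).length;
  return { mode: 'reference', bytesFromDevice: payloadBytes, payload };
}

const fiveGB = 5 * 1024 ** 3;
const plan = planTransfer(fiveGB, [{ providerName: 'drive.google.com', id: 'abc123' }]);
console.log(plan.mode, plan.bytesFromDevice, 'bytes sent instead of', fiveGB);
```

The remote machine still needs its own credentials for the provider's API; the sketch only covers what leaves the user's device.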

We have willingness from the Google side to implement this and we have willingness from partners to use this API. Additionally, this API also solves the de-duplication for online document editors (issue for Google/Microsoft) and allows for easier integrations as file attachments in web mail clients.
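The de-duplication use case could be sketched as follows, assuming an editor keys its open documents by cloud reference. The OpenDocuments class and the { providerName, id } shape are hypothetical and only illustrate the idea.

```javascript
// Sketch: keep one open document per cloud file, so opening the same file
// through two different local paths does not create two editor sessions.
// Class name and reference shape are assumptions for illustration.
class OpenDocuments {
  constructor() {
    this.byKey = new Map();
  }

  // JSON-encode the pair so provider names or ids containing separators
  // cannot collide.
  static keyFor({ providerName, id }) {
    return JSON.stringify([providerName, id]);
  }

  // Returns the existing document for this reference, or creates one.
  open(reference, createDocument) {
    const key = OpenDocuments.keyFor(reference);
    if (!this.byKey.has(key)) {
      this.byKey.set(key, createDocument());
    }
    return this.byKey.get(key);
  }
}

const docs = new OpenDocuments();
const first = docs.open({ providerName: 'drive.google.com', id: 'abc123' }, () => ({ title: 'Report' }));
const second = docs.open({ providerName: 'drive.google.com', id: 'abc123' }, () => ({ title: 'Report (copy)' }));
console.log(first === second);
```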

I see your request for a unified way to access cloud storage providers and think it is very valid, but also orthogonal to this PR.

jimmywarting commented 1 year ago

Imagine the file/directory you want to share being 5GB large, you likely don't want to upload that from your device, but instead quickly pull it to the remote machine from Google/Microsoft servers directly.

I see your point that taking something from google photos and uploading it directly to e.g. facebook is a valid case, in that you would not have to download it from google and then re-upload it to facebook.

But honestly i think that is something that could maybe technically be solved with the Web Share API, where sharing a remote FileSystemHandle could just turn into a unique download link or something.

A whole other solution could be to create something serializable like FileSystemHandle.getUniqueId() (whatwg/fs#46) so it can then be shared with facebook, which could reconstruct a FileSystemHandle in their backend (server), and then fb could pull the data directly from google drive by calling something like

it would be like handing dropbox an ftp link saying "here is a unique FileSystemHandle identifier that happens to look something like ftp://usr:uniqtoken@drive.google.com/my-summercamp-2022. it's unique for just this path/location and it only contains read permission. go and call FileSystemFileHandle.from('ftp://usr:uniqtoken@drive.google.com/my-summercamp-2022') to reconstruct a file handle and pull it directly from google instead."
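As a rough sketch of what a receiving server could do with such a link, the standard URL class can split it into its parts. parseHandleLink is a hypothetical helper, and FileSystemFileHandle.from() in the comment above is the commenter's idea rather than an existing API; nothing here reconstructs a real handle.

```javascript
// Sketch: split an ftp-style identifier into its parts with the standard
// URL class. parseHandleLink is a made-up helper for illustration.
function parseHandleLink(link) {
  const url = new URL(link);
  return {
    protocol: url.protocol.replace(/:$/, ''), // "ftp:" -> "ftp"
    host: url.hostname,
    user: decodeURIComponent(url.username),
    token: decodeURIComponent(url.password),
    path: decodeURIComponent(url.pathname),
  };
}

const parts = parseHandleLink('ftp://usr:uniqtoken@drive.google.com/my-summercamp-2022');
console.log(parts);
```

Note that a link of this kind carries its token in the URL itself, so anyone who sees the link gets the read permission it encodes.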

the point i was trying to make with #358 was that you don't always have the luxury of calling showOpenFilePicker() because the files are not located on your drive. so with #358 you would also be able to select something from a google drive picker UI, and then fb could easily reconstruct this handle if it knew a unique file system id (aka ftp, smb protocol link).

cameyo commented 1 year ago

This would be great to integrate into our product. In our virtual app delivery platform, we support filetype association. So for example when the user double-clicks a .psd file, we currently read it locally through the JS handle, send it over to a cloud-based execution host and open it there. Then, we sync all changes back and forth to the user's device.

With this functionality, we'd gain a lot in speed and reduce network usage, as the cloud execution host would be able to open the file directly from the Drive API (initial speed / network saving), and the same saving would then apply to change syncing. This gain would be particularly interesting for files larger than 1 MB, which is often the case.

As a good "side effect", this will also resolve sharing and exclusivity issues for us -- i.e. what happens today when the user works within a cloud session on a local file, and then that file is moved, deleted or changed locally.

We already integrate with the Drive API, so having to go through JS file handles is a bit awkward currently. Definitely a big +1 !