whatwg / fs

File System Standard
https://fs.spec.whatwg.org/

Support reading file metadata #12

Open a-sully opened 2 years ago

a-sully commented 2 years ago

Migrated from https://github.com/WICG/file-system-access/issues/101 (see https://github.com/whatwg/fs/issues/2)

We should at least support the last-modified time and size, though we could consider other information such as the creation time.

For a file, it's currently not possible to get the size without reading in the file into memory via getFile(). For a directory, it's currently not possible to get the number of files in the directory without iterating through it (requested in https://github.com/WICG/file-system-access/issues/215).
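For illustration, the two workarounds described above might look roughly like this. This is only a sketch: the helper names `fileSizeViaGetFile` and `countEntries` are made up for this example, and the only existing API calls used are `getFile()` on a file handle and `values()` on a directory handle.

```javascript
// Hypothetical helpers; only getFile() and values() are existing handle methods.

// Size of a file: today this goes through getFile(), which returns a File object.
async function fileSizeViaGetFile(fileHandle) {
  const file = await fileHandle.getFile();
  return file.size;
}

// Number of entries in a directory: today this means visiting every entry.
async function countEntries(dirHandle) {
  let count = 0;
  for await (const _entry of dirHandle.values()) {
    count++;
  }
  return count;
}
```

In a browser, the handles would come from something like `navigator.storage.getDirectory()`; a dedicated metadata API would make both round trips unnecessary.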

jesup commented 2 years ago

You can't get the size without reading it in? What about getSize()? What's the use case for these with regard to OPFS? (I can see them perhaps being more relevant for the File System API, but I'm not so sure they're needed for OPFS.) That said, these obviously could be supported if needed, but each added API/feature adds complexity.

a-sully commented 2 years ago

You're correct that getSize() now solves the file size issue, although AccessHandles are currently only available in workers. Creating an AccessHandle also has the side effect of acquiring an exclusive lock, which may not be possible if the site is writing using a FileSystemWritableFileStream (which has atomicity guarantees that AccessHandles don't, for example).
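A minimal sketch of the getSize() route and its trade-off, assuming it runs in a dedicated worker. The function name `sizeViaAccessHandle` is hypothetical; `createSyncAccessHandle()`, `getSize()` and `close()` are the existing methods. The lock taken by `createSyncAccessHandle()` is why this can fail while another writer holds the file.

```javascript
// Hypothetical helper, intended for a dedicated worker.
async function sizeViaAccessHandle(fileHandle) {
  // Takes an exclusive lock on the file; expected to reject if the file
  // is already locked by another access handle or writable stream.
  const accessHandle = await fileHandle.createSyncAccessHandle();
  try {
    // await tolerates both a synchronous and a promise-returning getSize().
    return await accessHandle.getSize();
  } finally {
    // Always release the lock, even if getSize() throws.
    accessHandle.close();
  }
}
```

Closing the handle in `finally` keeps the lock window as short as possible, but the call still cannot succeed while the file is locked elsewhere.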

One specific use case for last-modified time (mentioned in https://github.com/WICG/file-system-access/issues/101) is watching file system changes. A specific API for this was requested here: https://github.com/WICG/file-system-access/issues/72 (with a lot of upvotes) and sketched out in this doc, though this also seems more relevant outside of OPFS, and it's unclear whether that new API would apply to files within the OPFS. If that API does not apply to the OPFS (assuming it's developed at all), then last-modified time is the only way to watch file system changes. This seems like reasonable metadata to provide?

In general, I agree that these metadata seem less useful in the OPFS than outside of it, but to me it seems there's enough reason to want these in the OPFS to support them eventually.

jesup commented 2 years ago

Within OPFS, the only way a file can change is if another handle (AccessHandle, SyncAccessHandle, or WritableFileStream) or directory operation (the proposed move()) were to modify it, either from the same worker, from another worker or the main thread in the same tab, or from another tab on the same origin. We can ignore external modifications to an OPFS file, I believe. I can see some use for a last-modified date (to compare against the last-modified date of a server's copy for caching, for example). This could be written by the app into a separate file, or handled by comparing hashes, of course. Using last-modified to watch for changes implies polling, and not-low-overhead polling, but maybe that's important for non-OPFS uses.

So I think the local cache issue seems a valid point in favor of lastModifiedAt() (or whatever). Are there good arguments for anything else? If we cared, we could have notifications on change for OPFS since we control the sources of such changes, but I don't think there is any significant use case for this complexity.
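To make the caching use case concrete, here is a rough sketch of comparing a local file's last-modified time against a server's `Last-Modified` header value. The name `isLocalCopyStale` and its parameters are hypothetical, and it goes through `getFile()` because no direct last-modified accessor exists on the handle today.

```javascript
// Hypothetical helper: is the local copy older than the server's copy?
// serverLastModified is an HTTP date string, e.g. from a Last-Modified header.
async function isLocalCopyStale(fileHandle, serverLastModified) {
  const serverTime = Date.parse(serverLastModified);
  // If the header is missing or unparsable, assume the copy needs refreshing.
  if (Number.isNaN(serverTime)) return true;
  const file = await fileHandle.getFile();
  // File.lastModified is milliseconds since the Unix epoch.
  return file.lastModified < serverTime;
}
```

The header value would typically come from something like `response.headers.get('Last-Modified')` on a HEAD request.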

jimmywarting commented 2 years ago

I found a roundabout way of getting the last-modified date out of directories, which just isn't possible with today's newest whatwg/fs API.

I have just learned that navigator.storage.getDirectory() is the same as calling webkitRequestFileSystem(TEMPORARY, ...args). Both point to the same bucket.

The good old entry API in Blink has a getMetadata function.

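// Resolve the root DirectoryEntry of the legacy (Chromium-only) temporary
// file system, or undefined if the API is missing or the request fails.
const webkitRoot = await new Promise(rs => {
  try {
    // First argument 0 is the TEMPORARY storage type; second is the requested size.
    webkitRequestFileSystem(0, 0, x => rs(x.root), () => rs())
  } catch (err) { rs() }
})

// Resolve with the Metadata (modificationTime, size) of a directory.
function getMetadata(absolutePath) {
  return new Promise((rs, rj) => {
    webkitRoot.getDirectory(absolutePath, {}, h => h.getMetadata(rs, rj), rj)
  })
}

// Resolve with the Metadata (modificationTime, size) of a file.
function getFileMetadata(absolutePath) {
  return new Promise((rs, rj) => {
    webkitRoot.getFile(absolutePath, {}, h => h.getMetadata(rs, rj), rj)
  })
}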

For obvious reasons, you will not be able to get metadata out of user-supplied directories/files from the picker.

(It will only work with navigator.storage.getDirectory() and in Chromium browsers.)

jimmywarting commented 2 years ago

For a file, it's currently not possible to get the size without reading in the file into memory via getFile().

I'm currently building a file explorer, and I'd like to use metadata instead of getFile() to avoid creating blobs that get added to the browser's internal blob storage. I'm also interested in each file's MIME type, so that would be a useful addition as well.

jesup commented 2 years ago

With SyncAccessHandle, you can call getSize() without creating a blob or reading the file into memory. If we added a lastModified(), that would cover some of what you want. As for MimeType: while we could add the concept of a MimeType to OPFS files (though this might require some type of DB; Chrome and Firefox are already using one, not sure about Apple), underlying filesystems don't necessarily have the concept as part of their metadata. I would lean against such a proposal.

jimmywarting commented 2 years ago

Browsers seem to have something like this built in: https://mimesniff.spec.whatwg.org ? I wish there were some way to get access to mimesniff through an API... (a bit off topic, though)

annevk commented 2 years ago

Let's keep this issue scoped to metadata that is typically exposed across various file systems. (It seems reasonable to file an issue to consider exposing sniffing, but that's a fair bit of work and given that we don't necessarily want to extend it beyond what we need for the web it's unclear if that's a good pattern here.)

rektide commented 1 year ago

Would extended attributes also be something we could consider potentially in scope for this issue? Or should that be a separate issue for discussion?

jimmywarting commented 1 year ago

I kind of wish I had a metadata method right now...

A method that returns file size, file MIME type, and last-modified date for files, and last-modified date for folders.
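As a sketch of the result shape being asked for, something like the following can be emulated with existing APIs, at the cost of calling getFile() for files. The name `readMetadata` and the returned object shape are purely hypothetical and not part of any proposal; `kind` and `getFile()` are existing handle members, and directories currently offer nothing to fill in.

```javascript
// Hypothetical helper: emulates the requested metadata shape with existing APIs.
async function readMetadata(handle) {
  if (handle.kind === 'file') {
    // Still goes through getFile(), which a real metadata method would avoid.
    const file = await handle.getFile();
    return {
      kind: 'file',
      size: file.size,
      type: file.type,
      lastModified: file.lastModified,
    };
  }
  // Directories expose no metadata today, so lastModified is left as null.
  return { kind: 'directory', lastModified: null };
}
```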