Make explicit that FileSystemHandle entries map to paths

a-sully commented 2 years ago

Be more explicit in describing how FileSystemHandles map to a path on disk, and the implications of this choice. Consider the following cases:

Clearing site data

// Create file at /dir/file.txt.
const dir_entry = root.getDirectoryHandle('dir', { create: true });
const file_entry = dir_entry.getFileHandle('file.txt', { create: true });

// ======== Clear site data, clearing all data in the OPFS. ========

// file_entry still points to /dir/file.txt. The promise rejects with NotFoundError.
await dir_entry.getFile();

// Re-create the directory in the fresh OPFS.
const dir_entry = root.getDirectoryHandle('dir', { create: true });

// file_entry may be able to write /dir/file.txt, despite the file not being explicitly created
// Whether this works may depend on the user agent's implementation?
await file_entry.createWritable();

Moving a file handle (see https://github.com/whatwg/fs/pull/10)

// Create file at /old/file.txt.
const dir_entry = root.getDirectoryHandle('old', { create: true });
const file_entry = dir_entry.getFileHandle('file.txt', { create: true });

// The file previously pointed to file at /new/file.txt now resides at /old/file.txt.
await dir_entry.move('new');

// file_entry still points to /old/file.txt. The promise rejects with NotFoundError.
await dir_entry.getFile();

// Restore the directory name. file_entry once again points to a valid path.
await dir_entry.move('old');

// file_entry still points to /old/file.txt. This call should succeed.
await file_entry.getFile();

Removing a file handle (see https://github.com/whatwg/fs/pull/9)

// Create file at /old/file.txt.
const dir_entry = root.getDirectoryHandle('old', { create: true });
const file_entry = dir_entry.getFileHandle('file.txt', { create: true });

// Remove the file.
await file_entry.remove();

// We can still write to a removed file.
await file_entry.createWritable();

Preview | Diff

annevk commented 2 years ago

So we already state that entries have a name, but it seems that isn't the full location. Can we give entries an additional field, called "path" or "location" that is a string or byte sequence or implementation-defined type representing a location on disk? And then we build the other things on top of that. I typically find that making the conceptual model explicit in terms of data structures makes it easy to build processing models on top.

a-sully commented 2 years ago

We've had a number of requests to expose the full path on the non-standardized portion of the API (https://github.com/WICG/file-system-access/issues/178, for example), but for privacy reasons I don't see us ever exposing a path for handles outside of the OPFS.

That being said, I don't think a full location is something we'd want or need to expose within the OPFS, either. A relative path is already available by calling resolve() with the handle and navigator.storage.getDirectory() so I'm not sure if I see the value of providing this on the object itself?

There's also the question of how this path would interact with Buckets. For example, multiple handles could have the same path but still be distinct files because they're in different buckets (unless the path includes the bucket somehow, which is a can of worms I assume we don't want to open). The advantage of using resolve() is that you have to explicitly pass in what the handle is relative to.

On that note, I'm open to renaming resolve() to something like getRelativePath() (see https://github.com/whatwg/fs/issues/4)

a-sully commented 2 years ago

https://github.com/whatwg/fs/issues/14 is somewhat related

annevk commented 2 years ago

I'm saying that we add it as a (private) member to the "entry" concept. Not that we'd expose it through a public API.

a-sully commented 2 years ago

Ah I see. I did not realize we could spec a private member.

There's also the question of how this path would interact with Buckets ... unless the path includes the bucket somehow, which is a can of worms I assume we don't want to open

It sounds like this member SHOULD include the bucket information. Assuming every new bucket has a unique ID (which seems like something we could spec?), this would handle the "clear site data" case above (or future "clear bucket" mechanism), since it would guarantee that handles become invalidated (well, at least always return NotFoundError) alongside their respective buckets (since the path/location will never be re-created).

Should we prefer "location" to "path", since this generalizes better to things like in-memory file systems and makes it clear that this is not a path relative to a bucket, but a location identifier which should include bucket information?

Overall this seems reasonable to me as long as we're okay with the implications for move() and remove() mentioned above. @mkruisselbrink any other thoughts here?

whatwg / fs