jvilk / BrowserFS

BrowserFS is an in-browser filesystem that emulates the Node.js filesystem API and supports storing and retrieving files from various backends.

Hybrid File System, persisting only changed files #102

Closed DerKoun closed 9 years ago

DerKoun commented 9 years ago

There is an issue at em-dosbox about persisting save games without persisting the whole games: https://github.com/dreamlayers/em-dosbox/issues/4 The goal should be to persist changed and new files (and, ideally, deleted ones). Other files should be read from the primary, non-persistent FS. I quickly hacked together one based on the localStorage and in-memory FSs: https://github.com/dreamlayers/em-dosbox/issues/4#issuecomment-71932465 However, someone who actually knows TypeScript would have to complete it (if the basic idea is even correct). I hope such a hybrid FS is possible, as it would greatly improve Emscripten ports like em-dosbox by allowing users to save.

jvilk commented 9 years ago

@DerKoun: Yes, a hybrid FS is possible, but you need to figure out how you want it to function. I would recommend not using localStorage, as that gives you 5-10MB of storage for an entire domain (e.g. for all of archive.org).

The best option for most Emscripten use cases is to use a large asynchronous storage device to persist data (e.g. we have backends for Dropbox, IndexedDB, and HTML5FileSystem that are excellent candidates).

Here's how I imagine it would best function:

1) Construct the file system with two arguments: a constructed asynchronous file system backend, and a directory within that backend to use for storage (e.g. `/dosbox/windows3_1`).
2) Copy all of the files already in that directory into an in-memory file system. These files are from the previous session.
3) Proxy all of the file system operations to the in-memory file system and the asynchronous file system. Since the in-memory FS is synchronous, you should use its return values.
   * Note: You need to be careful about ordering these. Because asynchronous operations complete asynchronously, Emscripten can open a file and write to it before the asynchronous backend has finished opening the file. Have a work queue, and have each asynchronous callback check the queue for further items (see the sketch below).
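In sketch form, step 3 looks something like this (all interface names here are hypothetical stand-ins, not BrowserFS's actual API):

```typescript
// Hypothetical minimal interfaces, for illustration only.
interface SyncFileSystem {
  writeFileSync(path: string, data: Uint8Array): void;
}
interface AsyncFileSystem {
  writeFile(path: string, data: Uint8Array, cb: (err?: Error) => void): void;
}
interface AsyncOpQueue {
  enqueue(op: (done: (err?: Error) => void) => void): void;
}

class HybridFS {
  constructor(
    private memFS: SyncFileSystem,    // serves all synchronous calls
    private asyncFS: AsyncFileSystem, // persistent backend, mirrored lazily
    private queue: AsyncOpQueue       // FIFO of pending mirror operations
  ) {}

  writeFileSync(path: string, data: Uint8Array): void {
    // Answer the caller immediately from the in-memory store...
    this.memFS.writeFileSync(path, data);
    // ...and queue the same mutation for the async backend, so writes
    // reach it in order even if it is still busy with an earlier op.
    this.queue.enqueue(done => this.asyncFS.writeFile(path, data, done));
  }
}
```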

And that's it!

I'm completely overwhelmed with other work right now, but I would happily code review / shepherd your work into the BrowserFS repository. Just fork the repository, start working on it, and open a standing PR where we can discuss issues and you can ask questions.

There's a rich testing harness you can use to continually test your changes; all you need to do is write a simple factory class that gives the test harness one or more instantiated file systems to test, and you're a simple `grunt test` away from testing your file system.

What do you think?

DerKoun commented 9 years ago

I'll have to learn some TypeScript basics first and set up an environment for that. But I'd really like to look into hybrid or otherwise tweaked file systems. So, I'll get back to you once I'm set up.

jvilk commented 9 years ago

Sure, no problem. TypeScript compiles directly to JavaScript in a straightforward way, so you probably won't have much difficulty. You might have more difficulty figuring out how best to write the file system given the interfaces I've defined.

perimosocordiae commented 9 years ago

Seems like a pretty straightforward way to do this would be a "shadowed" FS backend:

Take any two existing FS backends, call them Base and Shadow. For every FS operation, query Shadow first. If the file exists there, complete the operation and be done. If not, perform the operation on Base. If the operation changes the file in any way, copy the file from Base to Shadow first.

You'll incur some overhead by checking Shadow first every time, but if that's an issue you can cache information about which files live in Shadow in-memory.
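A rough sketch of the idea (hypothetical `SimpleFS` interface, not BrowserFS's actual API):

```typescript
interface SimpleFS {
  existsSync(path: string): boolean;
  readFileSync(path: string): Uint8Array;
  writeFileSync(path: string, data: Uint8Array): void;
}

class ShadowedFS {
  constructor(private base: SimpleFS, private shadow: SimpleFS) {}

  readFileSync(path: string): Uint8Array {
    // Query Shadow first; fall back to Base.
    return this.shadow.existsSync(path)
      ? this.shadow.readFileSync(path)
      : this.base.readFileSync(path);
  }

  writeFileSync(path: string, data: Uint8Array): void {
    // Mutations land in Shadow only; Base is never touched.
    this.shadow.writeFileSync(path, data);
  }

  appendFileSync(path: string, data: Uint8Array): void {
    // An operation that depends on existing content: copy the file from
    // Base into Shadow first (copy-on-write), then mutate the copy.
    if (!this.shadow.existsSync(path) && this.base.existsSync(path)) {
      this.shadow.writeFileSync(path, this.base.readFileSync(path));
    }
    const old = this.shadow.existsSync(path)
      ? this.shadow.readFileSync(path)
      : new Uint8Array(0);
    const merged = new Uint8Array(old.length + data.length);
    merged.set(old);
    merged.set(data, old.length);
    this.shadow.writeFileSync(path, merged);
  }
}
```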

jvilk commented 9 years ago

@perimosocordiae You can't use that technique, unfortunately. Emscripten only uses synchronous file system operations. You cannot have any file operations depend on an asynchronous backend.

The only way to proceed is to load up everything the program will need from the asynchronous backend into the Shadow backend, and work from that entirely. All actions must be performed on both backends to keep them in sync, but the result can only be used from the Shadow backend because the Base backend is asynchronous.

We're working in a really weird space that ensures that you can't adopt many existing techniques.

perimosocordiae commented 9 years ago

Ah, that explains the async work queue idea. It won't actually be too bad, considering that only mutating operations will need to be duplicated.

timdream commented 9 years ago

Please see my WIP branch of Emscripten, also posted on that issue: https://github.com/timdream/emscripten/compare/kripken:incoming...timdream:save-to-idb?expand=1

My alternative approach is to utilize the IDBFS offered in Emscripten directly -- so we shouldn't need to patch anything here or in em-dosbox.

I don't know whether my patch will get accepted or not -- I plan to submit the pull request in the coming days, when I am available for this work again.

jvilk commented 9 years ago

@timdream: That's nice. However, the benefit of implementing a HybridFS in BrowserFS is that you are not limited to just IndexedDB storage, and can use any current or future BrowserFS backend. For example, we have a Dropbox backend that could be dropped into the hybrid file system to take advantage of cloud storage.

timdream commented 9 years ago

@jvilk I see. Thanks for the insight.

jvilk commented 9 years ago

I forgot, I documented my thoughts in #100. Mentioning here to link the two.

dreamlayers commented 9 years ago

This thread has convinced me that BrowserFS is the best place for this functionality. It should give the best flexibility and reusability. I have created this branch starting with code by @DerKoun. Right now it compiles and runs, but enableLs() causes only the local storage files to be visible. I'm in the process of learning more about BrowserFS so I can fix that.

jvilk commented 9 years ago

Let me know if you have any architectural questions about BrowserFS, or general TypeScript questions.

I would seriously consider not limiting the file system to localStorage and abstracting things away so that you can use the larger, asynchronous stores, since localStorage only offers <10MB of storage for an entire domain and is not available synchronously from within a WebWorker. But I understand that getting "something" to work quickly is better than nothing.

dreamlayers commented 9 years ago

Yes, I realize that method has limitations, but I wanted to start with something simple because I've never worked with BrowserFS or TypeScript.

This may not be possible to reasonably do on the SyncKeyValueStore level. Directories are a problem, and that level knows nothing about them. The use of random IDs means persistent directories couldn't refer to non-persistent files. In any case, it would be better to have separate directories in persistent and non-persistent file systems, which only list files in their file system.

Would it be best to do this on the file system level (i.e. individually handling functions like openFile())? Maybe separate code could be used to present an async file system as a sync file system?

jvilk commented 9 years ago

I would recommend doing it at the FileSystem level. Look at my WorkerFS for inspiration:

https://github.com/jvilk/BrowserFS/blob/master/src/backend/workerfs.ts#L332

For your use case, you should extend SynchronousFileSystem and only implement the *Sync variants of all of the functions. The SynchronousFileSystem will handle mapping the asynchronous API variants to the synchronous variants.

SynchronousFileSystem: https://github.com/jvilk/BrowserFS/blob/master/src/core/file_system.ts#L913
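The skeleton would look roughly like this (signatures approximated from memory; check `src/core/file_system.ts` for the authoritative interface):

```typescript
import { SynchronousFileSystem, FileSystem } from '../core/file_system';
import { default as Stats } from '../core/node_fs_stats';

// Rough skeleton only; the real method set is larger.
export default class UnionFS extends SynchronousFileSystem {
  constructor(private writable: FileSystem, private readable: FileSystem) {
    super();
  }

  public getName(): string { return 'UnionFS'; }
  public isReadOnly(): boolean { return false; }
  public supportsSynch(): boolean { return true; }

  // Implement only the *Sync variants. SynchronousFileSystem maps each
  // asynchronous API call (e.g. stat) onto its *Sync counterpart.
  public statSync(p: string, isLstat: boolean): Stats {
    try {
      // Prefer the writable layer; fall back to the read-only layer.
      return this.writable.statSync(p, isLstat);
    } catch (e) {
      return this.readable.statSync(p, isLstat);
    }
  }
}
```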

dreamlayers commented 9 years ago

Here's what I've done so far. It's at a very early stage, but this does actually work with Em-DOSBox. I could create and change files, with them persisting in local storage. They appear in the directory together with files from a zip file.

DerKoun commented 9 years ago

Just a quick update from me: Got a TypeScript environment set up, but got paused by the flu. Seems you guys got the synchronous hybrid file system done while I was ill. I guess you'll also be able to make the leap to mixed/asynchronous ones. Once I've recovered I'll start looking into a different issue I thought about: A ZipFS that downloads data lazily.

jvilk commented 9 years ago

If you're thinking of lazily loading chunks of a ZIP file... note that Zip files have their TOC at the end of the file at an unknown offset. You have to read through the entire file to get to it reliably, or start at the end and scan backwards for the magic number and pray that the magic number doesn't appear in the data.

Why at the end? Spanned archives on floppy disks. :smile:

jvilk commented 9 years ago

Also, btw, I recommend the atom-typescript plugin. It integrates with TypeScript's language service to give you intelligent autocomplete, and tells you when code fails to typecheck.

dreamlayers commented 9 years ago

I have added more features. Right now UnionFS should support: file overwriting, changing via copy on write, creation of new files and creation of directories. It will automatically create necessary directories in the read-write file system so newly created or copied files can be placed there. Paths in the read-write file system start with a prefix, which allows multiple web pages within the same domain to keep their files separate.

The File objects returned by openSync() are those from the two file systems being combined, so actions done to a file automatically go to the correct file system without any code handling that. I don't know if this will have to change.

There is no support for deletion yet. To make it fully work, there needs to be a way to make files present in the read-only file system appear deleted. I'm thinking an object with mappings from paths could work, e.g. `{ 'dir/file': 'D' }`. Such an object could also help support renaming without copying, by creating a zero-length file and setting a mapping like `{ 'dir/newname': 'Rdir/oldname', 'dir/oldname': 'D' }`. This object could be stored as a file in JSON format. Though, for the purposes of supporting DOS games, simply supporting deletion of files in the read-write file system is probably enough.
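Concretely, the marker table could look like this (just my proposal above in sketch form; nothing is implemented):

```typescript
// 'D' marks a read-only file as deleted; an 'R' prefix records where a
// renamed file's contents actually live.
type PathMarkers = { [path: string]: string };

const markers: PathMarkers = {
  'dir/oldname': 'D',             // hide the file in the read-only FS
  'dir/newname': 'Rdir/oldname',  // renamed without copying the data
};

function isDeleted(m: PathMarkers, p: string): boolean {
  return m[p] === 'D';
}

function resolveRename(m: PathMarkers, p: string): string {
  const v = m[p];
  return v && v.charAt(0) === 'R' ? v.slice(1) : p;
}

// Persisted as a JSON file in the read-write file system:
const serialized = JSON.stringify(markers);
```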

Code is at: https://github.com/dreamlayers/BrowserFS/blob/unionfs/src/backend/unionfs.ts I hope I'm on the right track with that. Please let me know if you think there are fundamental problems which would require a major redesign.

dreamlayers commented 9 years ago

It should be possible to get the ZIP TOC by first doing a HEAD request, getting the file size, and then downloading a chunk at the end of the file. Then you could download other chunks as needed to decompress individual files.
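In sketch form (assuming the server answers HEAD with Content-Length and honors Range requests; `fetch` is used for brevity). Since the central directory's size isn't known up front, this grabs a fixed-size tail covering the largest possible end-of-central-directory record (22 bytes plus a 65535-byte comment) and scans backwards for its signature:

```typescript
async function fetchZipTail(url: string, tailSize = 65557): Promise<ArrayBuffer> {
  const head = await fetch(url, { method: 'HEAD' });
  const total = Number(head.headers.get('Content-Length'));
  const start = Math.max(0, total - tailSize);
  const res = await fetch(url, { headers: { Range: `bytes=${start}-${total - 1}` } });
  return res.arrayBuffer();
}

function findEOCD(tail: ArrayBuffer): number {
  const view = new DataView(tail);
  // 0x06054b50 is 'PK\x05\x06' read as a little-endian 32-bit integer.
  for (let i = view.byteLength - 22; i >= 0; i--) {
    if (view.getUint32(i, true) === 0x06054b50) {
      return i; // offset of the EOCD record within the tail
    }
  }
  return -1; // not found: not a ZIP file, or tail too small
}
```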

jvilk commented 9 years ago

That's the problem; the chunk at the end of the file is of variable-size, since the size of the TOC is proportional to the number of files in the ZIP file.

jvilk commented 9 years ago

It looks like you're on the right path, btw, regarding the unionfs. I wish I had a cleaner openFile abstraction.

I wouldn't use the File objects returned from the other file systems. I would translate open calls into readFileSync calls, and wrap the data into a new File object for UnionFS. The File class has close and sync methods that are called when the file is closed or synced. If a file is opened with a write mode and the program writes to it, the writes won't be visible to the filesystem until the file is closed. Each writable file system extends a base file class and implements those two methods.

I cannot remember if I set a 'dirty' bit on the File when it is modified, or if BrowserFS merely writes back the file every time it is opened and closed. If the dirty bit is not there, it would be a good idea to add it to the base file system class so all file systems get this optimization.
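A sketch of that wrapping (hypothetical shape; the real File class lives in `src/core/file.ts` and has more methods):

```typescript
interface WritableFS {
  writeFileSync(path: string, data: Uint8Array): void;
}

class UnionFile {
  private dirty = false;

  constructor(
    private fs: WritableFS,
    private path: string,
    private buffer: Uint8Array // filled via readFileSync at open time
  ) {}

  public writeSync(data: Uint8Array, offset: number): void {
    // Assumes the write fits within the existing buffer.
    this.buffer.set(data, offset);
    this.dirty = true;
  }

  public syncSync(): void {
    // Only touch the writable backend if the contents actually changed.
    if (this.dirty) {
      this.fs.writeFileSync(this.path, this.buffer);
      this.dirty = false;
    }
  }

  public closeSync(): void {
    // NFS-style sync-on-close semantics.
    this.syncSync();
  }
}
```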

dreamlayers commented 9 years ago

What is the actual problem with returning File objects from the two underlying file systems? The only issue I can think of is if a file is opened read-only, and then opened a second time for writing. The read-only user will keep using the file in the read-only file system, and the write user will keep using the file in the read-write file system. This means the read-only user won't see any changes.

BrowserFS in general does not seem to be designed for this usage. LocalStorage uses SyncKeyValueFile which reads the whole file when opened, and only writes changes on syncSync() or closeSync(). In openFileSync(), a new SyncKeyValueFile is always returned, so multiple opens of the same file will return separate file objects.

I was thinking of creating a file object which just routes operations and does copy on write. If a file exists only in the RO FS, read operations will be routed there at first. Then when the file is modified, copy on write happens, the RO file is closed, the RW file is opened, and all operations go to the RW file. This would avoid having to copy file contents from the RO FS to UnionFS every time a file is opened. However, it is pointless if the RW file system doesn't make changes visible.

I see no dirty bit. In SyncKeyValueFileSystem even read-only file opens lead to rewrites when the file is closed. I agree adding that would be an improvement.

jvilk commented 9 years ago

When you sync or close a file, it calls the sync or close method on the File object, which communicates with the backend it originated with.

So if you open a file, modify it, and close it, it will sync with its originating file system -- which is not the hybrid file system. You'd want to wrap the file object, and only write it to the writable file system if it is modified.

All BrowserFS file systems currently use NFS-style "sync-on-close" semantics. This is not a hard requirement, but I haven't had any incentive to do anything else.

> I see no dirty bit. In SyncKeyValueFileSystem even read-only file opens lead to rewrites when the file is closed. I agree adding that would be an improvement.

Yeah, that would be an improvement. Perhaps we can modify the File interface to have a method that exposes the dirty bit, e.g. isModified, letting you nicely wrap any file object passed your way from the file systems you wrap.

dreamlayers commented 9 years ago

Currently there is absolutely nothing to be done in UnionFS when you sync or close a file. UnionFS does not have any internal state which needs to be changed or persisted. It does not ever directly store file data itself; all data is in the two file systems which it is joining. Only the read-write filesystem needs to do something in response to those operations. (The read-only file system could update the file's access time, but I don't think that is implemented in any file system, and it's not needed.)

What should UnionFS do in response to sync or close? I know you suggested modifying things so UnionFS keeps a copy of opened file data and writes modified files to the read-write file system on close. What would be the benefit of doing this?

The only benefit I can think of is not doing copy-on-write when a file is opened for writing, and instead only copying when an actual write occurs.

jvilk commented 9 years ago

I see. You're eagerly creating the file in the RW file system when it's opened with a writable flag, so it's not an issue.

It will be an issue when you add support for an async backing store, which is when you'll need to intercept sync/close events and proxy them to two places.

dreamlayers commented 9 years ago

I understand. It is impossible to deal with an async file system the way I'm currently doing it, because data isn't available after the initial call and is only available after a callback. How can I deal with this?

For example, consider a scenario where a game program is in a ZIP file and saved game files are in IndexedDB. When the game is running, it would expect to be able to call fopen(), call fread() and have the data available. The only way to make this work seems to be to load everything that might be needed from IndexedDB into an InMemoryFileSystem or similar functionality within UnionFS. Loading would be done via a single async operation at startup, via readdir(), starting from the root or prefix given to the UnionFS constructor, loading from IndexedDBFileSystem. Then sync and close can work, although the synchronous API won't be able to report any errors from the async file system because those are reported via callbacks. Since a copy is kept in memory, any written data will be accessible via the synchronous API. The UnionFS API should provide a way to set up a callback where async errors can be reported.
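Something along these lines, I imagine (callback-style sketch with hypothetical minimal interfaces):

```typescript
interface AsyncFS {
  readdir(p: string, cb: (err: Error | null, entries?: string[]) => void): void;
  isDir(p: string, cb: (err: Error | null, isDir?: boolean) => void): void;
  readFile(p: string, cb: (err: Error | null, data?: Uint8Array) => void): void;
}
interface SyncFS {
  mkdirSync(p: string): void;
  writeFileSync(p: string, data: Uint8Array): void;
}

// Recursively walk the async FS under `dir` (the constructor's prefix)
// and copy every file into the synchronous in-memory FS.
function preload(src: AsyncFS, dst: SyncFS, dir: string, done: (err?: Error) => void): void {
  src.readdir(dir, (err, entries) => {
    if (err || !entries) { return done(err || new Error('readdir failed')); }
    let pending = entries.length;
    if (pending === 0) { return done(); }
    const step = (e?: Error) => {
      if (pending <= 0) { return; }            // already finished or errored
      if (e) { pending = 0; return done(e); }  // report the first error once
      if (--pending === 0) { done(); }
    };
    for (const name of entries) {
      const p = dir + '/' + name;
      src.isDir(p, (e, isDir) => {
        if (e) { return step(e); }
        if (isDir) {
          dst.mkdirSync(p);
          preload(src, dst, p, step);          // recurse into subdirectory
        } else {
          src.readFile(p, (e2, data) => {
            if (e2 || !data) { return step(e2 || new Error('read failed')); }
            dst.writeFileSync(p, data);
            step();
          });
        }
      });
    }
  });
}
```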

Another alternative would be to use Emscripten's emterpreter to make asynchronous operations appear synchronous. This should work well for Em-DOSBox, but some programs may have serious performance penalties because all functions on the stack when an async operation occurs need to be interpreted by JS instead of being compiled to JS. I also guess this isn't a good option because it would involve deep integration with Emscripten and would be Emscripten-specific.

jvilk commented 9 years ago

> The only way to make this work seems to be to load everything that might be needed from IndexedDB into an InMemoryFileSystem or similar functionality within UnionFS.

This is exactly what I was proposing.

> Then sync and close can work, although the synchronous API won't be able to report any errors from the async file system because those are reported via callbacks.

There shouldn't be any errors, because you are keeping the file systems in lock-step, and (hopefully) nothing else is modifying the file system. You will need to handle creating a FIFO queue for async file system operations to ensure they occur in the same order. There's no hard guarantee that async file systems will complete operations in the same order that they are received.

If there is an error... you could delete the file system, clear the FIFO queue, and re-write it with the contents of the in-memory file system. Or report it to the user as an error.

dreamlayers commented 9 years ago

> You will need to handle creating a FIFO queue for async file system operations to ensure they occur in the same order. There's no hard guarantee that async file systems will complete operations in the same order that they are received.

I assume this means it would be wrong to start a second operation before the callback from the first operation returns. When the callback from one operation returns with success status, you know that operation is done and the next operation from the queue can be started. I assume this can be done from the callback function or a function called from it without any recursion depth or concurrency issues making it impossible or inadvisable.
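That is, something like this (hypothetical helper, not part of BrowserFS):

```typescript
// Each operation takes a completion callback; the next operation starts
// only after the previous one reports back.
type AsyncOp = (done: (err?: Error) => void) => void;

class OpQueue {
  private ops: AsyncOp[] = [];
  private running = false;

  constructor(private onError: (err: Error) => void) {}

  public enqueue(op: AsyncOp): void {
    this.ops.push(op);
    if (!this.running) { this.next(); }
  }

  private next(): void {
    const op = this.ops.shift();
    if (!op) { this.running = false; return; }
    this.running = true;
    op(err => {
      if (err) { this.onError(err); }
      // Starting the next op from the callback avoids unbounded recursion
      // as long as the underlying operations complete asynchronously
      // (each callback then runs on a fresh stack).
      this.next();
    });
  }
}
```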

> If there is an error... you could delete the file system, clear the FIFO queue, and re-write it with the contents of the in-memory file system. Or report it to the user as an error.

I guess the most probable error would be running out of quota, though it seems IndexedDB quota is quite large. I think it would be okay to at first just report the error via a callback, and leave more complex handling for later.

It seems both local storage and IndexedDB are per origin (host, protocol, and port combination). Multiple pages in a single origin may need their own storage. Putting it all in the same place would create a mess and possible file name conflicts. Because of this, I made one of the constructor parameters a prefix for paths in the read-write file system. I hope this is a reasonable choice for separating data from different pages.

For example, the game Doom is set up with the prefix `/doom/`. It runs in the root directory and saves a game to `/doomsav0.dsg`. It ends up written to `/doom/doomsav0.dsg` in IndexedDBFileSystem but is seen by Doom as `/doomsav0.dsg`. Doom sees anything in `/doom/` as being in its root directory, and it doesn't see anything in the real root or other subdirectories of root in IndexedDB. Another game could similarly have its own storage in `/game2/`. UnionFS only loads into memory files located under the prefix given to the constructor.
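The path translation itself is trivial (names hypothetical):

```typescript
// '/doomsav0.dsg' with prefix '/doom/' -> '/doom/doomsav0.dsg'
function toBackendPath(prefix: string, p: string): string {
  return prefix.replace(/\/$/, '') + p;
}

// '/doom/doomsav0.dsg' -> '/doomsav0.dsg'; null if the path belongs to
// another page's storage and should be invisible here.
function fromBackendPath(prefix: string, p: string): string | null {
  const base = prefix.replace(/\/$/, '');
  return p.startsWith(base + '/') ? p.slice(base.length) : null;
}
```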

Since only the files needed by one web page are in memory, the entire filesystem cannot be rebuilt from files in memory. Hopefully, errors other than hardware failure won't trash everything.

jvilk commented 9 years ago

> I assume this means it would be wrong to start a second operation before the callback from the first operation returns.

Correct. While I actually think that the implementations of all of the browser async interfaces are totally serial, that's not guaranteed by the spec and is subject to change. (And it's probably not true for Dropbox.)

> When the callback from one operation returns with success status, you know that operation is done and the next operation from the queue can be started.

Yup! That's exactly what I've been advocating for.

> I assume this can be done from the callback function or a function called from it without any recursion depth or concurrency issues making it impossible or inadvisable.

I believe I have carefully constructed the code to always reset stack depth via setImmediate or another alternative to avoid stack overflows. It should be handled in the node_fs.ts file, where we wrap the user-provided callback in that logic.
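Roughly this shape (the real logic lives in node_fs.ts; this sketch assumes a setImmediate implementation or shim is available):

```typescript
// Deferring the user's callback resets the stack, so chained operations
// cannot overflow it even if a backend calls back synchronously.
function wrapCb<T extends Function>(cb: T): T {
  const wrapped = function (...args: any[]): void {
    setImmediate(() => cb.apply(null, args));
  };
  return wrapped as any as T;
}
```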

> It seems both local storage and IndexedDB are per origin (host, protocol, and port combination).

Correct, but:

> I guess the most probable error would be running out of quota...

You can expand your quota, I think.

> Since only the files needed by one web page are in memory, the entire filesystem cannot be rebuilt from files in memory. Hopefully, errors other than hardware failure won't trash everything.

Yeah, in your scheme, you would wipe out the /doom directory and rebuild.

jvilk commented 9 years ago

I'm working on the synchronous version in the overlayfs branch, which would allow you to marry a zip file to e.g. localStorage or in-memory storage:

https://github.com/jvilk/BrowserFS/tree/overlayfs

Once that's done, I'll do an async version so you can mirror changes to asynchronous storage.

For the IA case, I'm imagining doing the following:

`OverlayFS(zipFile, AsyncMirrored(inMemory, indexedDB));`

In other words, changes to files in zipFile (+ new files) are written to the async mirrored file system in the manner that I proposed above.

jvilk commented 9 years ago

I should mention that I'm doing this since @dreamlayers seems to have stopped, and his approach was hardcoded to the localStorage/zipFile case. So I hope I'm not trampling on anybody's toes here!

timdream commented 9 years ago

Woot! That's exactly what I was going to propose -- a way to marry a sync filesystem with an async one. That should solve the archive.org use case perfectly :)

jvilk commented 9 years ago

This is done, and works. See this message for details.