fntlnz / webfs


Remote Storage #7

Open fntlnz opened 8 years ago

fntlnz commented 8 years ago

Rationale

The most important missing part of WebFs is the Remote Storage. I opened this issue to track possible ways to implement it.

File / FileChunk

Files need to be read and written in chunks, so I thought that a File can be represented as a container of FileChunk items, like this:

[Diagram: file-filechunk, a File as a container of FileChunk items]

The File is intended as a Remote file, or better, as a file that is handled by the Remote Storage.

The Remote Storage is the part of WebFs that handles file persistence to, and retrieval from, remote storages (gist, pastebin).

A FileChunk is a single piece of the file, mapped to one uploaded file on the Remote Storage.
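As a rough sketch of the relationship described above, the two types could look like the following. The field names (`index`, `remote_id`, `size`, `name`) are assumptions made for illustration; none of them come from the WebFs code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileChunk:
    """One piece of a file, mapped to one uploaded file on the Remote Storage."""
    index: int        # position of the chunk within the file (hypothetical field)
    remote_id: str    # identifier of the uploaded file, e.g. a gist id (hypothetical field)
    size: int         # encoded size of this chunk, units still undecided


@dataclass
class File:
    """A Remote file: a container of FileChunk items."""
    name: str
    chunks: List[FileChunk] = field(default_factory=list)

    def size(self) -> int:
        # The file size is the sum of the sizes of its chunks.
        return sum(chunk.size for chunk in self.chunks)
```

A File built from two chunks of 200 and 180 would then report a size of 380.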

Chunk considerations

If the encoded content of a File has size 1000 and we decide that the max chunk size is 200, we end up with a File containing 5 chunks of 200 characters each.

If the encoded content of a File has size 980, we get a File containing 5 chunks: the first 4 of 200 and the last one of 180.

Note: I deliberately haven't specified units of measure; I still have to reason about them.
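The arithmetic above can be checked with a minimal sketch. The function name `split_into_chunks` is made up for this example, and sizes are counted in characters only because the units are still open.

```python
from typing import List


def split_into_chunks(encoded: str, max_chunk_size: int) -> List[str]:
    """Split encoded content into pieces of at most max_chunk_size characters."""
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    # Every chunk is full except possibly the last one, which holds the remainder.
    return [
        encoded[start:start + max_chunk_size]
        for start in range(0, len(encoded), max_chunk_size)
    ]


sizes_1000 = [len(c) for c in split_into_chunks("x" * 1000, 200)]
sizes_980 = [len(c) for c in split_into_chunks("x" * 980, 200)]
```

Here `sizes_1000` is five chunks of 200, and `sizes_980` is four chunks of 200 followed by one of 180.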

Local File => Remote File: considerations

On the Remote Storage, files are written scattered across different chunks. Unfortunately, the client (in our case the FUSE command line application) might not use the same chunk size, so it is necessary to keep a local buffer for write/read operations.

Write Behavior

On write, each part of the local file is first written to a local buffer. The buffer waits until it holds a complete chunk (matching the Remote chunk size) and then hands it to the configured Remote Storage Adapter, which writes it to the Remote Storage. When the last part of the local file has been written, the write is considered complete.
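A minimal sketch of this write path, under some assumptions: the `WriteBuffer` class and its `upload` callback are hypothetical stand-ins for the buffer and the Remote Storage Adapter, sizes are counted in bytes, and the final partial chunk is flushed on `close()`, which is one possible reading of "the write is considered complete".

```python
from typing import Callable


class WriteBuffer:
    """Accumulates local writes and emits chunks of the Remote chunk size."""

    def __init__(self, remote_chunk_size: int, upload: Callable[[bytes], None]):
        if remote_chunk_size <= 0:
            raise ValueError("remote_chunk_size must be positive")
        self._size = remote_chunk_size
        self._upload = upload          # stand-in for the Remote Storage Adapter
        self._pending = bytearray()

    def write(self, data: bytes) -> None:
        # Local writes may arrive in any size; keep them until a full chunk exists.
        self._pending.extend(data)
        while len(self._pending) >= self._size:
            self._upload(bytes(self._pending[:self._size]))
            del self._pending[:self._size]

    def close(self) -> None:
        # The last part of the file may be shorter than a full chunk.
        if self._pending:
            self._upload(bytes(self._pending))
            self._pending.clear()


uploaded = []
buffer = WriteBuffer(200, uploaded.append)
for _ in range(7):                     # seven local writes of 140 bytes = 980 bytes
    buffer.write(b"a" * 140)
buffer.close()
uploaded_sizes = [len(chunk) for chunk in uploaded]
```

With local writes of 140 and a Remote chunk size of 200, the adapter receives four chunks of 200 and a final chunk of 180.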

Read Behavior

On read, the chunks are read from the Remote Storage Adapter in FIFO order into a local buffer. The buffer waits until it holds a complete chunk (matching the Local storage chunk size) and then passes it to the Local Storage. When the last chunk of the Remote file has been delivered, the read is complete.
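The read path can be sketched the same way, in the opposite direction. The `rechunk` generator below is a hypothetical helper, not part of WebFs: it assumes remote chunks arrive as an ordered iterable of bytes and that the trailing partial piece is delivered as-is.

```python
from typing import Iterable, Iterator


def rechunk(remote_chunks: Iterable[bytes], local_chunk_size: int) -> Iterator[bytes]:
    """Consume remote chunks in FIFO order and yield pieces of the local chunk size."""
    if local_chunk_size <= 0:
        raise ValueError("local_chunk_size must be positive")
    pending = bytearray()
    for chunk in remote_chunks:
        pending.extend(chunk)
        # Emit as many complete local-sized pieces as the buffer currently holds.
        while len(pending) >= local_chunk_size:
            yield bytes(pending[:local_chunk_size])
            del pending[:local_chunk_size]
    # After the last remote chunk, hand over whatever is left.
    if pending:
        yield bytes(pending)


remote = [b"a" * 200] * 4 + [b"a" * 180]   # a 980-byte remote file
local_pieces = list(rechunk(remote, 128))
```

For a 980-byte remote file and a local chunk size of 128, this yields seven pieces of 128 and a final piece of 84.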

Related things

A few attempts have already been made to implement the Remote Storage.

fntlnz commented 8 years ago

In the meantime I removed writeChunk from filesystem, since it belongs with the remote storage: the filesystem is not in charge of handling file storage.

I also removed the storage code that had been put there without much thought, so that it can be done properly following this issue.