farhoud opened this issue 2 years ago
Glad you are thinking about this edge case. I guess there are two potential ways there could be a discrepancy. Is that correct?
Standardizing the records on the graph protocol makes the API opinionated and invites hacks on the developer side. Besides, I don't think there is any way to come up with a standard for storing all file-related data. As @farhoud mentioned, the use case here is to add some related records after the file upload is complete. We have two different types of file-related records here.
We can add support for collections of files as a kind of metadata. This would help applications create permanent collections and search within them if they want (or list them for a file manager).
In conclusion, I think we cannot (and should not) prevent the data protocol and file protocol states from diverging. What we can do is provide a way to create permanent collections of files, with support for nested collections as well.
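To make this concrete, here is a minimal sketch (in TypeScript) of the shape a permanent collection record could take; every field name here is an illustrative assumption, not a proposed API:

```ts
// Sketch only: a possible shape for a "permanent collection" record.
// All field names are illustrative assumptions, not an agreed standard.
interface FileEntry {
  cid: string;            // IPFS CID of the file content
  name: string;           // human-readable file name
  addedAt: number;        // unix timestamp
}

interface Collection {
  id: string;             // permanent id, unique over the network
  name: string;           // human-readable, e.g. "todo-app-XXX/photos"
  files: FileEntry[];
  children: Collection[]; // nested collections
}
```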
@mehdibalouchi But if the data protocol state diverges from the file protocol state, for example if someone erases the files but the database is still there, or erases the database but the files are still there, what are the risks? How can they be synced again?
@ehsan6sha these issues are also present when you are using a traditional DB with a web2 arch.
> if someone erases the files but database is still there
If you are storing file paths (pointers to files on a file system) in a database and then for some reason the files get removed, there is nothing the DB can do about it. A common solution to this problem for centralized databases is to store the file inside the database records (e.g. as a blob). This is currently possible with the existing APIs, though with low performance, because orbitDB keeps all the data in RAM.
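As an illustration of the blob approach, a sketch assuming orbit-db v0.x's docstore and a local IPFS node via ipfs-http-client (the store and field names are made up):

```ts
import { create } from 'ipfs-http-client';
import OrbitDB from 'orbit-db';

// Sketch: keep the file content inside the database record itself (a "blob"),
// so a record can never point at a missing file. This trades RAM for
// consistency, since orbit-db keeps the whole store in memory.
async function storeFileAsBlob(bytes: Uint8Array, name: string) {
  const ipfs = create({ url: 'http://127.0.0.1:5001' }); // assumed local node
  const orbitdb = await OrbitDB.createInstance(ipfs);
  const db = await orbitdb.docs('files-as-blobs');       // assumed store name
  await db.load();
  await db.put({
    _id: name,
    content: Buffer.from(bytes).toString('base64'),      // inline the bytes
  });
}
```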
> or someone erases the database but files are still there
In this case, we will have some orphan files (as @farhoud mentioned). By keeping the file and graph protocols completely independent, we can ensure that any request for a file is made through the IPFS network, so we can rely on the IPFS garbage collector and pinning service to handle the orphan files.
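As a sketch of that reliance, assuming the js-ipfs/ipfs-http-client API: pin only the CIDs the data protocol still references, and let a normal GC pass reclaim the orphans:

```ts
import { create } from 'ipfs-http-client';
import { CID } from 'multiformats/cid';

// Sketch: pin exactly the CIDs the data protocol references;
// orphan files (referenced by nobody) are reclaimed by garbage collection.
async function reconcilePins(referencedCids: string[]) {
  const ipfs = create({ url: 'http://127.0.0.1:5001' }); // assumed local node
  for (const cid of referencedCids) {
    await ipfs.pin.add(CID.parse(cid)); // keep everything still referenced
  }
  // Unpinned (orphan) blocks are removed on the next GC pass.
  for await (const removed of ipfs.repo.gc()) {
    console.log('collected orphan block:', removed.cid?.toString());
  }
}
```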
> How can they be synced again?
The application is responsible for keeping the data protocol and file protocol states synchronized. This is where permanent collections can help: an application should be able to create a permanent collection (with a permanent id) and store files in it. At any time, if the application wants to resync the states, it can iterate over all the files in the collection and update the data protocol state.
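A minimal resync sketch, assuming the collection lives as an MFS directory and the data protocol state is an orbit-db docstore (the store name and record fields are illustrative):

```ts
import { create } from 'ipfs-http-client';
import OrbitDB from 'orbit-db';

// Sketch: rebuild the data-protocol state by walking every file
// in a permanent collection and upserting one record per file.
async function resyncCollection(collectionPath: string) {
  const ipfs = create({ url: 'http://127.0.0.1:5001' }); // assumed local node
  const orbitdb = await OrbitDB.createInstance(ipfs);
  const db = await orbitdb.docs('file-records');         // assumed store name
  await db.load();

  // collectionPath is an MFS path, e.g. '/todo-app-XXX/photos'
  for await (const entry of ipfs.files.ls(collectionPath)) {
    await db.put({
      _id: `${collectionPath}/${entry.name}`,
      cid: entry.cid.toString(),
      size: entry.size,
    });
  }
}
```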
The collections should have a human-readable identifier (e.g. todo-app-XXX/photos). The combination of [USER_ID, APP_ID, COLLECTION_ID] must be unique over the network.
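One possible reading of that uniqueness rule, purely as an illustration: derive the collection key from all three parts, so two users or apps can never collide (the separator and layout here are assumptions, not a spec):

```ts
// Sketch: a network-unique collection key derived from the triple.
function collectionKey(userId: string, appId: string, collectionId: string): string {
  return `${userId}/${appId}/${collectionId}`; // e.g. "alice/todo-app-XXX/photos"
}
```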
@ehsan6sha @keyvan-m-sadeghi @mehdibalouchi
We can use UnixFS and encode our folder structure inside IPFS (not orbitdb), then store the unix path of each file in OrbitDB. This way divergence is detectable, and an app like Drive can use the data structure to show files. Or we could take inspiration from this and apply security on top of it. Never mind, this is what MFS already uses, and replicating the DAG is harder than it looks.
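Even so, the divergence check proposed above is easy to picture. A sketch, assuming the app stored an MFS path plus the CID it expects (the record shape is an assumption):

```ts
import { create } from 'ipfs-http-client';

// Sketch: detect divergence between a stored record and the actual file state.
// `record` is whatever the app kept in orbit-db: an MFS path and the expected CID.
async function hasDiverged(record: { path: string; cid: string }): Promise<boolean> {
  const ipfs = create({ url: 'http://127.0.0.1:5001' }); // assumed local node
  try {
    const stat = await ipfs.files.stat(record.path);
    return stat.cid.toString() !== record.cid; // file changed under the record
  } catch {
    return true; // file is gone entirely: the states have diverged
  }
}
```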
@mehdibalouchi the thing is, in the centralized web2 model it is system admins who have access to the data, so divergence is minimized and handled by an expert. Here it is the public who has access, and divergence can be a killer for the experience.
Overview
Uploading a file in an application needs two steps:
1. file protocol
2. graph protocol (like getting the list of files)

Known Issues:
- Scenarios ending up with a split state: one step can succeed while the other fails, leaving the file protocol and graph protocol states diverged (see the sketch below).
- File manager (like Google Drive): the record that keeps the cid is not standardized, which makes it hard to get the list of files on a box without knowing the implementation of the app that stored them.
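To make the failure mode concrete, a sketch of the two steps (assuming ipfs-http-client and orbit-db v0.x; the store and field names are illustrative). Nothing spans both protocols transactionally, so a crash between step 1 and step 2 leaves the split state described above:

```ts
import { create } from 'ipfs-http-client';
import OrbitDB from 'orbit-db';

// Sketch of the two-step upload. A failure between step 1 and step 2
// leaves a file on the file protocol with no record on the graph protocol.
async function uploadFile(bytes: Uint8Array, name: string) {
  const ipfs = create({ url: 'http://127.0.0.1:5001' }); // assumed local node

  // Step 1, file protocol: add the content and get its CID.
  const { cid } = await ipfs.add(bytes);

  // Step 2, graph protocol: record the CID.
  const orbitdb = await OrbitDB.createInstance(ipfs);
  const db = await orbitdb.docs('file-records');         // assumed store name
  await db.load();
  await db.put({ _id: name, cid: cid.toString() });
}
```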