Doloops / mcachefs

mcachefs : Simple filesystem-based file cache based on fuse

A couple of questions related to the metafile and journal! #28

Open hradec opened 3 years ago

questions:


  1. What do you think about using SQLite to store the metadata and journal?

Using SQLite would make the mcachefs metadata and journal easier to debug and maintain. SQLite also has three threading modes (single-thread, multi-thread, and serialized), and in WAL mode readers can run concurrently with a writer.
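A minimal sketch of the idea, assuming a hypothetical schema (none of these table or column names come from mcachefs; they only illustrate how the metadata and journal could live side by side in one database):

```python
import sqlite3

# Hypothetical schema: illustrative only, it does not mirror the real
# mcachefs metafile or journal layout.
SCHEMA = """
CREATE TABLE IF NOT EXISTS metadata (
    path  TEXT PRIMARY KEY,   -- path relative to the mount point
    size  INTEGER NOT NULL,
    mtime INTEGER NOT NULL,
    mode  INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS journal (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- preserves operation order
    op   TEXT NOT NULL,                      -- e.g. 'create', 'write', 'unlink'
    path TEXT NOT NULL
);
"""

def open_db(filename=":memory:"):
    """Open (or create) the database and make sure both tables exist."""
    db = sqlite3.connect(filename)
    db.executescript(SCHEMA)
    return db

def record_write(db, path, size, mtime, mode=0o644):
    """Update the metadata row and append a journal entry in one transaction."""
    with db:  # commits on success, rolls back if an exception is raised
        db.execute(
            "INSERT OR REPLACE INTO metadata (path, size, mtime, mode) "
            "VALUES (?, ?, ?, ?)",
            (path, size, mtime, mode),
        )
        db.execute("INSERT INTO journal (op, path) VALUES ('write', ?)", (path,))

def locally_modified(db):
    """Return the paths that have at least one pending journal entry."""
    rows = db.execute("SELECT DISTINCT path FROM journal ORDER BY path")
    return [row[0] for row in rows]
```

With something like this, an external tool could inspect or repair the cache state with plain SQL instead of parsing a custom binary file.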


  2. Would it be possible to get some documentation on the metadata structure and manipulation functions?

It's been pretty hard for me to get my head around how the metadata/journal works and where they are updated and queried.

If you could write down a bit about how the metadata and journal files are structured, and where in the code they are created, updated, and only read, it would help me a lot.

There are a lot of things I want to write to manipulate the metadata and journal outside mcachefs, for debugging and management purposes (including a separate thread that refreshes the metadata after a timeout, independently of file access, so we can pick up backend changes without a re-mount), but it's been nearly impossible to make progress. The metadata/journal handling feels spread all over the code, and I can't get my head around it!
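To make the refresh-thread idea concrete, here is a rough sketch. Everything in it is an assumption for illustration: `last_checked` stands in for whatever per-entry timestamp the metadata would need, and `refresh` stands in for a routine that re-reads one entry from the backend.

```python
import threading
import time

def stale_paths(last_checked, now, timeout):
    """Return paths whose metadata has not been re-checked for `timeout` seconds."""
    return sorted(p for p, t in last_checked.items() if now - t >= timeout)

class MetadataRefresher(threading.Thread):
    """Periodically refresh stale entries, independently of file access."""

    def __init__(self, last_checked, refresh, timeout, interval):
        super().__init__(daemon=True)
        self.last_checked = last_checked  # path -> time of last backend check
        self.refresh = refresh            # hypothetical callback: refresh(path)
        self.timeout = timeout            # age in seconds before an entry is stale
        self.interval = interval          # seconds to sleep between scans
        self._stop_event = threading.Event()

    def run(self):
        # Event.wait returns False when the interval elapses and True once
        # stop() has been called, which ends the loop.
        while not self._stop_event.wait(self.interval):
            now = time.time()
            for path in stale_paths(self.last_checked, now, self.timeout):
                self.refresh(path)
                self.last_checked[path] = now

    def stop(self):
        self._stop_event.set()
```

In a real filesystem the metadata would be shared with the FUSE handlers, so access to it would need proper locking, which this sketch leaves out.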

There's also extra information I would like to add to the metadata (or maybe retrieve from the journal) so mcachefs can update an already cached file only when it has changed on the backend, but not when it has changed locally. Right now, I have a change that retrieves the backend copy of a file whenever its mtime differs from the local one, regardless of whether the local copy was modified locally. Checking whether the local mtime is newer doesn't work either, because the backend can end up with a newer mtime if the file was modified there after the local cached copy was modified.

I guess I could find out from the journal whether the local copy was modified locally, but I'm not sure how to query that yet.
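The decision I have in mind could be sketched like this. It assumes two pieces of per-file information that are hypothetical here: the backend mtime recorded when the file was fetched, and a flag saying whether the local copy has been modified since (for example, derived from the journal).

```python
from dataclasses import dataclass
from enum import Enum

class Action(Enum):
    KEEP = "keep"          # cached copy is still valid
    REFETCH = "refetch"    # backend changed, local copy untouched
    CONFLICT = "conflict"  # both sides changed, do not overwrite local work

@dataclass
class CacheEntry:
    # Both fields are assumptions, not existing mcachefs metadata.
    backend_mtime_at_fetch: int  # backend mtime seen when the file was cached
    locally_modified: bool       # True if the local copy was changed afterwards

def decide(entry, backend_mtime_now):
    """Compare the backend against what it looked like at fetch time."""
    backend_changed = backend_mtime_now != entry.backend_mtime_at_fetch
    if not backend_changed:
        return Action.KEEP
    if entry.locally_modified:
        return Action.CONFLICT
    return Action.REFETCH
```

Comparing the backend mtime against the value stored at fetch time, rather than against the local mtime, avoids the "which one is newer" problem entirely.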

I also want to implement some functionality to selectively send local modifications/creations of files back to the backend (for example, an "apply_journal" that will only send "*.exr" files). Again, I need more info on how the journal is structured and queried. In fact, I'm not using mcachefs to copy data back to the backend at all (apply_journal), because I don't trust it enough yet and I don't want to lose data on the backend.
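A sketch of the selective filter, assuming journal entries can be read as simple records with a `path` field (the real journal format is exactly what I don't know, so the record shape here is made up):

```python
from fnmatch import fnmatchcase
import posixpath

def split_journal(entries, pattern):
    """Split journal entries into (selected, remaining) by file-name pattern.

    `entries` is a list of dicts with at least a "path" key (hypothetical
    record shape). Only the file name is matched, case-sensitively.
    """
    selected, remaining = [], []
    for entry in entries:
        name = posixpath.basename(entry["path"])
        if fnmatchcase(name, pattern):
            selected.append(entry)
        else:
            remaining.append(entry)
    return selected, remaining
```

The selected entries would be sent to the backend, while the remaining ones stay in the journal untouched, so nothing is lost by applying only part of it.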

anyways... some info on those two would be amazing, if possible!!


Conclusion:

I think switching to SQLite would make the code and functionality much easier to debug and maintain and, most importantly, would make it much easier to implement new features and parallel maintenance of the cache.

But even to try it myself and implement SQLite as a test for comparison, I'll have to spend a lot of time figuring out this whole metadata/journal functionality. If you could document it just a bit, it would be a great help!

thanks again... -H