I hate losing data. I'm also not a fan of data models that can represent inconsistent state, but functional necessity beats philosophical design preference.
As I try to build version 0.0.1 features, I'm increasingly confronted by the need to retain certain group-level properties.
These are properties that undeniably belong to the group and not to any one file.
If we were only talking about file-level properties, xattrs would be sufficient.
Group-level data can't be handled in the same fashion.
Scattering it among the constituent files is a no-go, because:
There are N different ways the data can become inconsistent.
Should it ever be updated, there are N different values to update.
File systems don't support transactions across file boundaries, so those N updates can't be applied atomically.
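To make the partial-update problem concrete, here is a small Python sketch. It is only an illustration: the one-value-per-file layout stands in for per-file xattrs, and `update_scattered`, `distinct_values`, `demo`, and the `fail_after` parameter are hypothetical names used to simulate an interruption, not part of the system.

```python
import tempfile
from pathlib import Path


def update_scattered(paths, value, fail_after=None):
    """Write `value` to every file in turn, optionally simulating a crash.

    Each file stands in for one member's private copy of a group property.
    """
    for count, path in enumerate(paths):
        if fail_after is not None and count == fail_after:
            raise RuntimeError("simulated crash mid-update")
        Path(path).write_text(value)


def distinct_values(paths):
    """Return the set of values currently held across all copies."""
    return {Path(path).read_text() for path in paths}


def demo():
    """Update 4 copies, then interrupt a second update after 2 writes."""
    with tempfile.TemporaryDirectory() as tmp:
        copies = [Path(tmp) / f"member-{i}.prop" for i in range(4)]
        update_scattered(copies, "old-name")
        try:
            update_scattered(copies, "new-name", fail_after=2)
        except RuntimeError:
            pass  # nothing rolls back the copies already rewritten
        return distinct_values(copies)
```

After the interrupted update, `demo()` reports two distinct values for what should be a single property, and the file system offers no way to roll back the copies that were already rewritten.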
The necessity derives from the fact that the import process irrevocably loses data, namely the progenitor file's name and attributes. This means stores, as they currently exist, are stateful entities, because the system does not have all the data needed to regenerate them. I would much prefer them to be like SQL views: something that can be regenerated at will from a single source of truth (the mono-collection).
The storage dimension of the problem stems from the fact that groups have no physical entity to which data can be anchored.
A candidate solution is to create directories for them and set xattrs on the directory.
The biggest drawback of this solution is that we lose the ability to use the by-order index as a "see-all" directory. But then again, its contents have no file extensions, so maybe it's all for the best?
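A minimal sketch of the directory-plus-xattrs candidate, in Python. It assumes Linux (where `os.setxattr`, `os.getxattr`, and `os.listxattr` are available) and a file system mounted with user xattr support. The helper names and the idea of storing each property as a UTF-8 string are assumptions for illustration, not a settled design.

```python
import os

# Application data goes in the "user." xattr namespace on Linux.
PREFIX = "user."


def set_group_property(group_dir, name, value):
    """Attach one group-level property to the group's directory."""
    os.setxattr(group_dir, PREFIX + name, value.encode("utf-8"))


def get_group_property(group_dir, name):
    """Read one group-level property back from the directory."""
    return os.getxattr(group_dir, PREFIX + name).decode("utf-8")


def list_group_properties(group_dir):
    """Return all user-namespace properties on the directory as a dict."""
    return {
        attr[len(PREFIX):]: os.getxattr(group_dir, attr).decode("utf-8")
        for attr in os.listxattr(group_dir)
        if attr.startswith(PREFIX)
    }
```

Because the property lives on exactly one inode, there is a single value to update rather than N, which is the point of giving the group a physical anchor.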
Another candidate solution is to just keep meta-data files for groups, each containing a list of its member files by ID.
The biggest issue with this solution is that if the files change, a gigantic scan has to happen in which every one of these meta-data files is read.
The gains of this approach are equally unimpressive: an entirely speculative reduction in the storage spent on directory structure, paid for with increased inode usage and being forced to either sacrifice readability or store file identifiers in an inefficient encoding.
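The meta-data-file candidate and its scan cost can be sketched as follows. The JSON layout, the `.json` suffix, and the names `write_manifest` and `groups_containing` are assumptions made for this example. JSON is used here as the readable but space-inefficient option.

```python
import json
from pathlib import Path


def write_manifest(meta_dir, group_id, member_ids, properties):
    """Write one meta-data file per group: its members by ID plus its properties."""
    path = Path(meta_dir) / f"{group_id}.json"
    payload = {"members": sorted(member_ids), "properties": properties}
    path.write_text(json.dumps(payload, indent=2))
    return path


def groups_containing(meta_dir, file_id):
    """Find every group that lists `file_id`.

    There is no reverse index, so every manifest has to be opened and read.
    """
    hits = []
    for path in sorted(Path(meta_dir).glob("*.json")):
        manifest = json.loads(path.read_text())
        if file_id in manifest["members"]:
            hits.append(path.stem)
    return hits
```

Answering "which groups does this file belong to?" costs one read per group, which is the gigantic scan described above.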