Open bo0tzz opened 1 year ago
If I might be so bold as to add some unsolicited advice..
You're very close to the concept of a generic blob storage interface, where filesystem storage is just a slightly weird-looking blob storage API. Blob storage for large files, like images, is a very common design pattern for modern networked systems. Though my link hasn't been updated for a few years, it's a good example of the direction you may want to head in with your implementation. I suspect you can find more modern options for TypeScript out there; searching from my phone is difficult.
The abstraction layer you described becomes blob operations, which then translate into actual blob API calls (filesystem write, S3 write, NFS share write, etc.).
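To make the idea of blob operations concrete, here is a minimal sketch in TypeScript. The names BlobStore, MemoryBlobStore and storeThumbnail are invented for illustration and do not come from Immich or from any library; a filesystem or S3 backend would implement the same interface with its own API calls.

```typescript
// Hypothetical interface: BlobStore, put, get and remove are illustrative
// names, not taken from Immich or any existing library.
interface BlobStore {
  put(key: string, data: Uint8Array): Promise<void>;
  get(key: string): Promise<Uint8Array | undefined>;
  remove(key: string): Promise<boolean>;
}

// In-memory backend, handy for tests. A filesystem, S3 or NFS backend would
// implement the same three methods using its own write/read/delete calls.
class MemoryBlobStore implements BlobStore {
  private readonly blobs = new Map<string, Uint8Array>();

  async put(key: string, data: Uint8Array): Promise<void> {
    // Copy the bytes so later changes to the caller's buffer do not leak in.
    this.blobs.set(key, data.slice());
  }

  async get(key: string): Promise<Uint8Array | undefined> {
    return this.blobs.get(key);
  }

  async remove(key: string): Promise<boolean> {
    // Map.delete returns true only if the key existed.
    return this.blobs.delete(key);
  }
}

// Application code depends only on the interface, never on a concrete backend.
async function storeThumbnail(
  store: BlobStore,
  assetId: string,
  bytes: Uint8Array,
): Promise<string> {
  const key = `thumbs/${assetId}`;
  await store.put(key, bytes);
  return key;
}
```

With this shape, swapping the storage backend means passing a different BlobStore implementation; callers such as storeThumbnail stay unchanged.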
That's very helpful, thank you!
Does Immich support S3 storage currently? I saw a related PR was merged some time ago, but I can't figure out how to set it up.
This is probably what you are thinking of: https://github.com/immich-app/immich/discussions/1683#discussioncomment-6206105
I just wanted to point to Apache OpenDAL, which is used quite a bit in the big data ecosystem. It is a unified storage layer supporting many different storage systems, among them S3 and local POSIX file systems.
It also has Node.js bindings.
Feature detail
At the moment, Immich does not have an abstracted storage layer. On upload, files are stored in the semi-hardcoded library path with a randomly generated filename, their path is stored in the database, and in any future (read) operations this stored path is used (file serving, thumbnail generation, etc).
For several of the features we're meaning to (potentially) implement in the future (e.g. #34, #418, #451), it will be very helpful to refactor and abstract the storage layer. For some of them, like supporting multiple storage backends, it will be strictly necessary. In this issue I want to propose a design, although it will need some more discussion and refinement before it is complete.
As mentioned above, currently the storage path for a file is generated once and stored in the database. I propose that we instead move to a model where storage paths are built on the fly based on the data we have for an asset. We already use some of that data to build the path on upload right now:
Instead, when trying to write or read an asset, the storage layer would expose a function for that which accepts the AssetEntity (or a more limited set of data, if desired). The storage implementation then uses that internally, together with some configuration, to build the actual path. That way, things like the storage path become an implementation detail that does not need to be exposed to the rest of Immich.
I think it would be good to keep the storage providers as self-contained as we can, and avoid having them do things like access the database. Instead, a provider would take in a configuration when initializing (e.g. the root path where to store files, S3 access credentials, or a template for the filename). That configuration can of course be read from the database by whatever code initializes the provider.
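As a rough sketch of what building the path on the fly could look like, the following assumes a local provider that is initialized with a root path and a filename template. AssetInfo, LocalStorageConfig, LocalPathBuilder and the {placeholder} template syntax are all hypothetical; the real AssetEntity has more fields, and the field names here are assumptions.

```typescript
// Hypothetical "limited set of data" about an asset; field names are assumed.
interface AssetInfo {
  id: string;
  ownerId: string;
  originalFileName: string; // e.g. "IMG_0001.jpg"
  createdAt: Date;
}

// Hypothetical configuration handed to the provider on initialization.
interface LocalStorageConfig {
  rootPath: string; // e.g. "/data/library"
  pathTemplate: string; // e.g. "{ownerId}/{year}/{month}/{id}{ext}"
}

class LocalPathBuilder {
  private readonly config: LocalStorageConfig;

  constructor(config: LocalStorageConfig) {
    this.config = config;
  }

  // Derive the storage path from asset data and config; nothing is read
  // from the database here.
  pathFor(asset: AssetInfo): string {
    const dot = asset.originalFileName.lastIndexOf(".");
    const ext = dot > 0 ? asset.originalFileName.slice(dot).toLowerCase() : "";
    const values: Record<string, string> = {
      id: asset.id,
      ownerId: asset.ownerId,
      year: String(asset.createdAt.getUTCFullYear()),
      month: String(asset.createdAt.getUTCMonth() + 1).padStart(2, "0"),
      ext,
    };
    // Substitute each {field}; unknown fields are rejected instead of being
    // silently written into the path.
    const relative = this.config.pathTemplate.replace(
      /\{(\w+)\}/g,
      (match: string, name: string): string => {
        const value = values[name];
        if (value === undefined) {
          throw new Error(`Unknown template field: ${match}`);
        }
        return value;
      },
    );
    // Strip trailing slashes from the root so the join has a single separator.
    return `${this.config.rootPath.replace(/\/+$/, "")}/${relative}`;
  }
}
```

Because the path is a pure function of the asset data and the configuration, changing the template changes where every asset is expected to live, so a real implementation would also need a migration step for existing files.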
This will allow for a multitude of nice things:
tbd:
create, delete and stat. How about something like S3, which might be able to provide URLs for direct access (bypassing Immich)?
Platform
Server