Open asitemade4u opened 2 years ago
Here is how PHP Generator (which is quite thorough, as per usual) handles this request: External Image
Thanks @asitemade4u. There's definitely a case for externalizing attachments. Do you have any thoughts on how backups and snapshots should work? Should they include external attachments, or keep them separate?
Here are my ideas on the subject:
Related need here: we'd like to have an attachment column for hundreds of rows, with documents sometimes larger than 500 MB (paper scans…), so in an ideal world we'd like to store them in a separate service (like Minio). @paulfitz This is something we could try to work on; do you have any input on how this could be implemented?
Grist documents contain a _grist_Attachments table with metadata about individual attachments, but not their contents:
https://github.com/gristlabs/grist-core/blob/7dc49f3c850ea6cf7f7832d069088c36a200b93b/app/common/schema.ts#L145-L154
The fileIdent column is a key to a separate _gristsys_Files table that contains attachment contents:
https://github.com/gristlabs/grist-core/blob/94a7b750a8db2421174e671565cbe185e1067dbe/app/server/lib/DocStorage.ts#L73-L77
I'd suggest tweaking the handling of _grist_Attachments so that it can represent attachments that are stored externally. That could be by extending the meaning of fileIdent, or by adding an extra column or columns.
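To make the "extra column" option concrete, here is a minimal sketch. It is not Grist's actual schema: `externalStore`, `AttachmentRecord`, and `locateAttachment` are hypothetical names, and only `fileIdent` comes from the discussion above.

```typescript
// Hypothetical shape of a _grist_Attachments record. Only fileIdent comes from
// the discussion; externalStore is an assumed new column, not an existing one.
interface AttachmentRecord {
  id: number;
  fileIdent: string;
  // Assumed: null or absent means the content lives in _gristsys_Files.
  externalStore?: string | null;
}

// Where the bytes of an attachment should be fetched from.
type AttachmentLocation =
  | { kind: "internal"; fileIdent: string }
  | { kind: "external"; store: string; fileIdent: string };

// Existing records (no externalStore) keep resolving to in-document storage,
// so older documents and backups would continue to work unchanged.
function locateAttachment(rec: AttachmentRecord): AttachmentLocation {
  if (rec.externalStore) {
    return { kind: "external", store: rec.externalStore, fileIdent: rec.fileIdent };
  }
  return { kind: "internal", fileIdent: rec.fileIdent };
}
```

The point of the sketch is that a record with no external marker behaves exactly as it does today, which is one way to keep in-document attachments supported.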
As a practical matter, it would probably be necessary to continue to support in-document attachments, to avoid disruption to existing Grist installations and document backups.
There have been requests for an attachment-like UI that works with link-like attachments, e.g. to videos etc.
There are decisions that would need taking about management of external storage. For example: what happens if attachments are deleted within a document? Should that delete the externally stored attachment? Likewise (and related), what does copying a document mean now? Should external attachments also be duplicated? Life is easiest if (as @asitemade4u suggested) Grist just doesn't get involved in the lifecycle of external attachments at all, except (perhaps) in their initial creation.
For the UI: I suspect presigned URLs would be the way to go for uploading and viewing.
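For anyone unfamiliar with the idea, here is a rough sketch of what a presigned URL does: the server signs the method, path, and expiry with a secret, and the storage side checks that signature before serving or accepting the file. This is a simplified HMAC illustration using Node's `crypto` module, not the actual S3/Minio signing scheme; `presign`, `verify`, and the query parameter names are made up for the example.

```typescript
import { createHmac, timingSafeEqual } from "crypto";

// Sign "METHOD\npath\nexpires" with a shared secret (illustrative format only).
function sign(secret: string, method: string, path: string, expires: number): string {
  return createHmac("sha256", secret).update(`${method}\n${path}\n${expires}`).digest("hex");
}

// Build a URL that is valid for ttlSeconds. `now` is in milliseconds.
function presign(secret: string, method: string, path: string,
                 ttlSeconds: number, now: number = Date.now()): string {
  const expires = Math.floor(now / 1000) + ttlSeconds;
  return `${path}?expires=${expires}&signature=${sign(secret, method, path, expires)}`;
}

// Check that the URL has not expired and was signed for this method and path.
function verify(secret: string, method: string, url: string, now: number = Date.now()): boolean {
  const idx = url.indexOf("?");
  const path = idx < 0 ? url : url.slice(0, idx);
  const params = new URLSearchParams(idx < 0 ? "" : url.slice(idx + 1));
  const expires = Number(params.get("expires"));
  if (!Number.isFinite(expires) || expires < Math.floor(now / 1000)) {
    return false;
  }
  const expected = Buffer.from(sign(secret, method, path, expires));
  const given = Buffer.from(params.get("signature") ?? "");
  // Constant-time comparison; lengths must match before calling timingSafeEqual.
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

In a real deployment the storage service's own SDK would generate these URLs; the sketch only shows why the browser can upload or view a file directly without holding storage credentials.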
There's a lot more to say, there are a lot of options and it would not be a small project, but I think it could work out quite nicely.
How about adding filePath to _grist_Attachments? A null path could be interpreted as being stored in the database.
An easy way to handle the uploads may be /attachments/\<DOC ID>/\<fileIDENT>.\<ext>
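A quick sketch of that layout, assuming the filePath column proposed above. The helper names are hypothetical, and URL-encoding the segments is my own addition.

```typescript
// Hypothetical helper: builds /attachments/<DOC ID>/<fileIdent>.<ext>.
// Segments are URL-encoded so odd characters cannot break out of the folder.
function externalAttachmentPath(docId: string, fileIdent: string, ext: string): string {
  const cleanExt = ext.replace(/^\./, ""); // accept "pdf" or ".pdf"
  return `/attachments/${encodeURIComponent(docId)}/${encodeURIComponent(fileIdent)}.${cleanExt}`;
}

// A null filePath is read as "content is stored in the database", as proposed.
function isStoredInDocument(filePath: string | null): boolean {
  return filePath === null;
}
```

For example, `externalAttachmentPath("doc123", "abc", "pdf")` gives `/attachments/doc123/abc.pdf`.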
While backwards compatibility is a perfectly good reason to keep existing behavior, I also believe the new behavior should be made the default, with an environment variable or configuration setting available to revert to blobbing if desired. My reasoning is that keeping attachments as files would eliminate a lot of headache when working with attachments in custom widgets and formulas.
If you are worried about files modified elsewhere being loaded in the doc, Grist could compute an MD5 of the upload, store it as fileHash, and later check the file against that metadata to ensure it is unchanged. In this case, disabling rather than removing modified attachments from the Grist UI may be good, with a way for the user to override this for particular files. A boolean like allowExternallyModified could work. @paulfitz
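Roughly what I have in mind, as a sketch using Node's `crypto` module. fileHash and allowExternallyModified are the proposed fields; `checkAttachment` and the status names are made up for illustration.

```typescript
import { createHash } from "crypto";

// Hypothetical statuses: a modified file is disabled unless the user overrides.
type IntegrityStatus = "ok" | "modified-disabled" | "modified-allowed";

// MD5 hex digest of the file content (used for change detection, not security).
function md5Hex(content: Buffer | string): string {
  return createHash("md5").update(content).digest("hex");
}

// Compare the current content against the hash stored at upload time.
function checkAttachment(content: Buffer | string, fileHash: string,
                         allowExternallyModified: boolean): IntegrityStatus {
  if (md5Hex(content) === fileHash) {
    return "ok";
  }
  return allowExternallyModified ? "modified-allowed" : "modified-disabled";
}
```

MD5 is enough to notice accidental edits; if tampering is a concern, a stronger hash such as SHA-256 could be swapped in without changing the flow.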
From what I understand of Grist's philosophy, I get that you wish to keep all data in a single SQLite file, so that there is no risk of database discrepancy. However, some tools such as PHP Generator, Obsidian, or any web server for that matter, can fetch and then display an image using its URL or path on disk (relative to the folder of the datafile). I think it would be a valuable idea for Grist as: