sismics / docs

Lightweight document management system packed with all the features you can expect from big expensive solutions
https://teedy.io
GNU General Public License v2.0

Inconsistencies with files size when using encryption? #701

Closed: archiloque closed this issue 1 year ago

archiloque commented 1 year ago

Hello,

There seem to be inconsistencies / bugs in how the code deals with file size and encryption.

What seems to be happening

1) In com.sismics.docs.core.util.FileUtil#createFile we use the unencrypted file size to update the user quota, which seems to be the right approach. On the other hand, in com.sismics.docs.rest.resource.FileResource#delete it seems that the encrypted file's size is used when a file is deleted, which means the quota would become inconsistent.

2) In com.sismics.rest.util.RestUtil#fileToJsonObjectBuilder, which is used when a file size is returned by the API, the code uses the encrypted file's size, which feels wrong to me.
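To make point 1 concrete, here is a minimal sketch of the bookkeeping as I understand it. The class names, the `createThenDelete` helper and the 16-byte encryption overhead are all hypothetical and only stand in for the real code; the actual overhead depends on the cipher in use.

```java
// Hypothetical model of the quota bookkeeping described above; not the project's code.
class QuotaDriftSketch {
    /** Tracks the number of bytes charged against one user's quota. */
    static final class UserQuota {
        private long storageCurrent;
        void add(long bytes) { storageCurrent += bytes; }
        void subtract(long bytes) { storageCurrent -= bytes; }
        long current() { return storageCurrent; }
    }

    /** A stored file with both sizes; the overhead value is an assumption. */
    static final class StoredFile {
        final long plainSize;
        final long encryptedSize;
        StoredFile(long plainSize, long encryptionOverhead) {
            this.plainSize = plainSize;
            this.encryptedSize = plainSize + encryptionOverhead;
        }
    }

    /** Charges the plain size on creation, then refunds one of the two sizes on deletion. */
    static long createThenDelete(StoredFile file, boolean deleteUsesPlainSize) {
        UserQuota quota = new UserQuota();
        quota.add(file.plainSize);
        quota.subtract(deleteUsesPlainSize ? file.plainSize : file.encryptedSize);
        return quota.current();
    }

    public static void main(String[] args) {
        StoredFile file = new StoredFile(1000, 16);
        System.out.println(createThenDelete(file, false)); // prints -16: quota drifts
        System.out.println(createThenDelete(file, true));  // prints 0: quota stays consistent
    }
}
```

If the two sizes differ at all, every create/delete cycle leaves the quota off by the difference.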

Questions

1) Is my analysis correct?

2) If yes, do you have a suggestion about it?

We're not using the encryption feature, so we're not impacted by the bug, but I'm interested in whether fixing it would lead to storing file sizes in the database: that would avoid accessing the filesystem when looking up file information, and since we look up lots of files through the API, I suspect it could have a performance impact.

jendib commented 1 year ago

You are right, and the obvious solution would be to store the file size in the database (the file entity is immutable anyway). This should be done at the same time as https://github.com/sismics/docs/issues/303 to decouple the storage solution from the rest of the code.
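A rough sketch of what that could look like, under the assumption that the size is captured once from the unencrypted content at creation time. The in-memory map stands in for the database table, and `FileRecord`, `create`, `delete` and `toJson` are illustrative names, not the existing API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the size is stored with the (immutable) file record,
// so quota updates and API responses never need to touch the storage backend.
class StoredSizeSketch {
    static final class FileRecord {
        final String id;
        final long size; // unencrypted size in bytes, fixed at creation
        FileRecord(String id, long size) { this.id = id; this.size = size; }
    }

    private final Map<String, FileRecord> fileTable = new HashMap<>(); // stands in for the DB
    private long quotaUsed;

    FileRecord create(String id, byte[] plainContent) {
        FileRecord record = new FileRecord(id, plainContent.length);
        fileTable.put(id, record);
        quotaUsed += record.size;
        return record;
    }

    void delete(String id) {
        FileRecord record = fileTable.remove(id);
        if (record != null) {
            quotaUsed -= record.size; // same number that was charged on creation
        }
    }

    /** Builds the API payload from the stored size; ids are assumed to need no escaping. */
    String toJson(String id) {
        FileRecord record = fileTable.get(id);
        if (record == null) {
            throw new IllegalArgumentException("Unknown file: " + id);
        }
        return "{\"id\":\"" + record.id + "\",\"size\":" + record.size + "}";
    }

    long quotaUsed() { return quotaUsed; }
}
```

Creation, deletion and the JSON output then all read the same stored value, regardless of where or how the bytes are kept.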

archiloque commented 1 year ago

Thanks for your answer. I could work on the "file size in database" part, but the S3 part is out of scope for us, so it would be harder for me to contribute to it.

archiloque commented 1 year ago

Thinking about this issue again, I wonder if you have an idea of how to deal with the migration of existing files, since filling the database would mean accessing all the files. Should it be done lazily (when we access a file whose size is not yet known, we fill in the value in the table), or do you have another idea?
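For illustration, the lazy variant could look roughly like this. The `UNKNOWN_SIZE` sentinel, the map standing in for the size column, and the `plainSizeReader` callback (which would read and decrypt the file to measure it) are assumptions made for the sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical sketch of a lazy backfill: the size is computed on first access only.
class LazySizeBackfillSketch {
    static final long UNKNOWN_SIZE = -1L; // marks files created before the size column existed

    private final Map<String, Long> sizeColumn = new HashMap<>(); // stands in for the DB column
    private final ToLongFunction<String> plainSizeReader;         // expensive: touches storage

    LazySizeBackfillSketch(ToLongFunction<String> plainSizeReader) {
        this.plainSizeReader = plainSizeReader;
    }

    void registerLegacyFile(String id) {
        sizeColumn.put(id, UNKNOWN_SIZE);
    }

    long getSize(String id) {
        Long stored = sizeColumn.get(id);
        if (stored == null) {
            throw new IllegalArgumentException("Unknown file: " + id);
        }
        if (stored != UNKNOWN_SIZE) {
            return stored; // already backfilled, no storage access
        }
        long computed = plainSizeReader.applyAsLong(id);
        sizeColumn.put(id, computed); // persist so the next access is cheap
        return computed;
    }
}
```

The drawback is that files that are never accessed keep an unknown size, so anything that sums sizes (such as the quota) would still need a full pass at some point.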

jendib commented 1 year ago

It could be a background job that updates all files, with a progress menu, just like what GitLab does.
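Something along these lines, as a very rough sketch rather than a design: the job class, the map standing in for the size column, and the `plainSizeReader` callback are all hypothetical, and a real job would also need batching, error handling and persistence of its progress.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.ToLongFunction;

// Hypothetical sketch of a one-off background backfill that exposes its progress.
class SizeBackfillJobSketch {
    private final Map<String, Long> sizeColumn = new ConcurrentHashMap<>(); // stands in for the DB
    private final AtomicInteger processed = new AtomicInteger();
    private volatile int total;

    /** Starts the backfill on a background thread and returns a handle to await it. */
    CompletableFuture<Void> start(List<String> fileIdsWithoutSize,
                                  ToLongFunction<String> plainSizeReader) {
        List<String> ids = new ArrayList<>(fileIdsWithoutSize); // defensive copy
        total = ids.size();
        processed.set(0);
        return CompletableFuture.runAsync(() -> {
            for (String id : ids) {
                sizeColumn.put(id, plainSizeReader.applyAsLong(id));
                processed.incrementAndGet();
            }
        });
    }

    /** Value a progress screen could poll while the job runs. */
    int progressPercent() {
        return total == 0 ? 100 : (int) (100L * processed.get() / total);
    }

    Long sizeOf(String id) {
        return sizeColumn.get(id);
    }
}
```

The admin UI would poll `progressPercent()` (or an equivalent endpoint) until it reaches 100.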