citizenos / ep_image_upload

Add images to etherpad and upload them to Amazon S3

add script to move images from pool to file system #44

Open webzwo0i opened 3 years ago

webzwo0i commented 3 years ago

Not sure if this is something you want to include, but I wrote a script that is useful when people switch from in-pool storage of images (the default) to on-disk storage. It decodes the image attributes, stores each file on disk in the directory that's configured in settings.json, and rewrites the pool. I used it to decrease the database size in https://github.com/ether/etherpad-lite/issues/4642: in recent Etherpad versions the pool is written with every key revision (every 100th revision), so a pad with a long history of 50000 revisions that has 10MB of images in the pool stores about 500 copies of that pool, i.e. 500*10MB = 5GB.
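The arithmetic above can be written down as a small estimate. This is only a sketch: it assumes a key revision every 100 revisions and that every key revision carries the full pool, and `estimatePoolOverheadMB` is a made-up helper name, not something from the script.

```javascript
// Assumption: Etherpad writes a key revision (including the pool) every 100 revisions.
const KEY_REV_INTERVAL = 100;

// Hypothetical helper: rough size (in MB) of all pool copies stored in key revisions.
function estimatePoolOverheadMB(revisionCount, poolSizeMB, interval = KEY_REV_INTERVAL) {
  const keyRevisions = Math.floor(revisionCount / interval);
  return keyRevisions * poolSizeMB;
}

console.log(estimatePoolOverheadMB(50000, 10)); // 5000 (MB), i.e. about 5GB
```

The real number will be somewhat lower, since early key revisions were written before all images had been added to the pool.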

It has two limitations: it won't work for S3, and it deletes the pool from a key revision whenever an image attribute in that pool would have to be rewritten. This is mainly because the pool of a key revision represents the pool at the time the revision was written. If you only want to merge this once that is fixed, I think I can do it, i.e. iterate over the pools of key revisions and rewrite the attributes where needed instead of deleting the whole pool.
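A minimal sketch of what rewriting a key revision's pool (instead of deleting it) could look like. It assumes the pool is in its JSON form `{numToAttrib: {...}, nextNum: n}` and that images live under an `img` attribute key; `rewritePoolAttributes` and the `replacements` map are illustrative names, not part of the script.

```javascript
// Hypothetical helper: returns a copy of the pool with image attribute values replaced.
// Assumption: `pool` looks like {numToAttrib: {'1': ['img', 'data:...']}, nextNum: 2}.
// `replacements` is a Map from the old attribute value to the new one (e.g. a file path).
function rewritePoolAttributes(pool, replacements, attribKey = 'img') {
  const numToAttrib = {};
  let changed = 0;
  for (const [num, [key, value]] of Object.entries(pool.numToAttrib)) {
    if (key === attribKey && replacements.has(value)) {
      // Keep the attribute number, swap only the value.
      numToAttrib[num] = [key, replacements.get(value)];
      changed++;
    } else {
      numToAttrib[num] = [key, value];
    }
  }
  return {pool: {...pool, numToAttrib}, changed};
}

const examplePool = {
  numToAttrib: {'0': ['author', 'a.1'], '1': ['img', 'data:image/png;base64,AAAA']},
  nextNum: 2,
};
const exampleResult = rewritePoolAttributes(
    examplePool, new Map([['data:image/png;base64,AAAA', 'example.png']]));
console.log(exampleResult.changed); // 1
```

Attribute numbers stay the same, so changesets that refer to attributes by number still resolve against the rewritten pool. The same function could be applied to the current pool and to the pool stored in each key revision.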

Update: Forgot to mention that it only considers specific MIME types, so depending on what people whitelist in settings.json they also need to add those MIME types to this script. If you want me to remove all non-image MIME types, I can do that. (The reason I included them is that it's very easy for people to add several non-image MIME types that will then be stored in the pool, and the goal of the script is to reduce the database size.)
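To illustrate the MIME type check, here is a rough sketch of decoding an attribute value only if its type is on an allow list. It assumes the in-pool images are stored as base64 data URIs; the list contents and the name `decodeDataUri` are examples, not what the script actually ships with.

```javascript
// Illustrative allow list; it would need to match what is whitelisted in settings.json.
const ALLOWED_MIME_TYPES = new Set(['image/png', 'image/jpeg', 'image/gif']);

// Hypothetical helper: returns {mimeType, data} for an allowed base64 data URI, else null.
// Assumption: values look like "data:<mime type>;base64,<payload>" without extra parameters.
function decodeDataUri(value, allowed = ALLOWED_MIME_TYPES) {
  const match = /^data:([^;,]+);base64,(.*)$/s.exec(value);
  if (match === null) return null; // not a base64 data URI, leave the attribute alone
  const mimeType = match[1].toLowerCase();
  if (!allowed.has(mimeType)) return null; // type not handled by the script
  return {mimeType, data: Buffer.from(match[2], 'base64')};
}

console.log(decodeDataUri('data:text/html;base64,aGVsbG8=')); // null
```

The returned `data` buffer is what would be written to the configured directory. Attributes for which the function returns null stay in the pool unchanged.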

Update 2: Make backups! :-)