algoo / preview-generator

generates previews of files with cache management
https://pypi.org/project/preview-generator/
MIT License

aws s3 support #202

Open mosi-kha opened 3 years ago

mosi-kha commented 3 years ago

Hi, first of all, thanks for this awesome package. I'm working on a web app project that uses Amazon S3 to store files. Now I want to create a preview for each file in S3. Do you have any ideas on how I can do this, i.e. create previews without downloading each file?

grignards commented 3 years ago

Hello and thank you for using our package.

Using files stored in S3 without downloading them seems feasible with AWS Lambda, which can execute Python code (and thus use preview-generator). If that is applicable to your use case, I'd suggest going this way.

Without AWS Lambda, I'm afraid I do not know of a way to generate previews without downloading the files, whether the download happens directly through the S3 API or through a FUSE filesystem that exposes S3 files as a normal filesystem.

If you have any other questions/issues or ideas of improvement for preview-generator, don't hesitate to ask or propose them.

mosi-kha commented 3 years ago

@grignards thanks for your answer. Unfortunately, I can't use AWS Lambda; our provider can only deliver the S3 service. Do you have any idea about creating a preview from a stream? I use multipart upload (chunks) with a chunk size of 5 MB; the chunks are first saved in a buffer (RAM) and then uploaded.

grignards commented 3 years ago

In its current state, preview-generator cannot generate previews of streaming content, as it relies on several packages/programs which do not necessarily provide a way to read streams. I understand that uploaded files go through your web application. If this is the case, you could temporarily save the stream contents on disk, generate the preview, then discard the temporary file. Also, do you buffer all the chunks in RAM? If yes, one possibility would be to use a tmpfs as the store for both the S3 upload and the preview generation.
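The temporary-file approach described above could look roughly like the sketch below. It is only an illustration under stated assumptions: `preview_from_chunks` and `jpeg_preview` are hypothetical helper names, the chunks are assumed to arrive as an iterable of `bytes`, and the file suffix is assumed to matter so that the file type can be recognised. Only `PreviewManager` and `get_jpeg_preview` come from preview-generator itself.

```python
import os
import tempfile
from typing import Callable, Iterable


def preview_from_chunks(
    chunks: Iterable[bytes],
    suffix: str,
    make_preview: Callable[[str], str],
) -> str:
    """Write the chunks to a temporary file, build a preview, delete the file.

    Hypothetical helper: `make_preview` receives the temporary file path and
    returns the path of the generated preview.
    """
    # delete=False so the file still exists after it is closed and can be
    # opened by path by external programs.
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        with tmp:
            for chunk in chunks:
                tmp.write(chunk)
        return make_preview(tmp.name)
    finally:
        # Discard the temporary copy whether or not the preview succeeded.
        os.remove(tmp.name)


def jpeg_preview(path: str) -> str:
    """Generate a JPEG preview with preview-generator (cache in /tmp/)."""
    # Imported lazily so preview_from_chunks has no hard dependency on it.
    from preview_generator.manager import PreviewManager

    return PreviewManager("/tmp/").get_jpeg_preview(path)
```

A call such as `preview_from_chunks(decoded_chunks, ".mp4", jpeg_preview)` would then return the preview path. To keep the data in RAM, as with the tmpfs idea, `tempfile.NamedTemporaryFile` also accepts a `dir=` argument that could point at a tmpfs mount.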

mosi-kha commented 3 years ago

No, at the moment I just get each chunk from the client web app, then (after decoding it from base64) pass it to the S3 chunk upload. I never have all the chunks together in RAM.

a523 commented 2 years ago


@grignards I have tested ffmpeg directly:

ffmpeg -i http://172.16.7.1:9000/test/test.mp4 -ss 00:00:10 -vframes 1 -f image2 "image.jpg"

It works fine, as long as the URL of the video can be accessed anonymously. So if we only support video streaming, not all file types, would that be feasible? And how should I change my code?

grignards commented 2 years ago

Since ffmpeg supports reading over HTTP, using preview-generator in the usual way is possible:

from preview_generator.manager import PreviewManager
m = PreviewManager("/tmp/")
preview_path = m.get_jpeg_preview("https://test-videos.co.uk/vids/bigbuckbunny/mp4/h264/1080/Big_Buck_Bunny_1080_10s_30MB.mp4")

This works just fine. By default (page=-1) the preview is taken from the image at 2% of the video's duration, so in the worst case (if ffmpeg does not support skipping to that frame) the downloaded part should be around 2% of the total video size.
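For private S3 objects, the anonymous-access requirement mentioned earlier in the thread might be met with a presigned URL. The sketch below is only a suggestion under several assumptions: `presigned_video_url` is a hypothetical helper, boto3 is assumed to be the S3 client in use, the bucket and key names are made up, and it has not been verified that preview-generator handles a URL with a query string the same way it handles the plain URL above.

```python
def presigned_video_url(s3_client, bucket: str, key: str, expires_in: int = 300) -> str:
    """Return a temporary URL that allows reading one S3 object without credentials.

    `s3_client` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # validity of the URL, in seconds
    )


# Possible usage (assumes boto3 and preview-generator are installed;
# "test" and "test.mp4" are hypothetical bucket and key names):
#
#   import boto3
#   from preview_generator.manager import PreviewManager
#
#   url = presigned_video_url(boto3.client("s3"), "test", "test.mp4")
#   preview_path = PreviewManager("/tmp/").get_jpeg_preview(url)
```

The same URL could also be given to the `ffmpeg -i` command shown earlier in the thread, since ffmpeg only needs an HTTP URL it can read.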