Do you mean one image per frame? That'll be thousands of virtual files. This may be possible, although I am not sure if FUSE will support that (there is a size limit on the buffer that gets filled with the directory entries). What's the idea behind that?
For a 120 second file you could get some 3500 images. I suppose that could be more than FUSE supports.
we have a video processing pipeline which accepts a bunch of png/jpg files. what we do now:
we waste multiple GBs of storage (pngs need to stay on storage after processing to be able to get back to them)
what i want to achieve:
our videos are mostly i-frame only (e.g. prores) so it shouldn't be too processor intensive
Now I understand. There are two problems, but maybe we can find a solution:
First it won't save you any disk space. MP4/MOV must be decoded sequentially, so if you want to access say frame number 1234 it would require to decode frame 1 to 1233 first. Of course it is possible to seek to an arbitrary position, at least approximately, and start decoding from there. That's required anyway to have at least one i-frame to generate a full picture. But I am not sure if I could locate that image number 1234 that way. There would be some inaccuracy. So mostly it would mean that the whole file needs to be decoded and stored, either on disk or in memory, waiting for someone to access frame images.
Second problem is, as we do not have a single file that grows, how can you make sure that your neural net does not hiccup when it tries to access pngs that are not yet ready? Currently, if an app tries to access a part of the file that has not been decoded, the read blocks until that chunk is ready. This could be done with file opens as well, but that means the accessing software may have to wait even for minutes to get the image, depending on the size of the video...
A solution for problem 1 could be to disable the cache, but that means once an image has been read it will be discarded. Accessing it again means decoding it again. That's a processing time hog.
Problem 2 depends on your software. If it tolerates that, it would work.
So some questions:
Will your software read the pngs sequentially and only once? If so, it could be done. Images must still be cached, but only until read, and can then be discarded.
If it tries to read faster than the images can be made available, will it accept read blocks, even if for several seconds? This will happen especially if it skips images, for example, only reads an image every 5 seconds or so.
Possible trouble:
I'll have to try how FUSE handles 1000+ virtual files. Maybe it cannot handle that and then it'll be a no-go :(
Suggestion:
The cache and discard after reading function would require some significant changes, but I could implement the video-to-image thing with normal caching first so you could try out if it works for you at all.
first of all - thanks for giving it a thought and for your time
First it won't save you any disk space. MP4/MOV must be decoded sequentially, so if you want to access, say, frame number 1234, it would require decoding frames 1 to 1233 first.
In the general case you are right. But that's not really a problem for us, since we are mostly using I-frame only video formats (e.g. ProRes+mov and JPEG2000+mxf) which are easily seekable.
But I am not sure if I could locate that image number 1234 that way.
again, in the general case you are right. but for our combination of codec+container, frames are quite easy to locate
Second problem is, as we do not have a single file that grows, how can you make sure that your neural net does not hiccup when it tries to access pngs that are not yet ready?
That's fine, it'll block and wait until the file is ready.
wait even for minutes to be able to get the image, depending on the size of the video...
waiting for minutes isn't great but it's way better than the several hours we wait before we start processing today
A solution for problem 1 could be to disable the cache, but that means once an image has been read it will be discarded. Accessing it again means decoding it again. That's a processing time hog.
our videos are normally 4K HDR videos (16bit color), so a 20MB file size per frame is not rare. i would very much like to keep caching, even if it means storing some of the pngs on actual disk (some LRU policy?)
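For illustration, the LRU bookkeeping could look like this (a minimal sketch of the idea, all names hypothetical; deleting the evicted cache file is only indicated by a comment):

#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Tracks decoded frame files by recency; evicts the least recently read one
// once the configured byte budget is exceeded.
class FrameLru
{
public:
    explicit FrameLru(std::size_t max_bytes) : m_max_bytes(max_bytes) {}

    // Record that "frame" (whose cached file is "size" bytes) was just read.
    void touch(const std::string &frame, std::size_t size)
    {
        auto it = m_pos.find(frame);
        if (it != m_pos.end())
        {
            m_bytes -= it->second->second;
            m_order.erase(it->second);
        }
        m_order.emplace_front(frame, size);
        m_pos[frame] = m_order.begin();
        m_bytes += size;
        while (m_bytes > m_max_bytes && m_order.size() > 1)
            evict();
    }

private:
    void evict()
    {
        m_bytes -= m_order.back().second;
        m_pos.erase(m_order.back().first);
        // hypothetical: unlink the victim's png from the cache directory here
        m_order.pop_back();
    }

    std::size_t m_max_bytes;
    std::size_t m_bytes = 0;
    std::list<std::pair<std::string, std::size_t>> m_order;
    std::unordered_map<std::string,
                       std::list<std::pair<std::string, std::size_t>>::iterator> m_pos;
};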
Will your software read the pngs sequentially and only once?
the processing pipeline is sequential, but when it fails we want to analyze why something happened, and then out of these 100k images we want to randomly take a look at 100-200. Some kind of cache is still needed, methinks.
If it tries to read faster than the images can be made available, will it accept read blocks, even if for several seconds? This will happen especially if it skips images, for example, only reads an image every 5 seconds or so.
we do process every single frame (no skips), but blocking reads are totally cool (even for seconds), we don't need to be realtime or anything like that
I'll have to try how FUSE handles 1000+ virtual files. Maybe it cannot handle that and then it'll be a no-go :(
please let me know.
i have no idea how FUSE works internally, but shouldn't it be readdir-s to list entries, open or stat for a specific file?
let's assume the following:
- we only support easily seekable formats. on mount ffmpegfs will check if video is of supported type and refuse to mount otherwise
- fs will be mounted read-only
for readdir you could keep some kind of simple cache
for open you quickly seek (because we only support seekable video) and decode the file
i have a problem understanding how stat would work. but for simplicity's sake let's say images are going to be .bmp (or any other uncompressed format), not .png, so we know ahead of time what the file size would be
can you educate me why, given these assumptions, you would need to take care of 1000+ FUSE virtual files?
The cache and discard after reading function would require some significant changes, but I could implement the video-to-image thing with normal caching first so you could try out if it works for you at all.
that would be AMAZING
let's assume the following:
- we only support easily seekable formats. on mount ffmpegfs will check if video is of supported type and refuse to mount otherwise
- fs will be mounted read-only
for readdir you could keep some kind of simple cache
for open you quickly seek (because we only support seekable video) and decode file
i have a problem understanding how stat would work. but for simplicity sake let's say images are going to be .bmp (or any other uncompressed format) not .png so we know ahead of time what the file size would be
Given all these assumptions it should be feasible. For stat there may be a problem with formats like png/jpg, because I cannot provide a predicted size until the image is actually ready. So the size will definitely be wrong until the image is available, unless I use a fixed size format like BMP, which will be a major waste of disk space :)
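That fixed-size property is easy to see for BMP: the file size follows from the pixel dimensions alone, so stat can be answered before a single frame is decoded. A minimal sketch, assuming plain 24-bit BMP (54-byte header, rows padded to a multiple of 4 bytes):

#include <cstdint>

// Predicted size of a 24-bit BMP: 14-byte file header + 40-byte info header
// + height rows, each padded to a multiple of 4 bytes.
static uint64_t bmp24_file_size(uint32_t width, uint32_t height)
{
    uint64_t row = (static_cast<uint64_t>(width) * 3 + 3) & ~uint64_t(3);
    return 54 + row * height;
}

// e.g. 3840x2160: 54 + 11520 * 2160 = 24,883,254 bytes, known up front.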
Your software that consumes the files must cope with files that change size after opening. They may be smaller or larger.
This is probably a big issue when you access the files via Samba or NFS. Samba pads files smaller than predicted with zeros up to that size, or cuts them to the size (discards the end) if they are larger. NFS also pads smaller files with zeros; larger files will either be cut or correct (this depends, you can try it as many times as you want, the outcome is not deterministic).
This is why I add 2.5% to the size prediction, to raise the chance of the file ending up a bit smaller than predicted and avoid the cut-off... So we could set this to 5 or more per cent to avoid lost parts. On the other hand this would only mean that a very small portion at the bottom of the image is missing and probably nothing important.
can you educate me why, given these assumptions, you would need to take care of 1000+ fuse virtual files?
Once someone does an ls on a virtual directory, FUSE issues a readdir to me and I have to present the whole directory. So for a 10.000 frames video I would have to present 10.000 virtual images. There seem to be ways to do that with FUSE, but I need to try out what happens...
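For illustration, this is roughly what the FUSE 2.x high-level API expects (a minimal sketch; names and the frame count are hypothetical):

#define FUSE_USE_VERSION 26
#include <fuse.h>
#include <cstdio>

// readdir has to present the complete directory in one go: one virtual image
// per frame. filler() returns non-zero once the kernel buffer is full.
static int frames_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
                          off_t offset, struct fuse_file_info *fi)
{
    (void)path; (void)offset; (void)fi;
    filler(buf, ".", nullptr, 0);
    filler(buf, "..", nullptr, 0);
    char name[32];
    for (int frame = 1; frame <= 10000; frame++)   // hypothetical 10.000 frame video
    {
        snprintf(name, sizeof(name), "%05d.png", frame);
        if (filler(buf, name, nullptr, 0))
            break;                                 // buffer full
    }
    return 0;
}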
PS:
I am currently preparing release 1.7 (and actually 1.8, which only adds Doxygen documentation), therefore I currently have a feature freeze. I cannot add new features at the moment, but I would create a branch, implement your features there and merge it back later. The functionality sounds interesting, I would like to have it in FFmpegfs. I don't know how long it will take to realise it, but probably a week or two.
format like BMP which will be a major waste of disk space :)
you mean for cache? i am not copying the files anywhere, just reading them from ffmpegfs
Your software that consumes the files must cope with files that change size after opening. They may be smaller or larger.
i vote for BMP or any other uncompressed format (e.g. PGM, PNM, PPM, uncompressed TIF) so we know the size ahead of time. or at least there should be an option (compressed/uncompressed)
we could set this to 5 or more per cent to avoid lost parts
that's cool
very small portion at the bottom of the image is missing and probably nothing important.
neural nets normally give unpredictable results given damaged/incomplete input, so it's important to preserve the entire image
FUSE issues a readdir to me and I have to present the whole directory
i see!
there's no incremental readdir in FUSE?
The functionality sounds interesting, I would like to have it in FFmpegfs.
Yeah that would be really cool.
I also don't know how long it will take to realise it, but probably a week or two.
I wish I could help you, but my knowledge of C++ is really bad. At best I'll be able to fix minor bugs and make some tweaks.
format like BMP which will be a major waste of disk space :)
you mean for cache? i am not copying the files anywhere, just reading them from ffmpegfs
For cache. FFmpegfs needs to store the images somewhere, memory or disk, at least for as long as they have to be available. Redecoding them every time someone wants to read them is far too time-consuming.
i vote for BMP or any other uncompressed format (e.g. PGM, PNM, PPM, uncompressed TIF) so we know the size ahead of time. or at least there should be an option (compressed/uncompressed)
I'll create PGM, PNM, BMP, or whatever you prefer; this is a simple codec setting, the ffmpeg API does that for me. Just choose your favourite weapon :)
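For what it's worth, in FFmpeg API terms that codec setting boils down to picking an encoder ID per output format; a minimal sketch (the mapping function is hypothetical):

extern "C" {
#include <libavcodec/avcodec.h>
}
#include <string>

// Each virtual image format is just a different FFmpeg encoder.
static const AVCodec *image_encoder(const std::string &desttype)
{
    if (desttype == "png") return avcodec_find_encoder(AV_CODEC_ID_PNG);
    if (desttype == "jpg") return avcodec_find_encoder(AV_CODEC_ID_MJPEG); // JPG = MJPEG encoder
    if (desttype == "bmp") return avcodec_find_encoder(AV_CODEC_ID_BMP);
    if (desttype == "ppm") return avcodec_find_encoder(AV_CODEC_ID_PPM);
    return nullptr;
}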
very small portion at the bottom of the image is missing and probably nothing important.
neural nets normally give unpredictable results given damaged/incomplete input, so it's important to preserve the entire image
OK, then we need to go for a predictable format. We can try PNG later or whatever and see what happens.
there's no incremental readdir in FUSE?
No, that's the design of the file systems that access the FUSE drive. You are required to provide the complete directory. Even if you do "ls notEXISTING", the command still needs to examine all files.
sounds good
just found out that -compression_level 0 -f image2 '%06d.png' gives predictable size pngs, but of course uncompressed
Work in progress...
The desired functionality has been implemented now, at least partly. There are a few things missing but the current state should be sufficient as a proof of concept.
Caution: The code is by no means production grade yet; it still needs an extensive overhaul, but it works.
The following functionality is missing or incomplete:
For 1.: A simple
ls /mnt/ffmpegfs/video1.mp4/00001.png
will fail. You need to list the parent directory first like:
ffmpegfs /storage/videos /mnt/ffmpegfs
find /mnt/ffmpegfs
/mnt/ffmpegfs/video1.mp4/00001.png
/mnt/ffmpegfs/video1.mp4/00002.png
...
/mnt/ffmpegfs/video1.mov/00001.png
/mnt/ffmpegfs/video1.mov/00002.png
This builds the frame image virtual directories. Once this has been done, the "ls" command will succeed. I guess that won't be a problem, as the software that consumes the frame images needs to list the directories in the first place anyway to find out what to scan...
For 2.: For each image a 2 MB segment is reserved in the cache. That makes it easy to access e.g. image number 1234 - it is simply at 1234 * 2 MB (see the sketch after this list). But that's a big waste of disk space and limits the image size to 2 MB.
For 3.: Do a "rm -Rf /path/to/cache" before running ffmpegfs.
For 4.: p-frames can be completed with information from previous frames. There are filters to do that, they just need to be activated inside the code.
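To make the fixed-segment addressing from point 2 concrete (a sketch; the constant matches fileio.h, the function name is hypothetical):

#include <cstdint>

static const uint64_t IMAGE_MAX_SIZE = 2ULL * 1024 * 1024; // one slot per image

// O(1) random access: image number n starts at byte n * IMAGE_MAX_SIZE of the
// cache - at the cost of a full 2 MB slot per frame, however small the image is.
static uint64_t image_offset(uint64_t n)
{
    return n * IMAGE_MAX_SIZE;
}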
If the functionality proves useful, these restrictions will be removed later.
Special considerations:
Creating a virtual set of images won't save you any disk space. The frame images have to be stored somewhere, so you need to set the cache path to a drive with enough space (--cachepath=DIR). Also set a cache limit that is sufficient, or remove it completely (--max_cache_size=0). You may set the cache expiry time to 1 day (--expiry_time=1d) or whatever is appropriate for you, so old files get cleaned up once no longer required.
The biggest advantage of using a virtual file set is the ease of use - no need to convert the video to frame images first. Just access them when required and they can be used as soon as they are ready.
To get the right code, change the branch to FB_issue_#26 and download the ZIP, or if you "git clone" it, don't forget to do "git checkout FB_issue_#26" like I did once... :)
Thanks a lot! I will give it a try now. Quick question: why the 2MB limit? our frames are normally 1920x1080 which is already over 2MB uncompressed
Doesn't mount my folder with videos, see my shell history attached.
I am on the FB_issue_#26 branch:
zhukov@kote:~/ffmpegfs$ git branch
* FB_issue_#26
master
Thanks a lot! I will give it a try now. Quick question: why the 2MB limit? our frames are normally 1920x1080 which is already over 2MB uncompressed
If you use JPG or PNG it should work. If this is not enough you can change fileio.h line 60:
#define IMAGE_MAX_SIZE (2*1024*1024)
You could set 5 or more MB. But I guess for PNG/JPG that should be sufficient.
Our frames converted to PNG are normally 10-20MB each
Doesn't mount my folder with videos, see my shell history attached.
Did you set --desttype=png or --desttype=jpg? If you set the target to anything other than png/jpg/bmp, you will get audio and video files.
oh my god!!! it works!!!
Too early to celebrate :(
it does list frames correctly, but here is what i get if i want to run ffmpeg -i on a PNG
Our frames converted to PNG are normally 10-20MB each
Whoopsy! Well, in that case you could set
#define IMAGE_MAX_SIZE (25*1024*1024)
but that would eat up a lot of disk space. I guess with the compression rate I've set, 2 MB should be enough. I'll rework the cache format when everything else works.
#define IMAGE_MAX_SIZE (25*1024*1024)
just set to 20MB
Too early to celebrate :( it does list frames correctly, but here is what i get if i want to run ffmpeg -i on a PNG
That's what I was afraid of... The pixel format of your source video is not supported by PNG. Maybe JPG works... If not, one of the next things I wanted to do is convert the pixel format where required. This is already done for video to video conversion.
Maybe you can use JPG or you can find a video that has a different pixel format.
If not, I am sorry, you'll have to be patient and wait a few days until I have completed the pixel format conversion.
well, png is always rgb24 or rgb48, and most videos are yuv420 or yuv422p10, so pixel format conversion should be there
OK. Sorry. Then please be patient, I'll add the format conversion.
i am not complaining it's already a miracle :)
desttype=jpg works!
but even for i-frame only videos it takes FOREVER to open the last frame
desttype=jpg works!
Great. But the pixel format conversion is still required; I also have many videos that won't work without it. As a nice little extra I can also apply deinterlacing and complete p-frames. So this is a must-have.
but even for i-frame only videos it takes FOREVER to open the last frame
The code is far from optimal, especially the caching code is crap, I know. I wanted to find out whether this is feasible in the first place; it seems it is, so it's worth digging deeper now. There is much room for optimisation.
Maybe you can check if your video processing software likes the results, especially the files being locked until available and the size "morphing". I can speed up processing, but the files will never be there in an instant. Also, I will never be able to provide the exact file size from the start.
If your video software basically copes with that, it's worth walking the extra mile from here.
i am not complaining it's already a miracle :)
I know :)
here is a sample 10bit yuv422p10 prores: https://kote.videogorillas.com/vmir/vdms/orig.mov
when i mount it with desttype=jpg it produces this:
i assume it's because of the lack of yuv422p10 to yuvj420 conversion
when i mount a yuv420p h264 https://kote.videogorillas.com/vmir/vdms/orig.mp4 it produces correct jpg
but all that aside - IT'S AMAZING!!! 🥇 👍 🏆
Is that a scene from "Animal House"?
Well, as I said, I'll implement the pixel conversion. This is going to work with all formats, with a small degradation in quality, but I guess JPG or PNG compression will do more harm.
More to come... :)
it's a scene from "Dawson's Creek", an old time tv show
what are you using (editor/debugger/ide) to write code?
I use QtCreator, with a few tricks I can write code and debug, of course without Qt. I even use QtCreator to write (non-Qt) code for ARM using a cross compiler.
make_file(buf, filler, VIRTUALTYPE_FRAME, origpath, filename, 40 * 1024, virtualfile->m_st.st_ctime);
this is where 40KB size comes from!
why not 42KB then :)
this is where 40KB size comes from! why not 42KB then :)
Well, 42 is the answer to all questions.
You may give it another try. I have implemented pixel format conversion, so your ProRes videos should do as source. Actually you have the full program: pixel format conversion, deinterlacing and rescaling.
Currently only jpg works; png and bmp (RGB pixel formats) yield strange results. It seems I have a lack of understanding of how RGB pixel formats work. I'll try to figure it out.
I also changed the max. image size to 20 MB.
There's still a lot to do, but maybe now you can evaluate the result so far.
Found this, that's RGB to YUV though... https://stackoverflow.com/questions/21938674/ffmpeg-rgb-to-yuv-conversion-loses-color-and-scale
But the images he gets look the same as my results.
It seems that the deinterlace filter does not work. Simply do not use it, and the conversion will work. I'll try to find out what the problem is; it seems that I am using it wrong. It should work on RGB frames.
Thanks. I'll give it a try. Not sure I'll be able to test before Monday, but if I do have a chance, I'll report back. Are you using swscale to do pixel format conversions? Deinterlace is not needed in our processing, we do our own: https://github.com/A-Bush/Deep-Video-Deinterlacing/blob/master/README.md
Found the problem. Deinterlacing now also works! You can check if it works for you; next I'll go into optimisations.
Thanks. I’ll give it a try. Not sure I’ll be able to test before Monday. But if I do have a chance, I’ll report back. Are you using swscale to do pixel format conversions?
Pixel format conversion is done with sws_scale. If the image size is not limited on the command line, the function merely does the pix_fmt conversion.
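For the curious, that conversion step boils down to something like this (a minimal sketch, not the actual FFmpegfs code; error handling mostly stripped):

extern "C" {
#include <libswscale/swscale.h>
#include <libavutil/frame.h>
}

// Convert one decoded frame (e.g. yuv422p10le) to rgb24 at unchanged size;
// for 16-bit output you would pass AV_PIX_FMT_RGB48BE instead.
static AVFrame *to_rgb24(const AVFrame *src)
{
    AVFrame *dst = av_frame_alloc();
    dst->format = AV_PIX_FMT_RGB24;
    dst->width  = src->width;
    dst->height = src->height;
    if (av_frame_get_buffer(dst, 0) < 0)
    {
        av_frame_free(&dst);
        return nullptr;
    }
    SwsContext *sws = sws_getContext(src->width, src->height,
                                     static_cast<AVPixelFormat>(src->format),
                                     dst->width, dst->height, AV_PIX_FMT_RGB24,
                                     SWS_BICUBIC, nullptr, nullptr, nullptr);
    // Source and target sizes are equal, so this is a pure pix_fmt conversion.
    sws_scale(sws, src->data, src->linesize, 0, src->height,
              dst->data, dst->linesize);
    sws_freeContext(sws);
    return dst;
}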
If you don't need deinterlace, simply do not use the --deinterlace parameter and the filter will not be called at all.
So I'll wait for the results now. If I find some time, maybe I'll get into the list of optimisations.
Deinterlace is not needed in our processing , we do our own. https://github.com/A-Bush/Deep-Video-Deinterlacing/blob/master/README.md
I see. For simplicity I currently only use yadif, but the FFmpeg API supports many algorithms. nnedi, for example, also uses a neural network. Or there's one that can use CUDA for speed-up.
Maybe one of those could save you a processing step. If you could use one, just open an issue for it.
Deinterlace the input video ("bwdif" stands for "Bob Weaver Deinterlacing Filter"). https://ffmpeg.org/ffmpeg-filters.html#toc-bwdif
Deinterlace input video by applying Donald Graft's adaptive kernel deinterlacing. Works on interlaced parts of a video to produce progressive frames. https://ffmpeg.org/ffmpeg-filters.html#toc-kerndeint
Apply motion-compensation deinterlacing. https://ffmpeg.org/ffmpeg-filters.html#toc-mcdeint
Deinterlace video using neural network edge directed interpolation. https://ffmpeg.org/ffmpeg-filters.html#toc-nnedi
Deinterlace the input video ("w3fdif" stands for "Weston 3 Field Deinterlacing Filter"). https://ffmpeg.org/ffmpeg-filters.html#toc-w3fdif
Deinterlace the input video ("yadif" means "yet another deinterlacing filter"). https://ffmpeg.org/ffmpeg-filters.html#toc-yadif-1
Deinterlace the input video using the yadif algorithm, but implemented in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec and/or nvenc. https://ffmpeg.org/ffmpeg-filters.html#toc-yadif_005fcuda
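For illustration, wiring any of these into libavfilter looks roughly the same; a minimal sketch for yadif (assumptions: yuv420p input and a fixed 1/25 time base, no error handling):

extern "C" {
#include <libavfilter/avfilter.h>
#include <libavfilter/buffersrc.h>
#include <libavfilter/buffersink.h>
}
#include <cstdio>

// buffer (source) -> yadif -> buffersink; frames are then pushed in with
// av_buffersrc_add_frame() and pulled out with av_buffersink_get_frame().
static AVFilterGraph *build_yadif_graph(int w, int h,
                                        AVFilterContext **src, AVFilterContext **sink)
{
    AVFilterGraph *graph = avfilter_graph_alloc();
    char args[128];
    snprintf(args, sizeof(args),
             "video_size=%dx%d:pix_fmt=yuv420p:time_base=1/25", w, h);
    avfilter_graph_create_filter(src, avfilter_get_by_name("buffer"),
                                 "in", args, nullptr, graph);
    avfilter_graph_create_filter(sink, avfilter_get_by_name("buffersink"),
                                 "out", nullptr, nullptr, graph);
    AVFilterContext *deint = nullptr;
    avfilter_graph_create_filter(&deint, avfilter_get_by_name("yadif"),
                                 "deint", nullptr, nullptr, graph);
    // Swapping "yadif" for "bwdif" or "w3fdif" is the same wiring.
    avfilter_link(*src, 0, deint, 0);
    avfilter_link(deint, 0, *sink, 0);
    avfilter_graph_config(graph, nullptr);
    return graph;
}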
ok, finally got to test it, sorry it took me so long. it works! 👍
prores -> png: ok, but should be rgb48 (not rgb24) because prores has 10bit color
prores -> jpg: ok, but weird result when viewed on mac (see below)
h264 -> png: ok
h264 -> jpg: same weird result on mac (see below)
(screenshots: prores->png; prores->jpg when viewed on Mac; h264(yuv420p)->png and h264(yuv420p)->jpg when viewed on Mac)
Thanks for testing the new version!
ok, finally got to test it, sorry it took me so long. it works! +1
Hooray :)
prores -> png ok but should be rgb48 (not rgb24) because prores has 10bit color
FFmpeg only seems to have rgb48be (rgb48 big endian format). I could give it a try. I'll implement a selection of the best matching format (using av_find_best_pix_fmt_of_2 or so). But that's not on top of the list; there are a few other, more important things to complete first.
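The selection could be as simple as this (a sketch around av_find_best_pix_fmt_of_2; the wrapper is hypothetical):

extern "C" {
#include <libavcodec/avcodec.h>
}

// Let FFmpeg decide whether rgb24 or rgb48be loses less information for the
// given source pixel format; 10-bit sources like yuv422p10le favour rgb48be.
static AVPixelFormat best_png_format(AVPixelFormat src)
{
    int loss = 0;
    return av_find_best_pix_fmt_of_2(AV_PIX_FMT_RGB24, AV_PIX_FMT_RGB48BE,
                                     src, 0 /* no alpha */, &loss);
}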
prores -> jpg ok but weird result when viewed on mac (see below) h264 -> png ok h264 -> jpg same weird result on mac (see below)
It should not make a difference which source file format you use (more precisely, which codec: H264, ProRes or whatever), as the pixel format is converted to what PNG, JPG or BMP requires. So it's no wonder the weird results come up with every source.
Have you opened the images directly from the ffmpegfs virtual directory? What happens if you copy the images locally on your mac and then open that copy? What happens if you open the virtual image a second time? Still distorted? If so, the mac viewer does not like the images I create...
What about BMP?
On Ubuntu the stock image viewer (Gnome Desktop) showed only a small stripe at the top of the image until I added the functionality that updates the file size once it is known. You can see that when you open an image in KDE Dolphin or Windows Explorer: the file size gets refreshed (from 40 KB to 6 MB or so). In Dolphin you may have to refresh (F5), Explorer does that automatically.
Anyway, the only tweak compared to a real image in a real directory is the file size: ffmpegfs pretends all images are 40 KB until one has been opened (e.g. with an image viewer) and decoded. Then the file size magically changes to the real size.
Anyway, there is still a lot to do. I am glad that ffmpegfs basically works for you. The list so far:
I will look into that this weekend.
when viewing on mac i copied from the virtual directory first, by just using cp 000000001.png ~
one more thing i am missing is the ability to read random frames without waiting for the decode of previous ones. this would be very handy for debugging.
also, find /mnt/ffmpegfs is not feasible on our 100TB video storage
would be nice to just be able to:
cp /mnt/ffmpegfs/long/path/to/video/here/00000001.png ~
what would it take to implement it?
like so:
# ls /storage/videos
# ffmpegfs /storage/videos /mnt/ffmpegfs
# find /mnt/ffmpegfs