Open jrmuizel opened 1 year ago
My bad: I am working on 8x8 transforms at the moment, and they are enabled although they do not work yet. Big Buck Bunny uses these transforms, so it won't decode at the moment (although normally it should return "unsupported"). I will create a release tag once I promote edge264, so that these kinds of temporary instabilities don't happen in the future.
To benchmark edge264 at the moment you should use a video without 8x8 transforms, so anything with Main profile and progressive scan, or reencode it with no-8x8dct for x264.
This seems to mostly work now.
I did some quick benchmarking and got:
openh264: 9.1s
libavcodec: 5s
edge264: 4.6s
Here's a profile of edge264 in case you're interested: https://share.firefox.dev/3AXlVA2
Damn you're fast!
I have 6 conformance clips still failing, so 8x8 support is not yet done but almost there. I tested locally vs ffmpeg too and got the same results, which was a bit disappointing considering the number of theoretical improvements. With another HD clip from the movie Monsters, edge264 has the same performance as libavcodec :/ Still, the gap with openh264 is very cool :) My tests so far have been on SD clips, so I suspect that libavcodec might have long initialization times which matter less with big frames, and edge264 might suffer from cache associativity conflicts related to frame strides (which I haven't investigated yet).
Thanks a lot for the profiler results! It shows that deblocking is a major performance hog (function finish_frame, 7.9%) which can be improved. For the rest I'll need cache misses. Is Firefox profiler something I can host locally?
Cheers, Thibault
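The suspicion about cache associativity conflicts and frame strides can be illustrated with a small sketch. It counts how many distinct cache sets are touched when the same column is read in consecutive rows of a frame. Everything here is an assumption for illustration: the geometry (64-byte lines, 64 sets, as in a typical 32 KiB 8-way L1 data cache), the function names, and the strides tried. It says nothing about what edge264 actually does, and the thread itself notes this has not been investigated.

```python
def cache_set(addr, line_size=64, num_sets=64):
    """Index of the cache set an address maps to (assumed geometry)."""
    return (addr // line_size) % num_sets


def sets_touched(stride, rows=16, line_size=64, num_sets=64):
    """Distinct sets hit when reading column 0 in `rows` consecutive rows."""
    return len({cache_set(r * stride, line_size, num_sets) for r in range(rows)})


if __name__ == "__main__":
    # Hypothetical strides in bytes: an unpadded 1920-wide plane,
    # and two power-of-two strides.
    for stride in (1920, 2048, 4096):
        print(f"stride {stride}: {sets_touched(stride)} sets for 16 rows")
```

When many rows land in very few sets, they compete for the limited number of ways in each set, which is the kind of conflict a power-of-two stride can cause. Measuring actual cache misses, as mentioned above, would be needed to confirm whether this matters here.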
I gathered the profiles using https://github.com/mstange/samply on macOS. It works locally if you're on macOS or Linux.
If you want assembly support in the profile you'll need to use these instructions: https://gist.github.com/mstange/6b2b3b15708cce847eacfabcf4a9f4cc. But beware, the assembly support is not done yet, so you may run into bugs.
And libavc gets ~7s
time ./avcdec dec-single-thread.cfg
real 0m7.089s
user 0m6.860s
sys 0m0.221s
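For repeating a timing like the one above across several decoders, a minimal sketch of a harness follows. It runs a command a few times and keeps the best wall-clock time. The function name, the number of runs, and the choice of "best of N" are assumptions for illustration, not how the numbers in this thread were measured.

```python
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv `runs` times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Discard decoder output so terminal I/O does not skew the timing.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python bench.py ./avcdec dec-single-thread.cfg
    print(f"{time_command(sys.argv[1:]):.3f}s")
```

Unlike the shell's time, this reports only wall-clock time, so it corresponds to the "real" line rather than "user" or "sys".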
Well, that is some more good news, thanks :) You might get some more speedup with GCC < 10. I need to free some disk space to try with all versions of GCC, but basically the older the better. edge264 now decodes Big Buck Bunny fine. If you plan on using it in any project, please give me some feedback on the most pressing features! Cheers, Thibault
Hi @jrmuizel, I am preparing a presentation on edge264 for FOSDEM and need to benchmark avcdec on my machine (macOS Monterey, Intel Broadwell). I compiled the repo from https://android.googlesource.com/platform/external/libavc but then avcdec keeps looping on "Error in header decode 0x0" no matter what options or input files I set. Have you had the same issue, do you know of some documentation on the lib, and maybe can you share your dec-single-thread.cfg file?
Cheers, Thibault
My dec-single-thread.cfg looks like:
--input input.h264
--save_output 0
--num_frames -1
--output out.yuv
--chroma_format YUV_420P
--share_display_buf 0
--num_cores 1
--loopback 0
--display 0
--fps 59.94
input.h264 was made with: ffmpeg -i Big_Buck_Bunny_1080_10s_30MB.mp4 -vcodec copy -bsf h264_mp4toannexb -an input.264
Thanks! Actually I didn't notice that only Linux support is mentioned on the repo, not macOS. I'd like to present your bench values in an intro slide if you don't mind, to show that edge264 is fastest overall, before diving into programming techniques.
Yep, you can use my values. Also, avcdec does work on macOS. That's where my numbers are from.
After converting to a h264 file I get a decoding error: