Closed hbiyik closed 1 year ago
Is libyuv really necessary?
Yeah, i found out libyuv can actually meet the performance of rga on 3588.
Regarding memory leaks:
[aac @ 0x10435100] Qavg: 1000.753
==41579== at 0xBEDAEC: VALGRIND_PRINTF_BACKTRACE (valgrind.h:6815)
==41579== by 0xBEDE3B: av_log_default_callback (log.c:399)
==41579== by 0xBEDFB3: av_vlog (log.c:434)
==41579== by 0xBEE047: av_log (log.c:413)
==41579== by 0x255113: term_exit (ffmpeg.c:326)
==41579== by 0x2553C7: ffmpeg_cleanup (ffmpeg.c:582)
==41579== by 0x24B25F: exit_program (cmdutils.c:102)
==41579== by 0x259EC7: main (ffmpeg.c:4194)
==41579==
==41579== HEAP SUMMARY:
==41579== in use at exit: 68,220 bytes in 888 blocks
==41579== total heap usage: 44,542 allocs, 43,654 frees, 83,517,297 bytes allocated
==41579==
==41579== 8 bytes in 1 blocks are still reachable in loss record 1 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x91C18CB: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 16 bytes in 1 blocks are still reachable in loss record 2 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x91B975F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 16 bytes in 1 blocks are still reachable in loss record 3 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x90C5B0F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 34 bytes in 1 blocks are still reachable in loss record 4 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x780231F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579== by 0x7469383: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 38 bytes in 1 blocks are still reachable in loss record 5 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x746688B: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 39 bytes in 1 blocks are still reachable in loss record 6 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x745A013: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 42 bytes in 1 blocks are still reachable in loss record 7 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x746E4B7: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 44 bytes in 1 blocks are still reachable in loss record 8 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x780231F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579== by 0x7468E57: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 45 bytes in 1 blocks are still reachable in loss record 9 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x780231F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579== by 0x747717B: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 48 bytes in 1 blocks are still reachable in loss record 10 of 38
==41579== at 0x4865058: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x91B9987: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 48 bytes in 1 blocks are still reachable in loss record 11 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x745A227: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 48 bytes in 1 blocks are still reachable in loss record 12 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x7466A9F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 48 bytes in 1 blocks are still reachable in loss record 13 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x780231F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579== by 0x746923F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 48 bytes in 1 blocks are still reachable in loss record 14 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x746E967: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 56 bytes in 1 blocks are still reachable in loss record 15 of 38
==41579== at 0x4865058: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x91BA73F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 60 bytes in 1 blocks are still reachable in loss record 16 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x780231F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579== by 0x746D747: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 63 bytes in 1 blocks are still reachable in loss record 17 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x780231F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579== by 0x7468E67: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 64 bytes in 1 blocks are still reachable in loss record 18 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x780231F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579== by 0x746938F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 64 bytes in 1 blocks are still reachable in loss record 19 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x746E4CF: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 67 bytes in 1 blocks are still reachable in loss record 20 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x74668A3: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 68 bytes in 1 blocks are still reachable in loss record 21 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x745A02B: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 69 bytes in 2 blocks are still reachable in loss record 22 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x780231F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579== by 0x8B99BC7: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 71 bytes in 1 blocks are still reachable in loss record 23 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x780231F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579== by 0x746924F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 78 bytes in 1 blocks are still reachable in loss record 24 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x780231F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579== by 0x746D757: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 80 bytes in 1 blocks are still reachable in loss record 25 of 38
==41579== at 0x4866058: operator new(unsigned long, std::nothrow_t const&) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x741A0B3: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 80 bytes in 1 blocks are still reachable in loss record 26 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x780231F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579== by 0x7477187: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 104 bytes in 1 blocks are still reachable in loss record 27 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x91D8A67: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 120 bytes in 1 blocks are still reachable in loss record 28 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x90C556B: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 128 bytes in 1 blocks are still reachable in loss record 29 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x77FEA47: void std::vector<std::string, std::allocator<std::string> >::_M_emplace_back_aux<std::string const&>(std::string const&) (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579== by 0x746EB7F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 148 bytes in 2 blocks are still reachable in loss record 30 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x6B0D3CB: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FAEF: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x6B0FBA3: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30)
==41579== by 0x780231F: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579== by 0x8B99BD3: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 204 bytes in 1 blocks are still reachable in loss record 31 of 38
==41579== at 0x4869F34: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x91BF003: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 328 bytes in 1 blocks are still reachable in loss record 32 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x91B3AF7: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 336 bytes in 2 blocks are still reachable in loss record 33 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x91AE507: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 336 bytes in 7 blocks are still reachable in loss record 34 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x8B2FA8B: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 3,072 bytes in 1 blocks are still reachable in loss record 35 of 38
==41579== at 0x48657B8: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x745A5C7: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 4,640 bytes in 1 blocks are definitely lost in loss record 36 of 38
==41579== at 0x486A304: memalign (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x486A477: posix_memalign (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x59A342F: os_malloc (in /usr/lib/aarch64-linux-gnu/librockchip_mpp.so.0)
==41579== by 0x59A34EF: mpp_osal_malloc (in /usr/lib/aarch64-linux-gnu/librockchip_mpp.so.0)
==41579== by 0x59A38B3: mpp_osal_calloc (in /usr/lib/aarch64-linux-gnu/librockchip_mpp.so.0)
==41579== by 0x58A341B: mpp_enc_cfg_init (in /usr/lib/aarch64-linux-gnu/librockchip_mpp.so.0)
==41579== by 0x83F2D7: rkmpp_init_encoder (rkmppenc.c:35)
==41579== by 0xB287AF: rkmpp_init_codec (rkmpp.c:203)
==41579== by 0x5ED733: avcodec_open2 (avcodec.c:322)
==41579== by 0x256077: init_output_stream (ffmpeg.c:3233)
==41579== by 0x2564C3: init_output_stream_wrapper (ffmpeg.c:739)
==41579== by 0x2575DF: do_video_out (ffmpeg.c:1265)
==41579==
==41579== 24,588 bytes in 1 blocks are still reachable in loss record 37 of 38
==41579== at 0x4869F34: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x91BF3FF: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== 32,874 bytes in 842 blocks are still reachable in loss record 38 of 38
==41579== at 0x4865058: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==41579== by 0x91B31FF: ??? (in /usr/lib/aarch64-linux-gnu/libmali.so.1.9.0)
==41579==
==41579== LEAK SUMMARY:
==41579== definitely lost: 4,640 bytes in 1 blocks
==41579== indirectly lost: 0 bytes in 0 blocks
==41579== possibly lost: 0 bytes in 0 blocks
==41579== still reachable: 63,580 bytes in 887 blocks
==41579== of which reachable via heuristic:
==41579== stdstring : 1,122 bytes in 20 blocks
==41579== suppressed: 0 bytes in 0 blocks
==41579==
==41579== For lists of detected and suppressed errors, rerun with: -s
==41579== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
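Reading the report above: the only "definitely lost" record (4,640 bytes) is the one whose stack goes through mpp_enc_cfg_init called from rkmpp_init_encoder (rkmppenc.c:35), which suggests the encoder config allocated there is not released at teardown. Everything else is "still reachable" memory inside libmali. A minimal sketch for totalling a long memcheck log by record kind, assuming the log text has been saved or pasted into a string (the `summarize` helper and the `sample` text are illustrative, not part of any tool):

```python
import re
from collections import defaultdict

# Matches memcheck loss-record headers such as
# "==41579== 4,640 bytes in 1 blocks are definitely lost in loss record 36 of 38"
RECORD = re.compile(
    r"==\d+==\s+([\d,]+) bytes in ([\d,]+) blocks are "
    r"(definitely lost|indirectly lost|possibly lost|still reachable)"
)

def summarize(log_text):
    """Return {kind: (total_bytes, total_blocks)} summed over all loss records."""
    totals = defaultdict(lambda: [0, 0])
    for size, blocks, kind in RECORD.findall(log_text):
        totals[kind][0] += int(size.replace(",", ""))
        totals[kind][1] += int(blocks.replace(",", ""))
    return {kind: tuple(v) for kind, v in totals.items()}

# Three record headers copied from the report above (hypothetical excerpt).
sample = """\
==41579== 48 bytes in 1 blocks are still reachable in loss record 10 of 38
==41579== 4,640 bytes in 1 blocks are definitely lost in loss record 36 of 38
==41579== 24,588 bytes in 1 blocks are still reachable in loss record 37 of 38
"""
print(summarize(sample))
```

Run over the full log, the per-kind totals should agree with the LEAK SUMMARY lines that valgrind prints itself.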
For the sake of comparison, screen-cast with ffmpeg (encoder branch) and gstreamer (JeffyCN):
ffmpeg (CPU 259.1%):
./FFmpeg-encoder/ffmpeg -device /dev/dri/card0 -framerate 30 -f kmsgrab -i - -vf 'hwdownload,format=bgr0' -c:v hevc_rkmpp_encoder -b:v 2000K screencast_30fps_4.mkv
gstreamer (CPU 49.9%):
GST_VIDEO_CONVERT_USE_RGA=1 gst-launch-1.0 -e kmssrc ! videoconvert ! video/x-raw,width=1920,height=1080,format=NV12,framerate=30/1 ! mpph265enc bps=2000000 bps-max=3000000 rc-mode=vbr ! h265parse ! matroskamux name=mux ! filesink location=screencast_30fps_4_gstreamer.mkv
@avafinger how do you measure the CPU consumption (exactly which command)? I have a prototype and I want to compare it with your results. Are you also on a 3588?
Here is the latest with commit 84466c4cb94627a4b1acd7a601e2180e63d716a1
Currently only libyuv conversion for RGB is working, so there is some room for improvement once I make this stupid RGA work.
There seems to be some alignment issue somewhere and it does not work.
2nd: I am not sure how the hwdownload filter works. If it creates a copy, then there is some performance left on the table there as well. If that is the case, the next step would be to implement the DRMPrime pixfmt in the encoder and load the kmsgrab framebuffers directly into MPP without any hwdownload. But that's the second step.
First I have to make this stupid RGA work.
ah nevermind, please ignore all, i found a way to bypass any type of conversion for RGB. So the best rga is the one which is not used.
I will ping when things mature; I think there is enough for me to deal with atm :) Thanks for your help.
@avafinger could you provide some reference CPU usage figures with gstreamer and ffmpeg-rk, if you can?
I have utilized 0-copy when encoding, and 1920x1080 @ 60 fps realtime encoding gives around 10~15% CPU load.
The code is all over the place at the moment, but i want to checkpoint to see where i am.
Measurement command while ffmpeg is running:
watch -n0.5 ps -C ffmpeg -o %cpu,%mem,cmd
Output:
%CPU %MEM CMD
12.7 0.8 ffmpeg -loglevel debug -device /dev/dri/card0 -framerate 60 -f kmsgrab -i - -c:v h264_rkmpp_encoder -b:v 20M -y screencast_60fps.mp4
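One caveat when comparing numbers: per its man page, `ps` reports %cpu as the CPU time used divided by the time the process has been running (an average over the process lifetime), whereas top and htop show usage over the last refresh interval. The two can differ for the same process. A minimal sketch of an interval-based measurement that reads /proc directly, assuming a Linux system (the helper names are illustrative):

```python
import os
import time

def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name sits in parentheses and may contain spaces,
    # so split only what follows the last ')'.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is the state (field 3); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])

def interval_cpu_percent(ticks_before, ticks_after, seconds, hz):
    """CPU usage over a sampling window, the way top/htop report it."""
    return 100.0 * (ticks_after - ticks_before) / hz / seconds

def sample(pid, seconds=1.0):
    """Measure the CPU usage of process `pid` over `seconds` of wall time."""
    hz = os.sysconf("SC_CLK_TCK")
    with open(f"/proc/{pid}/stat") as f:
        before = cpu_ticks(f.read())
    time.sleep(seconds)
    with open(f"/proc/{pid}/stat") as f:
        after = cpu_ticks(f.read())
    return interval_cpu_percent(before, after, seconds, hz)
```

Calling `sample(pid)` with the ffmpeg PID while the capture is running gives a figure comparable to top; values above 100% mean more than one core is busy.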
@hbiyik ,
I use overall %cpu usage (peak) for the process from htop and top, which is a little bit more. Using your command I get 250% CPU usage for the ffmpeg process.
I can't run your command on my setup; I think I have RGB and not NV12 on my 1920x1080 monitor. Looks like you are on a 4K display. So I have to convert RGB to NV12, which seems to be done in software. If you have a spare 1080p display around, try with that. I don't have a 4K display where I am with the board, but I can test it on the weekend.
FFmpeg-encoder/ffmpeg -loglevel debug -device /dev/dri/card0 -framerate 30 -f kmsgrab -i - -c:v h264_rkmpp_encoder -b:v 20M -y screencast_30fps.mp4
ffmpeg version 6.0 Copyright (c) 2000-2023 the FFmpeg developers
built with gcc 11 (Ubuntu 11.3.0-1ubuntu1~22.04.1)
configuration: --prefix=/usr --disable-libopenh264 --disable-vaapi --disable-vdpau --disable-decoder=h264_v4l2m2m --disable-decoder=vp8_v4l2m2m --disable-decoder=mpeg2_v4l2m2m --disable-decoder=mpeg4_v4l2m2m --disable-libxvid --disable-libx264 --disable-libx265 --enable-rkmpp --enable-nonfree --enable-gpl --enable-version3 --enable-libmp3lame --enable-libpulse --enable-libv4l2 --enable-libdrm --enable-libxml2 --enable-librtmp --enable-libfreetype --enable-openssl --enable-opengl --enable-libopus --enable-libvorbis --disable-shared --enable-decoder='aac,ac3,flac' --extra-cflags=-I/usr/src/linux-headers-5.10.110-rk3588/include --disable-cuvid --disable-stripping --disable-optimizations --extra-cflags=-Og --extra-cflags=-fno-omit-frame-pointer --enable-debug=3 --extra-cflags=-fno-inline
libavutil 58. 2.100 / 58. 2.100
libavcodec 60. 3.100 / 60. 3.100
libavformat 60. 3.100 / 60. 3.100
libavdevice 60. 1.100 / 60. 1.100
libavfilter 9. 3.100 / 9. 3.100
libswscale 7. 1.100 / 7. 1.100
libswresample 4. 10.100 / 4. 10.100
libpostproc 57. 1.100 / 57. 1.100
Splitting the commandline.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument 'debug'.
Reading option '-device' ... matched as AVOption 'device' with argument '/dev/dri/card0'.
Reading option '-framerate' ... matched as AVOption 'framerate' with argument '30'.
Reading option '-f' ... matched as option 'f' (force format) with argument 'kmsgrab'.
Reading option '-i' ... matched as input url with argument '-'.
Reading option '-c:v' ... matched as option 'c' (codec name) with argument 'h264_rkmpp_encoder'.
Reading option '-b:v' ... matched as option 'b' (video bitrate (please use -b:v)) with argument '20M'.
Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
Reading option 'screencast_30fps.mp4' ... matched as output url.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option loglevel (set logging level) with argument debug.
Applying option y (overwrite output files) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url -.
Applying option f (force format) with argument kmsgrab.
Successfully parsed a group of options.
Opening an input file: -.
[AVHWDeviceContext @ 0x55a75e0760] Opened DRM device /dev/dri/card0: driver rockchip version 3.0.0.
[kmsgrab @ 0x55a75e0390] Plane 54: CRTC 68 FB 232.
[kmsgrab @ 0x55a75e0390] Using plane 54 to locate framebuffers.
[kmsgrab @ 0x55a75e0390] Template framebuffer is 232: 1920x1080 format 34325258 modifier 0 flags 2.
[kmsgrab @ 0x55a75e0390] Format is bgr0, from DRM format 34325258 modifier 0.
[kmsgrab @ 0x55a75e0390] All info found
[kmsgrab @ 0x55a75e0390] rfps: 29.083333 0.017014
[kmsgrab @ 0x55a75e0390] rfps: 29.166667 0.013063
[kmsgrab @ 0x55a75e0390] rfps: 29.250000 0.009633
Last message repeated 1 times
[kmsgrab @ 0x55a75e0390] rfps: 29.333333 0.006725
Last message repeated 1 times
[kmsgrab @ 0x55a75e0390] rfps: 29.416667 0.004339
[kmsgrab @ 0x55a75e0390] rfps: 29.500000 0.002474
[kmsgrab @ 0x55a75e0390] rfps: 29.583333 0.001131
Last message repeated 1 times
[kmsgrab @ 0x55a75e0390] rfps: 29.666667 0.000309
Last message repeated 1 times
[kmsgrab @ 0x55a75e0390] rfps: 29.750000 0.000009
Last message repeated 1 times
[kmsgrab @ 0x55a75e0390] rfps: 29.833333 0.000230
Last message repeated 1 times
[kmsgrab @ 0x55a75e0390] rfps: 29.916667 0.000973
[kmsgrab @ 0x55a75e0390] rfps: 30.000000 0.002238
[kmsgrab @ 0x55a75e0390] rfps: 59.000000 0.009895
[kmsgrab @ 0x55a75e0390] rfps: 60.000000 0.008951
[kmsgrab @ 0x55a75e0390] rfps: 29.970030 0.001723
[kmsgrab @ 0x55a75e0390] rfps: 59.940060 0.006891
Input #0, kmsgrab, from 'fd:':
Duration: N/A, start: 1686748342.595684, bitrate: N/A
Stream #0:0, 21, 1/1000000: Video: wrapped_avframe, 1 reference frame, drm_prime, 1920x1080, 0/1, 29.75 tbr, 1000k tbn
Successfully opened the file.
Parsing a group of options: output url screencast_30fps.mp4.
Applying option c:v (codec name) with argument h264_rkmpp_encoder.
Applying option b:v (video bitrate (please use -b:v)) with argument 20M.
Successfully parsed a group of options.
Opening an output file: screencast_30fps.mp4.
[file @ 0x55a75e4150] Setting default whitelist 'file,crypto,data'
Successfully opened the file.
Stream mapping:
Stream #0:0 -> #0:0 (wrapped_avframe (native) -> h264 (h264_rkmpp_encoder))
[vost#0:0/h264_rkmpp_encoder @ 0x55a75e3610] cur_dts is invalid [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
detected 8 logical cores
[graph 0 input from stream 0:0 @ 0x55a75ebf70] Setting 'video_size' to value '1920x1080'
[graph 0 input from stream 0:0 @ 0x55a75ebf70] Setting 'pix_fmt' to value '179'
[graph 0 input from stream 0:0 @ 0x55a75ebf70] Setting 'time_base' to value '1/1000000'
[graph 0 input from stream 0:0 @ 0x55a75ebf70] Setting 'pixel_aspect' to value '0/1'
[graph 0 input from stream 0:0 @ 0x55a75ebf70] Setting 'frame_rate' to value '119/4'
[graph 0 input from stream 0:0 @ 0x55a75ebf70] w:1920 h:1080 pixfmt:drm_prime tb:1/1000000 fr:119/4 sar:0/1
[format @ 0x55a75ec3e0] Setting 'pix_fmts' to value 'nv12'
[auto_scale_0 @ 0x55a75ed2e0] w:iw h:ih flags:'' interl:0
[format @ 0x55a75ec3e0] auto-inserting filter 'auto_scale_0' between the filter 'Parsed_null_0' and the filter 'format'
Impossible to convert between the formats supported by the filter 'Parsed_null_0' and the filter 'auto_scale_0'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
[AVIOContext @ 0x55a75e41d0] Statistics: 0 bytes written, 0 seeks, 0 writeouts
Terminating demuxer thread 0
Conversion failed!
Note: I am not using your latest code!
which makes sense, my output was on the local git commits where RGB + DRMPRIME is implemented in the encoder.
And you are right
1) In your version, hwdownload first copies the frame from the XGBR framebuffer to an AVFrame.
2) Then it converts XGBR to NV12 in software.
3) Then the NV12 AVFrame is copied back into an MppFrame to be put in the encoder buffer.
So 2 additional copies and 1 SW conversion really eat up resources.
In my case there is no hwdownload, because kmsgrab already outputs DRM_PRIME frames and I can import them directly into the MPP encoder input buffer, so there are no copies and no conversion, neither in HW (RGA) nor in SW.
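To put rough numbers on the three steps above, here is a minimal back-of-the-envelope sketch. It assumes a 1920x1080 capture at 30 fps, a 32-bit RGB framebuffer and 12-bit NV12, and it only counts the bytes the CPU has to write per second. The function names are made up for this illustration and are not part of FFmpeg or MPP.

```python
# Rough estimate of CPU-written bytes per second for the hwdownload path.
# Assumptions (not measured): 1920x1080, 30 fps, 32 bpp framebuffer, 12 bpp NV12.
WIDTH, HEIGHT, FPS = 1920, 1080, 30


def frame_bytes(width, height, bits_per_pixel):
    """Size of one tightly packed frame in bytes."""
    return width * height * bits_per_pixel // 8


RGB32_FRAME = frame_bytes(WIDTH, HEIGHT, 32)  # framebuffer copy target
NV12_FRAME = frame_bytes(WIDTH, HEIGHT, 12)   # conversion output and upload


def hwdownload_path_bytes_per_second(fps=FPS):
    """Step 1 (download) + step 2 (sw conversion output) + step 3 (upload)."""
    return (RGB32_FRAME + NV12_FRAME + NV12_FRAME) * fps


def zero_copy_path_bytes_per_second(fps=FPS):
    """The DRM_PRIME buffer is imported by reference, so nothing is copied."""
    return 0


print(hwdownload_path_bytes_per_second() / 1e6, "MB/s written by the CPU")
print(zero_copy_path_bytes_per_second(), "bytes/s on the zero-copy path")
```

This ignores the reads and the per-pixel conversion math, so the real cost of the hwdownload path is higher than this figure.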
I have not pushed those changes yet because they are too immature, but thanks for the tests, the direction seems to be correct: 12% load vs 250%.
No need for further testing with this I guess, I will refactor this thing several more times anyway.
But I wonder how much CPU gstreamer or ffmpeg-rk would consume for comparison. I think gstreamer would be high, but ffmpeg-rk was doing something similar to what I have done, with 1 additional copy, if I haven't misunderstood its code.
Here is the first preview: with the commit below you can do 0-copy screencasting.
All I have left is to work on transcoding. Other than that, the below should be fine.
https://github.com/hbiyik/FFmpeg/commit/9631356cce8b804a827b676c063382e81fc03c1a
check out the CPU load :) 13% with 1080p 60fps casting
https://github.com/hbiyik/FFmpeg/assets/24766436/6d7f80ca-1aab-4b6b-aefb-1605a07e80ac
@hbiyik Tested your implementation. It works somehow.
I can replay it on Intel Box, but not on rk3588. Attached h265 and h264.
Can you tell me what is your command? screencast_30fps_1920x1080.zip
%CPU usage 6.9% screencast_30fps_2.mp4.zip
Update: I can play with gstreamer on rk3588 but not with ffplay (ffplay works with virtually every movie i tried so far)
Update 2 mkv can be played with gstreamer, mp4 not.
Update 3 NV12, CPU% is 4%
./FFmpeg-encoder/ffmpeg -loglevel debug -framerate 30 -f kmsgrab -i - -c:v hevc_rkmpp_encoder -b:v 2000K -y screencast_30fps_3.mkv
I can replay it on Intel Box, but not on rk3588.
I can play with gstreamer on rk3588 but not with ffplay (ffplay works with virtually every movie i tried so far)
I think it is this issue: https://github.com/hbiyik/FFmpeg/issues/8. I need to spend some time on it, but my guess is that something is wrong with the timestamps when encoding.
I also need to work on the transcoding parts. It should not work well enough for transcoding as of now.
But capturing with KMS and the decoder are done.
btw
NV12, CPU% is 4
this should be bgr0 not nv12
yes, for the mkv that's the issue. for the mp4 maybe this can help:
rm_release_ver of this libmali is 'g6p0-01eac0', rk_so_ver is '6'.
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f60000c10] Format mov,mp4,m4a,3gp,3g2,mj2 detected only with low score of 1, misdetection possible!
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f60000c10] moov atom not found
./screencast_30fps_4.mp4: Invalid data found when processing input
any idea how to open hdmiin to capture? getting some funny errors with ffmpeg
I guess ffmpeg cannot handle multiplanar V4L2 capture devices. That's kind of odd...
below works for my usb webcam.
ffmpeg -f v4l2 -pix_fmt yuv420p -y -use_libv4l2 1 -i /dev/video1 -c:v hevc_rkmpp_encoder -b:v 20M output.mkv
but does not work for video0 which is multi planar capture device.
ffmpeg -f v4l2 -pix_fmt yuv420p -y -use_libv4l2 1 -i /dev/video0 -c:v hevc_rkmpp_encoder -b:v 20M output.mkv
https://github.com/FFmpeg/FFmpeg/blob/be91c5c80d8c66a5a39d52c21f9094c9c455ec84/libavdevice/v4l2.c#L184 This should be V4L2_CAP_VIDEO_CAPTURE_MPLANE.
See "Video Capture" capability for usb webcam
[alarm@alarm testfiles]$ v4l2-ctl --all -d /dev/video1
Driver Info:
Driver name : uvcvideo
Card type : Logitech Webcam C925e
Bus info : usb-xhci-hcd.11.auto-1
Driver version : 5.10.110
Capabilities : 0x84a00001
Video Capture
Metadata Capture
Streaming
Extended Pix Format
Device Capabilities
Device Caps : 0x04200001
Video Capture
Streaming
Extended Pix Format
See "Video Capture Multiplanar" capability for hdmiin
[alarm@alarm testfiles]$ v4l2-ctl --all -d /dev/video0
Driver Info:
Driver name : rk_hdmirx
Card type : rk_hdmirx
Bus info : fdee0000.hdmirx-controller
Driver version : 5.10.110
Capabilities : 0x84201000
Video Capture Multiplanar
Streaming
Extended Pix Format
Device Capabilities
Device Caps : 0x04201000
Video Capture Multiplanar
Streaming
Extended Pix Format
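The two capability masks above can be decoded by hand to see exactly which bit differs. This is a small sketch using the capability bit values from linux/videodev2.h; the helper names are made up for this example. It only decodes the bits that appear in the two dumps.

```python
# V4L2 capability bits (values as defined in linux/videodev2.h).
V4L2_CAPS = {
    0x00000001: "Video Capture",              # V4L2_CAP_VIDEO_CAPTURE
    0x00001000: "Video Capture Multiplanar",  # V4L2_CAP_VIDEO_CAPTURE_MPLANE
    0x00200000: "Extended Pix Format",        # V4L2_CAP_EXT_PIX_FORMAT
    0x00800000: "Metadata Capture",           # V4L2_CAP_META_CAPTURE
    0x04000000: "Streaming",                  # V4L2_CAP_STREAMING
    0x80000000: "Device Capabilities",        # V4L2_CAP_DEVICE_CAPS
}


def decode_caps(mask):
    """Return the names of the known capability bits set in mask."""
    return [name for bit, name in sorted(V4L2_CAPS.items()) if mask & bit]


def is_single_planar_capture(mask):
    return bool(mask & 0x00000001)


def is_multi_planar_capture(mask):
    return bool(mask & 0x00001000)


WEBCAM = 0x84A00001  # /dev/video1, uvcvideo
HDMIRX = 0x84201000  # /dev/video0, rk_hdmirx

print(decode_caps(WEBCAM))
print(decode_caps(HDMIRX))
```

A capture check that only tests the single-planar bit accepts the webcam and rejects the hdmirx device, which matches the behaviour described above.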
And i found out a work that supports multiplanar v4l2 devices https://github.com/rigaya/FFmpeg/commits/v4l2-multiplanar
@rigaya, do you have any plan to mainline those changes to ffmpeg?
@hbiyik If you are following your open issue there on Jeffy, there has been some improvement, can you try that on your board if you have 8GB/16GB?
I only have a 4GB model, but isn't this problem already solved if you change the allocator in mpp to specifically use "/dev/dma_heap/system-dma32"? Then the allocated space will never exceed the 4GB address range.
https://github.com/rockchip-linux/mpp/issues/356#issuecomment-1427499145
Isn't this the same issue?
but isn't this problem already solved
I have all those workarounds + the patch to make RGA3 use dma32, but for this specific situation (i think the conversion) the limit is back. I have another kernel to test and check.
Does your implementation make use of RGA3 or libyuv? I don't get the 4GB limit error.
I think my implementation should also use RGA3 when:
1) the input video is NV15 (HDR) and the actual output is yuv420p or NV12.
2) the input video is NV15 (HDR) or NV16 (yuv422sp) and the output pixfmt is NV12.
But in any case where RGA3 fails, I fall back to libyuv, so it is quite transparent and efficient, almost not noticeable.
I think you can test with "FFMPEG_RKMPP_PIXFMT=NV12 ffplay Poltergeist_DOM_TrailerD-1080p-HDTN.mp4".
I know this specific file is NV16. If it fails to RGA it, then it will spit out a message like "RGA failed falling back to soft conversion"
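The selection and fallback described above can be sketched roughly as follows. This is only an illustration of the control flow in Python: the real code is C inside the FFmpeg fork, and every name here (RGA_PAIRS, wants_rga, convert, and the two converter callbacks) is hypothetical.

```python
# Hypothetical sketch of "try RGA first, fall back to software conversion".
# The format pairs are taken from the two conditions listed above.
RGA_PAIRS = {
    ("nv15", "yuv420p"),
    ("nv15", "nv12"),
    ("nv16", "nv12"),
}


def wants_rga(src_fmt, dst_fmt):
    """True when the (input, output) pair is one that should go through RGA."""
    return (src_fmt.lower(), dst_fmt.lower()) in RGA_PAIRS


def convert(frame, src_fmt, dst_fmt, rga_convert, soft_convert, log=print):
    """Convert a frame, preferring the hardware path when it applies.

    rga_convert and soft_convert are caller-supplied callables; the hardware
    one is assumed to raise RuntimeError on failure.
    """
    if wants_rga(src_fmt, dst_fmt):
        try:
            return rga_convert(frame, src_fmt, dst_fmt)
        except RuntimeError:
            log("RGA failed falling back to soft conversion")
    return soft_convert(frame, src_fmt, dst_fmt)
```

Because the caller always gets a converted frame back, a failing RGA path only shows up as the log message and some extra CPU load.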
Ahh, i get it now.
With @amazingfate workaround it uses RGA2 and not RGA3:
[ 483.762409] rga_debugger: mmu: mmu_flag=80000521 en=1
[ 483.762419] rga_debugger: alpha: rop_mode = 0
[ 483.762431] rga_debugger: yuv2rgb mode is 0
[ 483.762443] rga_debugger: set core = 0, priority = 0, in_fence_fd = 0
[ 483.762468] rga_policy: start policy on core = 1
[ 483.762487] rga_policy: core = 1, break on rga_check_dst
[ 483.762499] rga_policy: start policy on core = 2
[ 483.762510] rga_policy: core = 2, break on rga_check_dst
[ 483.762520] rga_policy: start policy on core = 4
[ 483.762534] rga_policy: optional_cores = 4
[ 483.762546] rga_policy: assign core: 4
[ 483.764833] rga2_reg: render_mode:bitblt,bitblit_mode=0,rotate_mode:0
[ 483.764856] rga2_reg: src: y=0 uv=1f6800 v=274200 aw=1888 ah=1062 vw=1920 vh=1072
[ 483.764871] rga2_reg: src: xoff=0 yoff=0 format=YCbCr422SP
[ 483.764889] rga2_reg: dst: y=0 uv=1fe000 v=27d800 aw=1888 ah=1062 vw=1920 vh=1088
[ 483.764902] rga2_reg: dst: xoff=0 yoff=0 format=YCbCr420P
[ 483.764917] rga2_reg: mmu: src=01 src1=00 dst=01 els=00
[ 483.764942] rga2_reg: alpha: flag 0 mode0=0 mode1=0
[ 483.764953] rga2_reg: blend mode is no blend
[ 483.764962] rga2_reg: yuv2rgb mode is 0
[ 483.765030] rga_job: job: reqeust_id = 616, priority = 0, core = 4
[ 483.769015] rga_job: request[616] finished 1 failed 0
[ 483.804024] rga: Blit mode: request id = 617
[ 483.804043] rga_debugger: render_mode = 0, bitblit_mode=0, rotate_mode = 0
[ 483.804064] rga_debugger: src: y = 34 uv = 0 v = 1f6800 aw = 1888 ah = 1062 vw = 1920 vh = 1072
[ 483.804078] rga_debugger: src: xoff = 0, yoff = 0, format = 0x8, rd_mode = 1
[ 483.804096] rga_debugger: dst: y=0 uv=7f55fe8000 v=7f561e6000 aw=1888 ah=1062 vw=1920 vh=1088
[ 483.804109] rga_debugger: dst: xoff = 0, yoff = 0, format = 0xb, rd_mode = 1
[ 483.804121] rga_debugger: mmu: mmu_flag=80000521 en=1
[ 483.804132] rga_debugger: alpha: rop_mode = 0
[ 483.804142] rga_debugger: yuv2rgb mode is 0
[ 483.804154] rga_debugger: set core = 0, priority = 0, in_fence_fd = 0
[ 483.804344] rga_policy: start policy on core = 1
[ 483.804364] rga_policy: core = 1, break on rga_check_dst
[ 483.804376] rga_policy: start policy on core = 2
[ 483.804388] rga_policy: core = 2, break on rga_check_dst
[ 483.804398] rga_policy: start policy on core = 4
[ 483.804413] rga_policy: optional_cores = 4
[ 483.804425] rga_policy: assign core: 4
[ 483.806468] rga2_reg: render_mode:bitblt,bitblit_mode=0,rotate_mode:0
[ 483.806489] rga2_reg: src: y=0 uv=1f6800 v=274200 aw=1888 ah=1062 vw=1920 vh=1072
[ 483.806505] rga2_reg: src: xoff=0 yoff=0 format=YCbCr422SP
[ 483.806522] rga2_reg: dst: y=0 uv=1fe000 v=27d800 aw=1888 ah=1062 vw=1920 vh=1088
[ 483.806537] rga2_reg: dst: xoff=0 yoff=0 format=YCbCr420P
[ 483.806560] rga2_reg: mmu: src=01 src1=00 dst=01 els=00
[ 483.806573] rga2_reg: alpha: flag 0 mode0=0 mode1=0
[ 483.806584] rga2_reg: blend mode is no blend
[ 483.806594] rga2_reg: yuv2rgb mode is 0
[ 483.806646] rga_job: job: reqeust_id = 617, priority = 0, core = 4
[ 483.810372] rga_job: request[617] finished 1 failed 0
With the workaround + patch it uses RGA3 and hits the 4GB limit. I will test a few things before drawing any conclusions.
Maybe official mpp + patch will work.
And i found out a work that supports multiplanar v4l2 devices https://github.com/rigaya/FFmpeg/commits/v4l2-multiplanar @rigaya, do you have any plan to mainline those changes to ffmpeg?
No, as this branch is focused on supporting RK3588 HDMI In for rkmppenc, which makes it support v4l2 RGB, YUV420, YUV444 (all 8bit) input on ROCK 5B HDMI In, then hw resize by librga, then hw encode by mpp.
Rock 5B HDMI In (tested with Ubuntu 20.04) v4l2 has some problems other than the multi-planar format: it does not respond properly to some v4l2 commands which ffmpeg uses (such as VIDIOC_S_INPUT, VIDIOC_G_INPUT, VIDIOC_G_PARM), causing further errors in ffmpeg. The branch also includes workarounds for these, which I think are required to handle RK3588 HDMI In, but might not be appropriate for generic use.
The main change of your branch is in the files libavdevice/v4l2-common.c and libavdevice/v4l2.c. I think it should be possible to use it on obs, and use the obs-gstreamer plugin to do the encoding job.
@rigaya ok understood, also I think you forgot to commit the change to ignore VIDIOC_G_INPUT, this is additionally required to make hdmi input work.
See: https://github.com/hbiyik/FFmpeg/commit/48d00fc6bfb3b1db3bd53eb9378616a520af0b0f
@rigaya ok understood, also I think you forgot to commit the change to ignore VIDIOC_G_INPUT, this is additionally required to make hdmi input work.
Actually, it's not forgotten, it's already there in my branch in https://github.com/rigaya/FFmpeg/commit/a91b4136cccf69954cef70e3f5fd3da3aabbc15a, I simply set "channel" and "ignore_input_error" via the command line. Of course, it's effectively the same as your commit.
@rigaya
When I try to change the input format with either -input_format or -pixel_format, the kernel always spits out the error below:
fdee0000.hdmirx-controller: hdmirx_s_fmt_vid_cap_mplane: err, set_fmt:0x2036315a, cur_fmt:0x3432564e!
When I look at the source of the kernel driver, at the line below, I see that the code can never change the pixel format, because it returns an error if the requested format is not the same as the current format (wtf code).
Were you ever able to change the input format of the hdmirx device? Maybe instead of changing the format, there is a way to directly initialize with this specific format.
edit: my kernel is at: Linux alarm 5.10.110-37-rockchip-g74457be0716d #rockchip SMP Mon Feb 6 09:18:21 UTC 2023 aarch64 GNU/Linux
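The two hex values in the kernel message are V4L2 fourcc codes, which pack four ASCII characters into a little-endian 32-bit integer. A small sketch to decode them (the helper names are made up for this example):

```python
import struct


def fourcc_to_str(code):
    """Decode a V4L2 fourcc (little-endian u32) into its 4-character name."""
    return struct.pack("<I", code).decode("ascii")


def str_to_fourcc(name):
    """Encode a 4-character format name back into the u32 fourcc value."""
    return struct.unpack("<I", name.encode("ascii"))[0]


SET_FMT = 0x2036315A  # format that was requested in the failing S_FMT call
CUR_FMT = 0x3432564E  # format the driver currently reports

print(repr(fourcc_to_str(SET_FMT)))  # 'Z16 '
print(repr(fourcc_to_str(CUR_FMT)))  # 'NV24'
```

So the driver was sitting on NV24 while the rejected request asked for 'Z16 ', which is consistent with the later observation that the device is stuck on NV24.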
No, I also was not able to change the input format from ffmpeg or even with v4l2-ctl.
I'm actually converting each format to YUV420 in rkmppenc to be able to handle all 4 formats (RGB, YUV420, 422, 444).
I see, it also looks like a bug in the driver to me. BTW, at least vepu580 can directly accept NV24 aka YUV444SP, so I am thinking of using NV24 directly instead of using RGA. This will cause issues with vepu54x. I don't know which SoCs use vepu54x, but RK3588 at least uses vepu580.
In my case the v4l hdmiin driver always defaults to NV24 and it is not possible to change it afterwards; I am not sure whether it is related to the attached HDMI device or not. I have support for nv12, nv16 and bgr0 as encoder input formats, except for the one format the v4l driver is stuck with, which is NV24 :). Thanks Rockchip for making things needlessly harder ..
Does this work for you? v4l2-ctl --get-edid
I was able to switch from NV24 to BGR and back when I could get the EDID. /dev/video20 is my hdmirx in, yours will be /dev/video0 if you don't have the camera attached.
v4l2-ctl -d /dev/video20 --get-edid > edid.txt
v4l2-ctl -d /dev/video20 --set-edid file=edid.txt,format=hex,ycbcr422
and
v4l2-ctl -d /dev/video20 --set-edid file=edid.txt,format=hex,ycbcr444
This seems to be working. I am no expert, but I guess this is something like telling the transmitter on the other end of the HDMI link to use this specific format (total guesswork). The moment I ask ffmpeg to use NV16 (hex,ycbcr422) it falls back to NV24 again (hex,ycbcr444).
Is there a way to force rgb888?
Do you mean this format? v4l2-ctl -d /dev/video20 --all
Driver Info:
Driver name : rk_hdmirx
Card type : rk_hdmirx
Bus info : fdee0000.hdmirx-controller
Driver version : 5.10.110
Capabilities : 0x84201000
Video Capture Multiplanar
Streaming
Extended Pix Format
Device Capabilities
Device Caps : 0x04201000
Video Capture Multiplanar
Streaming
Extended Pix Format
Priority: 2
DV timings:
Active width: 1920
Active height: 1080
Total width: 2200
Total height: 1125
Frame format: progressive
Polarities: -vsync -hsync
Pixelclock: 148492000 Hz (60.00 frames per second)
Horizontal frontporch: 84
Horizontal sync: 48
Horizontal backporch: 148
Vertical frontporch: 4
Vertical sync: 5
Vertical backporch: 36
Standards:
Flags:
DV timings capabilities:
Minimum Width: 640
Maximum Width: 4096
Minimum Height: 480
Maximum Height: 2160
Minimum PClock: 20000000
Maximum PClock: 600000000
Standards: CTA-861
Capabilities: Interlaced, Progressive
Format Video Capture Multiplanar:
Width/Height : 1920/1080
Pixel Format : 'BGR3' (24-bit BGR 8-8-8)
Field : None
Number of planes : 1
Flags : premultiplied-alpha, 0x000000fe
Colorspace : Unknown (0x106b5bb0)
Transfer Function : Unknown (0x000000b8)
YCbCr/HSV Encoding: Unknown (0x000000ff)
Quantization : Default
Plane 0 :
Bytes per Line : 5760
Size Image : 6220800
Digital Video Controls
power_present 0x00a00964 (bitmask): max=0x00000001 default=0x00000000 value=1 flags=read-only
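The numbers in this dump can be cross-checked against each other. This is just arithmetic on the values printed above, with variable names chosen for this example:

```python
# Values copied from the v4l2-ctl --all output above.
active_w, active_h = 1920, 1080
h_front, h_sync, h_back = 84, 48, 148
v_front, v_sync, v_back = 4, 5, 36
pixelclock_hz = 148_492_000
bytes_per_pixel = 3  # 'BGR3' is 24-bit BGR 8-8-8

# Total raster = active area plus blanking.
total_w = active_w + h_front + h_sync + h_back
total_h = active_h + v_front + v_sync + v_back

# Frame rate follows from the pixel clock divided by the total raster size.
fps = pixelclock_hz / (total_w * total_h)

# Buffer layout for a single-plane packed format.
bytes_per_line = active_w * bytes_per_pixel
size_image = bytes_per_line * active_h

print(total_w, total_h)            # 2200 1125
print(round(fps, 2))               # 60.0
print(bytes_per_line, size_image)  # 5760 6220800
```

All derived values agree with the "Total width/height", "Bytes per Line" and "Size Image" fields reported by the driver.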
I know the basics... :)
EDID options:
--set-edid pad=<pad>[,type=<type>|file=<file>][,format=<fmt>][modifiers]
<pad> is the input index for which to set the EDID.
<type> can be one of:
list: list all EDID types
vga: Base Block supporting VGA interface (1920x1200p60)
dvid: Base Block supporting DVI-D interface (1920x1200p60)
hdmi: CTA-861 with HDMI support up to 1080p60
hdmi-4k-170mhz: CTA-861 with HDMI support up to 1080p60 or 4kp30 4:2:0
hdmi-4k-300mhz: CTA-861 with HDMI support up to 4kp30
hdmi-4k-600mhz: CTA-861 with HDMI support up to 4kp60
hdmi-4k-600mhz-with-displayid: Block Map Extension Block, CTA-861 with
HDMI support up to 4kp60, DisplayID Extension Block
displayport: DisplayID supporting a DisplayPort interface (1920x1200)
displayport-with-cta861: DisplayID supporting a DisplayPort interface,
CTA-861 Extension Block (1080p60)
If <file> is '-', then the data is read from stdin, otherwise it is
read from the given file. The file format must be in hex as in get-edid.
The 'type' or 'file' arguments are mutually exclusive. One of the two
must be specified.
<fmt> is one of:
hex: hex numbers in ascii text (default)
raw: raw binary EDID content
[modifiers] is a comma-separate list of EDID modifiers:
CTA-861 Header modifiers:
underscan: toggle the underscan bit.
audio: toggle the audio bit.
ycbcr444: toggle the YCbCr 4:4:4 bit.
ycbcr422: toggle the YCbCr 4:2:2 bit.
Speaker Allocation Data Block modifiers:
fl-fr: Front Left/Right.
lfe: Low Frequency Effects.
fc: Front Center.
bl-br: Back Left/Right.
bc: Back Center.
rlc-frc: Front Left/Right of Center.
rlc-rrc: Rear Left/Right of Center.
flw-frw: Front Left/Right Wide.
tpfl-tpfr: Top Front Left/Right.
tpc: Top Center.
tpfc: Top Front Center.
ls-rs: Left/Right Surround.
lfe2: Low Frequency Effects 2.
tpbc: Top Back Center.
sil-sir: Side Left/Right
tpsil-tpsir: Top Side Left/Right.
tpbl-tpbr: Top Back Left/Right.
btfc: Bottom Front Center.
btfl-btbr: Bottom Front Left/Right.
tpls-tprs: Top Left/Right Surround.
HDMI Vendor-Specific Data Block modifiers:
pa=<pa>: change the physical address.
y444: toggle the YCbCr 4:4:4 Deep Color bit.
30-bit: toggle the 30 bits/pixel bit.
36-bit: toggle the 36 bits/pixel bit.
48-bit: toggle the 48 bits/pixel bit.
graphics: toggle the Graphics Content Type bit.
photo: toggle the Photo Content Type bit.
cinema: toggle the Cinema Content Type bit.
game: toggle the Game Content Type bit.
HDMI Forum Vendor-Specific Data Block modifiers:
scdc: toggle the SCDC Present bit.
CTA-861 Video Capability Descriptor modifiers:
qy: toggle the QY YCC Quantization Range bit.
qs: toggle the QS RGB Quantization Range bit.
s-pt=<0-3>: set the PT Preferred Format Over/underscan bits.
s-it=<0-3>: set the IT Over/underscan bits.
s-ce=<0-3>: set the CE Over/underscan bits.
CTA-861 Colorimetry Data Block modifiers:
xvycc-601: toggle the xvYCC 601 bit.
xvycc-709: toggle the xvYCC 709 bit.
sycc: toggle the sYCC 601 bit.
opycc: toggle the opYCC 601 bit.
oprgb: toggle the opRGB bit.
bt2020-rgb: toggle the BT2020 RGB bit.
bt2020-ycc: toggle the BT2020 YCC bit.
bt2020-cycc: toggle the BT2020 cYCC bit.
dci-p3: toggle the DCI-P3 bit.
CTA-861 HDR Static Metadata Data Block modifiers:
sdr: toggle the Traditional gamma SDR bit.
hdr: toggle the Traditional gamma HDR bit.
smpte2084: toggle the SMPTE ST 2084 bit.
hlg: toggle the Hybrid Log-Gamma bit.
In my case i have only this:
v4l2-ctl -d /dev/video20 --list-formats
ioctl: VIDIOC_ENUM_FMT
Type: Video Capture Multiplanar
[0]: 'BGR3' (24-bit BGR 8-8-8)
[1]: 'NV24' (Y/CbCr 4:4:4)
[2]: 'NV16' (Y/CbCr 4:2:2)
[3]: 'NV12' (Y/CbCr 4:2:0)
I get an error if I try: v4l2-ctl --set-edid file=edid.txt,format=hex,xrgb888 or v4l2-ctl --set-edid file=edid.txt,format=hex,xrgb8888
HDMI in Support FMT: RGB888/YUV420/YUV422/YUV444 8bit
I believe it means 24bit rgb888.
@avafinger the h264 encoder problem is solved at 20985fa466ef6a07c73b53393356748d95a75fb7, now the encoded h264 output files can be played back with mpp. Hevc still has issues though, I think this might be an mpp issue.
do you know if hevc encoder works in gstreamer? or somewhere else..
hevc encoder works on gstreamer
I can confirm it's working for h264. A note about the colors, it looks washed out.
i did not notice such a thing, here is the ss for a transcode and screencast in my case. Am i not seeing something?
left is screen, right is recording
left is original, right is transcoded
You can see it on "Big Buck Bunny, Sun Flower......" title. The Left (original) is white, transcoded is grey. Maybe the problem is only with the white color.
Seems to be color saturation.
Original: Transcoded
You can see it on "Big Buck Bunny, Sun Flower......" title. The Left (original) is white, transcoded is grey. Maybe the problem is only with the white color.
I think thats because the title of the player was not active, but i saw what you mean in your example
vs
and I think that's because of the encoding parameters. You can change these variables:
the bitrate: -b
gop size: -g
profile: -profile
level: -level
ratecontrol: -rc_mode
coder type: -coder
8x8 transform: -8x8dc
I think ffmpeg -h encoder=h264_rkmpp_encoder would give a hint about what the configurable values are.
If you set the mpp_cfg_debug=1 environment variable, your dmesg will print the actual encoder configuration sent to the hardware. That way you can also compare several implementations, like gstreamer, mpp_enc_test, or ffmpeg.
I think especially the rc_mode could be the culprit; it should default to AVBR, which is not the best mode.
Here are the mpp params from gstreamer without passing any params, h264, matroska. The encoded file is worse than ffmpeg. Can you suggest some params?
video_kms_1920x1080_30fps_h264.mkv.zip
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 prep:format keep 0
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 prep:width update 1280 -> 1920
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 prep:height update 720 -> 1080
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 prep:hor_stride update 1280 -> 1920
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 prep:ver_stride update 720 -> 1088
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 rc:fps_in_flex keep 0
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 rc:fps_in_num keep 30
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 rc:fps_in_denorm keep 1
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 rc:fps_out_flex keep 0
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 rc:fps_out_num keep 30
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 rc:fps_out_denorm keep 1
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 h264:qp_init keep 26
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 h264:qp_max update 48 -> 28
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 h264:qp_min update 8 -> 4
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 h264:qp_step update 0 -> 8
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 h264:profile update 66 -> 100
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 h264:level update 31 -> 40
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 h264:trans8x8 update 0 -> 1
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 h264:cabac_en update 0 -> 1
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 h264:cabac_idc keep 0
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 rc:gop update 60 -> 30
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set u32 rc:max_reenc_times keep 1
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 rc:mode update 0 -> 1
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 rc:bps_target update 2000000 -> 7776000
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 rc:bps_max update 2500000 -> 8262000
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 rc:bps_min update 1500000 -> 7290000
Jun 23 12:17:01 rock5b mpp[51045]: mpp_enc: MPP_ENC_SET_RC_CFG bps 7776000 [7290000 : 8262000] fps [30:30] gop 30
Jun 23 12:17:01 rock5b mpp[51045]: h264e_api_v2: MPP_ENC_SET_PREP_CFG w:h [1920:1080] stride [1920:1088]
Jun 23 12:17:01 rock5b mpp[51045]: mpp_enc: mode cbr bps [7290000:7776000:8262000] fps fix [30/1] -> fix [30/1] gop i [30] v [0]
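As a side observation on the dump above: the rate control numbers line up with simple arithmetic on the frame size. The sketch below only checks that relationship against the logged values. It is an observation from this one log, not a statement about how gstreamer or mpp actually compute their defaults.

```python
from fractions import Fraction

# Values copied from the first dump.
width, height, fps = 1920, 1080, 30
bps_target, bps_min, bps_max = 7_776_000, 7_290_000, 8_262_000

# The target happens to equal width * height * fps / 8.
derived_target = width * height * fps // 8

# The CBR band happens to be exactly +/- 1/16 around the target.
upper_ratio = Fraction(bps_max, bps_target)
lower_ratio = Fraction(bps_min, bps_target)

print(derived_target == bps_target)  # True
print(upper_ratio, lower_ratio)      # 17/16 15/16
```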
Higher bitrate: video_kms_1920x1080_30fps_h264_2.mkv.zip
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 prep:format keep 0
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 prep:width update 1280 -> 1920
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 prep:height update 720 -> 1080
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 prep:hor_stride update 1280 -> 1920
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 prep:ver_stride update 720 -> 1088
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 rc:fps_in_flex keep 0
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 rc:fps_in_num keep 30
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 rc:fps_in_denorm keep 1
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 rc:fps_out_flex keep 0
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 rc:fps_out_num keep 30
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 rc:fps_out_denorm keep 1
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 h264:qp_init keep 26
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 h264:qp_max update 48 -> 40
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 h264:qp_min update 8 -> 12
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 h264:qp_step update 0 -> 8
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 h264:profile update 66 -> 100
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 h264:level update 31 -> 40
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 h264:trans8x8 update 0 -> 1
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 h264:cabac_en update 0 -> 1
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 h264:cabac_idc keep 0
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 rc:gop update 60 -> 30
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set u32 rc:max_reenc_times keep 1
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 rc:mode keep 0
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 rc:bps_target update 2000000 -> 8000000
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 rc:bps_max update 2500000 -> 9172000
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 rc:bps_min update 1500000 -> 500000
Jun 23 12:35:19 rock5b mpp[51477]: mpp_enc: MPP_ENC_SET_RC_CFG bps 8000000 [500000 : 9172000] fps [30:30] gop 30
Jun 23 12:35:19 rock5b mpp[51477]: h264e_api_v2: MPP_ENC_SET_PREP_CFG w:h [1920:1080] stride [1920:1088]
Jun 23 12:35:19 rock5b mpp[51477]: mpp_enc: mode vbr bps [500000:8000000:9172000] fps fix [30/1] -> fix [30/1] gop i [30] v [0]
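To compare two such dumps without reading them line by line, the mpp_cfg lines can be parsed into dictionaries and diffed. A minimal sketch, assuming the log format shown above ("keep N" or "update OLD -> NEW"); the function names are made up, and only a few lines from each dump are included as sample input.

```python
import re

# Matches e.g. "mpp_cfg: set s32 h264:qp_max update 48 -> 28"
# and          "mpp_cfg: set s32 rc:mode keep 0"
CFG_LINE = re.compile(
    r"mpp_cfg:\s+set\s+\w+\s+(?P<key>\S+)\s+"
    r"(?:keep\s+(?P<kept>-?\d+)|update\s+-?\d+\s+->\s+(?P<new>-?\d+))"
)


def parse_mpp_cfg(log_text):
    """Return {key: final_value} for every mpp_cfg line in the log text."""
    cfg = {}
    for line in log_text.splitlines():
        match = CFG_LINE.search(line)
        if match:
            kept, new = match.group("kept"), match.group("new")
            cfg[match.group("key")] = int(kept if kept is not None else new)
    return cfg


def diff_cfg(first, second):
    """Return {key: (first_value, second_value)} for keys that differ."""
    keys = sorted(set(first) | set(second))
    return {k: (first.get(k), second.get(k)) for k in keys if first.get(k) != second.get(k)}


FIRST_DUMP = """\
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 h264:qp_max update 48 -> 28
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 h264:qp_min update 8 -> 4
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 h264:profile update 66 -> 100
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 rc:mode update 0 -> 1
Jun 23 12:17:01 rock5b mpp[51045]: mpp_cfg: set s32 rc:bps_target update 2000000 -> 7776000
"""

SECOND_DUMP = """\
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 h264:qp_max update 48 -> 40
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 h264:qp_min update 8 -> 12
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 h264:profile update 66 -> 100
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 rc:mode keep 0
Jun 23 12:35:19 rock5b mpp[51477]: mpp_cfg: set s32 rc:bps_target update 2000000 -> 8000000
"""

for key, (a, b) in diff_cfg(parse_mpp_cfg(FIRST_DUMP), parse_mpp_cfg(SECOND_DUMP)).items():
    print(f"{key}: {a} -> {b}")
```

On these sample lines the differences are the QP range, the rate control mode (1 in the first run, logged as cbr; 0 in the second, logged as vbr) and the target bitrate, while the profile is identical.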
To be honest I also don't know. But I agree that for some reason, regardless of the interface you use (even the default mpp_enc_test), the quality is not the best. And I think that's because the QP values are in the full range of 0-51 by default, which means that in dire situations the encoder can lower the quality as much as it needs to. I need to play with those a little bit to find a sweet spot.
I have created a new issue for the coloring, at least the kmsgrab and recording stuff has been fixed, im closing this issue before it becomes another chat window :)
Here is the new issue: https://github.com/hbiyik/FFmpeg/issues/9
test with:
./FFmpeg-encoder/ffmpeg -loglevel debug -device /dev/dri/card0 -framerate 30 -f kmsgrab -i - -vf 'hwdownload,format=bgr0' -c:v hevc_rkmpp_encoder -b:v 2000K screencast_30fps_4.mkv