dankamongmen / notcurses

blingful character graphics/TUI library. definitely not curses.
https://nick-black.com/dankwiki/index.php/Notcurses
Other
3.61k stars 113 forks source link

valgrind warnings in ffmpeg unit tests #1675

Open dankamongmen opened 3 years ago

dankamongmen commented 3 years ago

I ran notcurses-tester in valgrind, and got the following. Not sure if it's a general problem or just in the tests.

==129372== Invalid read of size 16
==129372==    at 0x6B1B48D: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6AC9BF8: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6B0383D: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6B04707: sws_scale (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x4A5265B: ffmpeg_blit (ffmpeg.c:496)
==129372==    by 0x4A9488D: ncvisual_blit (visual.c:20)
==129372==    by 0x4A95CD1: ncvisual_render_cells (visual.c:662)
==129372==    by 0x4A96A88: ncvisual_render (visual.c:890)
==129372==    by 0x231D8D: _DOCTEST_ANON_FUNC_2() (visual.cpp:590)
==129372==    by 0x1AD66D: doctest::Context::run() (doctest.h:6414)
==129372==    by 0x1514A7: main (main.cpp:165)
==129372==  Address 0x104d9cb0 is 12 bytes after a block of size 4 alloc'd
==129372==    at 0x483877F: malloc (vg_replace_malloc.c:307)
==129372==    by 0x4A96B6F: memdup (internal.h:1149)
==129372==    by 0x4A96B6F: ncvisual_from_rgba (visual.c:532)
==129372==    by 0x231CA4: _DOCTEST_ANON_FUNC_2() (visual.cpp:588)
==129372==    by 0x1AD66D: doctest::Context::run() (doctest.h:6414)
==129372==    by 0x1514A7: main (main.cpp:165)
==129372== 
==129372== Invalid read of size 4
==129372==    at 0x6B1B4EE: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6AC9BF8: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6B0383D: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6B04707: sws_scale (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x4A5265B: ffmpeg_blit (ffmpeg.c:496)
==129372==    by 0x4A9488D: ncvisual_blit (visual.c:20)
==129372==    by 0x4A95CD1: ncvisual_render_cells (visual.c:662)
==129372==    by 0x4A96A88: ncvisual_render (visual.c:890)
==129372==    by 0x231D8D: _DOCTEST_ANON_FUNC_2() (visual.cpp:590)
==129372==    by 0x1AD66D: doctest::Context::run() (doctest.h:6414)
==129372==    by 0x1514A7: main (main.cpp:165)
==129372==  Address 0x104d9cc0 is 0 bytes after a block of size 32 in arena "client"
==129372== 
==129372== Invalid read of size 16
==129372==    at 0x6B1B5E3: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6AC9E2C: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6B037CD: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6B04707: sws_scale (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x4A5265B: ffmpeg_blit (ffmpeg.c:496)
==129372==    by 0x4A9488D: ncvisual_blit (visual.c:20)
==129372==    by 0x4A95CD1: ncvisual_render_cells (visual.c:662)
==129372==    by 0x4A96A88: ncvisual_render (visual.c:890)
==129372==    by 0x231D8D: _DOCTEST_ANON_FUNC_2() (visual.cpp:590)
==129372==    by 0x1AD66D: doctest::Context::run() (doctest.h:6414)
==129372==    by 0x1514A7: main (main.cpp:165)
==129372==  Address 0x104d9cb0 is 12 bytes after a block of size 4 alloc'd
==129372==    at 0x483877F: malloc (vg_replace_malloc.c:307)
==129372==    by 0x4A96B6F: memdup (internal.h:1149)
==129372==    by 0x4A96B6F: ncvisual_from_rgba (visual.c:532)
==129372==    by 0x231CA4: _DOCTEST_ANON_FUNC_2() (visual.cpp:588)
==129372==    by 0x1AD66D: doctest::Context::run() (doctest.h:6414)
==129372==    by 0x1514A7: main (main.cpp:165)
==129372== 
==129372== Invalid read of size 4
==129372==    at 0x6B1B688: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6AC9E2C: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6B037CD: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6B04707: sws_scale (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x4A5265B: ffmpeg_blit (ffmpeg.c:496)
==129372==    by 0x4A9488D: ncvisual_blit (visual.c:20)
==129372==    by 0x4A95CD1: ncvisual_render_cells (visual.c:662)
==129372==    by 0x4A96A88: ncvisual_render (visual.c:890)
==129372==    by 0x231D8D: _DOCTEST_ANON_FUNC_2() (visual.cpp:590)
==129372==    by 0x1AD66D: doctest::Context::run() (doctest.h:6414)
==129372==    by 0x1514A7: main (main.cpp:165)
==129372==  Address 0x104d9cc0 is 0 bytes after a block of size 32 in arena "client"
==129372== 
==129372== Conditional jump or move depends on uninitialised value(s)
==129372==    at 0x6ADE5EB: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6B19C8C: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6B03580: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6B04707: sws_scale (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x4A5265B: ffmpeg_blit (ffmpeg.c:496)
==129372==    by 0x4A9488D: ncvisual_blit (visual.c:20)
==129372==    by 0x4A95CD1: ncvisual_render_cells (visual.c:662)
==129372==    by 0x4A96A88: ncvisual_render (visual.c:890)
==129372==    by 0x231D8D: _DOCTEST_ANON_FUNC_2() (visual.cpp:590)
==129372==    by 0x1AD66D: doctest::Context::run() (doctest.h:6414)
==129372==    by 0x1514A7: main (main.cpp:165)
==129372== 
==129372== Conditional jump or move depends on uninitialised value(s)
==129372==    at 0x6ADE64C: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6B19C8C: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6B03580: ??? (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x6B04707: sws_scale (in /usr/lib/x86_64-linux-gnu/libswscale.so.5.7.100)
==129372==    by 0x4A5265B: ffmpeg_blit (ffmpeg.c:496)
==129372==    by 0x4A9488D: ncvisual_blit (visual.c:20)
==129372==    by 0x4A95CD1: ncvisual_render_cells (visual.c:662)
==129372==    by 0x4A96A88: ncvisual_render (visual.c:890)
==129372==    by 0x231D8D: _DOCTEST_ANON_FUNC_2() (visual.cpp:590)
==129372==    by 0x1AD66D: doctest::Context::run() (doctest.h:6414)
==129372==    by 0x1514A7: main (main.cpp:165)
==129372== 
==129372== Conditional jump or move depends on uninitialised value(s)
==129372==    at 0x4A65603: rgba_trans_p (internal.h:1636)
==129372==    by 0x4A65603: rgba_trans_q (blit.c:36)
==129372==    by 0x4A65603: tria_blit_ascii (blit.c:74)
==129372==    by 0x4A52684: rgba_blit_dispatch (internal.h:1683)
==129372==    by 0x4A52684: ffmpeg_blit (ffmpeg.c:511)
==129372==    by 0x4A9488D: ncvisual_blit (visual.c:20)
==129372==    by 0x4A95CD1: ncvisual_render_cells (visual.c:662)
==129372==    by 0x4A96A88: ncvisual_render (visual.c:890)
==129372==    by 0x231D8D: _DOCTEST_ANON_FUNC_2() (visual.cpp:590)
==129372==    by 0x1AD66D: doctest::Context::run() (doctest.h:6414)
==129372==    by 0x1514A7: main (main.cpp:165)
==129372== 
dankamongmen commented 3 years ago

nope, this looks like a straight-up error in ncvisual_from_{rgba, bgra}(). the former is used by the ncvisual_from_rgb_{packed, loose}() functions we added last week for @kaniini ; did you run into any problems using them?

kaniini commented 3 years ago

I haven't tried yet because all this freenode shit went down.

dankamongmen commented 3 years ago

ok, i've limited this to the case where we fill from ncvisual_from_rgba() or something similar (i.e. not ncvisual_from_file()), and then implicitly scale for the blit. if we load from a file and scale, this doesn't happen. if we load from memory and don't scale, this doesn't happen. that suggests to me that we're doing some setup in ncvisual_from_file() that we're not doing otherwise.

it's probably best at this point to go digging around in libav source.

dankamongmen commented 3 years ago

also, it appears to depend on how we're doing the implicit scaling. if i scale to 4x4 from 2x2 or 1x1, it works fine. if i scale to 5x5 from either, we get warnings. interesting. 1x1 to 3x3 also works, as does 1x1 to 6x6. 1x1 to 5x5 breaks.

dankamongmen commented 3 years ago

with debug symbols present, we get some more information:

ncplane_new_internal:452:Created new 70x107 plane "std" @ 0x0
ncplane_new_internal:452:Created new 5x5 plane "parent" @ 0x0
LOADED NCVISUAL 1/1
ncplane_new_internal:452:Created new 5x5 plane "cvis" @ 0x0
got format: 26 (1/1) want format: 26
WHN NCV: 1/1 bargslen: 1/1
src 1/1 -> targ 5/5 ctx: (nil)
[swscaler @ 0xeb8c800] Forcing full internal H chroma due to odd output size
[swscaler @ 0xeb8c800] Forcing full internal H chroma due to input having non subsampled chroma
INFRAME DAA: 0xeb8bed0 SDATA: 0xeba47c0 FDATA: 0xeb8bed0 to 5/5
[swscaler @ 0xeb8c800] Warning: data is not aligned! This can lead to a speed loss
==159224== Invalid read of size 16
==159224==    at 0x6B1C48D: ??? (src//libavutil/x86/x86util.asm:1736)
==159224==    by 0x6ACABF8: lum_convert (hscale.c:108)
==159224==    by 0x6B0483D: swscale (swscale.c:457)
==159224==    by 0x6B05707: sws_scale (swscale.c:969)
==159224==    by 0x4A53710: ffmpeg_blit (ffmpeg.c:497)
==159224==    by 0x4A957ED: ncvisual_blit (visual.c:20)
==159224==    by 0x4A96C31: ncvisual_render_cells (visual.c:666)
==159224==    by 0x4A979E8: ncvisual_render (visual.c:894)
==159224==    by 0x231DAF: _DOCTEST_ANON_FUNC_2() (visual.cpp:591)
==159224==    by 0x1AD66D: doctest::Context::run() (doctest.h:6414)
==159224==    by 0x1514A7: main (main.cpp:165)
==159224==  Address 0xeb8bee0 is 12 bytes after a block of size 4 alloc'd
==159224==    at 0x483877F: malloc (vg_replace_malloc.c:307)
==159224==    by 0x4A97ADA: ncvisual_from_rgba (visual.c:533)
==159224==    by 0x231CA4: _DOCTEST_ANON_FUNC_2() (visual.cpp:588)
==159224==    by 0x1AD66D: doctest::Context::run() (doctest.h:6414)
==159224==    by 0x1514A7: main (main.cpp:165)
==159224== 
==159224== Invalid read of size 4
==159224==    at 0x6B1C4EE: ??? (src//libavutil/x86/x86util.asm:1317)
==159224==    by 0x6ACABF8: lum_convert (hscale.c:108)
==159224==    by 0x6B0483D: swscale (swscale.c:457)
==159224==    by 0x6B05707: sws_scale (swscale.c:969)
==159224==    by 0x4A53710: ffmpeg_blit (ffmpeg.c:497)
==159224==    by 0x4A957ED: ncvisual_blit (visual.c:20)
==159224==    by 0x4A96C31: ncvisual_render_cells (visual.c:666)
==159224==    by 0x4A979E8: ncvisual_render (visual.c:894)
==159224==    by 0x231DAF: _DOCTEST_ANON_FUNC_2() (visual.cpp:591)
==159224==    by 0x1AD66D: doctest::Context::run() (doctest.h:6414)
==159224==    by 0x1514A7: main (main.cpp:165)
==159224==  Address 0xeb8bef0 is 32 bytes inside a block of size 48 in arena "client"
==159224== 
==159224== Invalid read of size 16
==159224==    at 0x6B1C5E3: ??? (src//libavutil/x86/x86util.asm:1736)
==159224==    by 0x6ACAE2C: chr_convert (hscale.c:227)
==159224==    by 0x6B047CD: swscale (swscale.c:464)
==159224==    by 0x6B05707: sws_scale (swscale.c:969)
==159224==    by 0x4A53710: ffmpeg_blit (ffmpeg.c:497)
==159224==    by 0x4A957ED: ncvisual_blit (visual.c:20)
==159224==    by 0x4A96C31: ncvisual_render_cells (visual.c:666)
==159224==    by 0x4A979E8: ncvisual_render (visual.c:894)
==159224==    by 0x231DAF: _DOCTEST_ANON_FUNC_2() (visual.cpp:591)
==159224==    by 0x1AD66D: doctest::Context::run() (doctest.h:6414)
==159224==    by 0x1514A7: main (main.cpp:165)
==159224==  Address 0xeb8bee0 is 12 bytes after a block of size 4 alloc'd
==159224==    at 0x483877F: malloc (vg_replace_malloc.c:307)
==159224==    by 0x4A97ADA: ncvisual_from_rgba (visual.c:533)
==159224==    by 0x231CA4: _DOCTEST_ANON_FUNC_2() (visual.cpp:588)
==159224==    by 0x1AD66D: doctest::Context::run() (doctest.h:6414)
==159224==    by 0x1514A7: main (main.cpp:165)
==159224== 
dankamongmen commented 3 years ago

this reeks of failure to pad the input properly

dankamongmen commented 3 years ago

making the input source lines a multiple of 32 bytes eliminates the "data is unaligned!" warning, but we still hit the same valgrind issues.

turned up ncv->rowstride 4 -> 32
LOADED NCVISUAL 1/1
ncplane_new_internal:452:Created new 5x5 plane "cvis" @ 0x0
got format: 26 (1/1) want format: 26
WHN NCV: 1/1 bargslen: 1/1
src 1/1 -> targ 5/5 ctx: (nil)
[swscaler @ 0xeb8c800] Forcing full internal H chroma due to odd output size
[swscaler @ 0xeb8c800] Forcing full internal H chroma due to input having non subsampled chroma
INFRAME DAA: 0xeb8bed0 SDATA: 0xeba47c0 FDATA: 0xeb8bed0 to 5/5
==947881== Invalid read of size 4
==947881==    at 0x6B1C4EE: ??? (src//libavutil/x86/x86util.asm:1317)
==947881==    by 0x6ACABF8: lum_convert (hscale.c:108)
==947881==    by 0x6B0483D: swscale (swscale.c:457)
==947881==    by 0x6B05707: sws_scale (swscale.c:969)
==947881==    by 0x4A53710: ffmpeg_blit (ffmpeg.c:497)
==947881==    by 0x4A957ED: ncvisual_blit (visual.c:20)
==947881==    by 0x4A96C31: ncvisual_render_cells (visual.c:670)
==947881==    by 0x4A979E8: ncvisual_render (visual.c:898)
==947881==    by 0x231DAF: _DOCTEST_ANON_FUNC_2() (visual.cpp:591)
==947881==    by 0x1AD66D: doctest::Context::run() (doctest.h:6414)
==947881==    by 0x1514A7: main (main.cpp:165)
==947881==  Address 0xeb8bef0 is 0 bytes after a block of size 32 alloc'd
==947881==    at 0x483877F: malloc (vg_replace_malloc.c:307)
==947881==    by 0x4A97ABD: ncvisual_from_rgba (visual.c:537)
==947881==    by 0x231CA4: _DOCTEST_ANON_FUNC_2() (visual.cpp:588)
==947881==    by 0x1AD66D: doctest::Context::run() (doctest.h:6414)
==947881==    by 0x1514A7: main (main.cpp:165)
dankamongmen commented 3 years ago

yeah. confirmed that ncvisual_from_file() works here just fine:

ncplane_new_internal:452:Created new 70x107 plane "std" @ 0x0
ncplane_new_internal:452:Created new 5x5 plane "parent" @ 0x0
[NULL @ 0xeb8c4c0] Opening '../data/onedot.png' for reading
[file @ 0xeb8cf00] Setting default whitelist 'file,crypto,data'
Probing image2 score:50 size:546
Probing png_pipe score:99 size:546
[png_pipe @ 0xeb8c4c0] Format png_pipe probed with size=2048 and score=99
[png_pipe @ 0xeb8c4c0] Before avformat_find_stream_info() pos: 0 bytes read:546 seeks:0 nb_streams:1
[png_pipe @ 0xeb8c4c0] stream 0: start_time: NOPTS duration: NOPTS
[png_pipe @ 0xeb8c4c0] format: start_time: NOPTS duration: NOPTS (estimate from bit rate) bitrate=0 kb/s
[png_pipe @ 0xeb8c4c0] After avformat_find_stream_info() pos: 546 bytes read:546 seeks:0 frames:1
LOADED NCVISUAL 1/1
ncplane_new_internal:452:Created new 5x5 plane "cvis" @ 0x0
got format: 2 (1/1) want format: 26
WHN NCV: 1/1 bargslen: 1/1
src 1/1 -> targ 5/5 ctx: (nil)
[swscaler @ 0xebafa80] Forcing full internal H chroma due to odd output size
[swscaler @ 0xebafa80] Forcing full internal H chroma due to input having non subsampled chroma
INFRAME DAA: 0xebae5c0 SDATA: 0xebc7a00 FDATA: 0xebae5c0 to 5/5
RENDERED NCVISUAL 1/1->5/5
[AVIOContext @ 0xeb957c0] Statistics: 546 bytes read, 0 seeks
ncplane_destroy:736:Destroying 5x5 plane "parent" @ 0x0
ncplane_destroy:736:Destroying 5x5 plane "cvis" @ 0x0

schwarzgerat $ file ../data/onedot.png ../data/onedot.png: PNG image data, 1 x 1, 8-bit/color RGB, non-interlaced schwarzgerat $

dankamongmen commented 3 years ago

one interesting difference here is that the ncvisual prepared from the file has format 2 (AV_PIX_FMT_RGB24), whereas the one prepared from memory has format 26 (the desired target format, AV_PIX_FMT_RGBA).

i thought we were forcing a conversion to AV_PIX_FMT_RGBA immediately upon load, but apparently it's not until render time. that seems...not good, since the entire ncpixel_*() API is based around an assumption of RGBA (all of our blitters are written in terms of RGBA input). this is why we convert e.g. ncvisual_from_rgb_{loose,packed}() input and ncvisual_from_bgra(). so doesn't this mean any pixel work on a visual loaded from disk not using RGBA is going to break the ncpixel API (unless it's resized manually first, which forces an RGBA conversion)?

sounds like we need a unit test for this. make a PNG without an alpha channel, which ought be AV_PIX_FMT_RGB24. then fuck around with it hard using the pixel API.

note that we could change the pixel API to be multiformat aware, by inserting necessary single-pixel conversion and deconversion in ncvisual_at_yx() and ncvisual_set_yx().

if we do an RGBA conversion immediately upon decode. this shouldn't cost us anything, since we would otherwise do one implicitly during blitting (though this would be interwoven with the implicit scaling) or implicitly during resize (same deal). so if we know the desired output size, that would be useful information at the time of this conversion...but yeah, right now the ncpixel api is broken for any image that isn't RGBA. that's gonna need its own bug.

dankamongmen commented 3 years ago

here's how it emerges from ffmpeg_decode():

[NULL @ 0xebde440] Opening '../data/onedot.png' for reading
[file @ 0xebdee80] Setting default whitelist 'file,crypto,data'
Probing image2 score:50 size:546
Probing png_pipe score:99 size:546
[png_pipe @ 0xebde440] Format png_pipe probed with size=2048 and score=99
[png_pipe @ 0xebde440] Before avformat_find_stream_info() pos: 0 bytes read:546 seeks:0 nb_streams:1
[png_pipe @ 0xebde440] stream 0: start_time: NOPTS duration: NOPTS
[png_pipe @ 0xebde440] format: start_time: NOPTS duration: NOPTS (estimate from bit rate) bitrate=0 kb/s
[png_pipe @ 0xebde440] After avformat_find_stream_info() pos: 546 bytes read:546 seeks:0 frames:1
Frame 00001 (0? 0?) 1x1 pfmt 2 (rgb24             3         24)
 Data (8): 0xec00540 (nil) (nil) (nil) (nil) (nil) (nil) (nil)
 Linesizes: 192 0 0 0 0 0 0 0
 Aspect ratio 11811:11811 PTS 0 Flags: 0x0000
 1ms@0ms (keyframe) qual: 0
ncplane_new_internal:452:Created new 5x5 plane "cvis" @ 0x0
got format: 2 (1/1) want format: 26
WHN NCV: 1/1 bargslen: 1/1
src 1/1 -> targ 5/5 ctx: (nil)
[swscaler @ 0xec01a00] Forcing full internal H chroma due to odd output size
[swscaler @ 0xec01a00] Forcing full internal H chroma due to input having non subsampled chroma
INFRAME DAA: 0xec00540 SDATA: 0xec19980 FDATA: 0xec00540 to 5/5
[AVIOContext @ 0xebe7740] Statistics: 546 bytes read, 0 seeks
ncplane_destroy:736:Destroying 5x5 plane "parent" @ 0x0
ncplane_destroy:736:Destroying 5x5 plane "cvis" @ 0x0
dankamongmen commented 3 years ago

so that's a 192B line for a single 3B pixel. 192 / 3 == 64; 192 / 4 == 48; 192 / 16 == 12; 192 / 32 == 6; 192 / 64 == 3

for our memory-loaded one, we've got

turned up ncv->rowstride 4 -> 32
LOADED NCVISUAL 1/1
ncplane_new_internal:452:Created new 5x5 plane "cvis" @ 0x0
Frame 00000 0xeb8c000 (0? 0?) 1x1 pfmt 26 (rgba              4         32)
 Data (8): 0xeb8c290 (nil) (nil) (nil) (nil) (nil) (nil) (nil)
 Linesizes: 32 0 0 0 0 0 0 0
 Aspect ratio unknown PTS -9223372036854775808 Flags: 0x0000
 0ms@9223372036854775808ms (keyframe) qual: 0
got format: 26 (1/1) want format: 26
WHN NCV: 1/1 bargslen: 1/1
src 1/1 -> targ 5/5 ctx: (nil)
[swscaler @ 0xeb8c900] Forcing full internal H chroma due to odd output size
[swscaler @ 0xeb8c900] Forcing full internal H chroma due to input having non subsampled chroma
INFRAME DAA: 0xeb8c290 SDATA: 0xeba48c0 FDATA: 0xeb8c290 to 5/5
dankamongmen commented 3 years ago

let's try taking the linesize alignment up to 192, fuck me in the ass

dankamongmen commented 3 years ago

hah yep that got rid of at least the first ones, motherfuck. now seeing some conditioinal jumps on uninitialized values, but they're much less angry:

turned up ncv->rowstride 4 -> 192
LOADED NCVISUAL 1/1
ncplane_new_internal:452:Created new 5x5 plane "cvis" @ 0x0
Frame 00000 0xeb8c5c0 (0? 0?) 1x1 pfmt 26 (rgba              4         32)
 Data (8): 0xeb8c860 (nil) (nil) (nil) (nil) (nil) (nil) (nil)
 Linesizes: 192 0 0 0 0 0 0 0
 Aspect ratio unknown PTS -9223372036854775808 Flags: 0x0000
 0ms@9223372036854775808ms (keyframe) qual: 0
got format: 26 (1/1) want format: 26
WHN NCV: 1/1 bargslen: 1/1
src 1/1 -> targ 5/5 ctx: (nil)
[swscaler @ 0xeb8cfc0] Forcing full internal H chroma due to odd output size
[swscaler @ 0xeb8cfc0] Forcing full internal H chroma due to input having non subsampled chroma
INFRAME DAA: 0xeb8c860 SDATA: 0xeba4f80 FDATA: 0xeb8c860 to 5/5
==1479050== Conditional jump or move depends on uninitialised value(s)
==1479050==    at 0x6ADF5EB: yuv2rgb_full_1_c_template (output.c:2102)
==1479050==    by 0x6ADF5EB: yuv2rgba32_full_1_c (output.c:2142)
==1479050==    by 0x6B1AC8C: packed_vscale (vscale.c:141)
==1479050==    by 0x6B04580: swscale (swscale.c:491)
==1479050==    by 0x6B05707: sws_scale (swscale.c:969)
==1479050==    by 0x4A5398A: ffmpeg_blit (ffmpeg.c:497)
==1479050==    by 0x4A957ED: ncvisual_blit (visual.c:20)
==1479050==    by 0x4A96C31: ncvisual_render_cells (visual.c:672)
==1479050==    by 0x4A979E8: ncvisual_render (visual.c:900)
==1479050==    by 0x231FB2: _DOCTEST_ANON_FUNC_2() (visual.cpp:589)
==1479050==    by 0x1AD88D: doctest::Context::run() (doctest.h:6414)
==1479050==    by 0x1516C7: main (main.cpp:165)
==1479050== 
==1479050== Conditional jump or move depends on uninitialised value(s)
==1479050==    at 0x6ADF64C: yuv2rgb_write_full (output.c:1856)
==1479050==    by 0x6ADF64C: yuv2rgb_full_1_c_template (output.c:2106)
==1479050==    by 0x6ADF64C: yuv2rgba32_full_1_c (output.c:2142)
==1479050==    by 0x6B1AC8C: packed_vscale (vscale.c:141)
==1479050==    by 0x6B04580: swscale (swscale.c:491)
==1479050==    by 0x6B05707: sws_scale (swscale.c:969)
==1479050==    by 0x4A5398A: ffmpeg_blit (ffmpeg.c:497)
==1479050==    by 0x4A957ED: ncvisual_blit (visual.c:20)
==1479050==    by 0x4A96C31: ncvisual_render_cells (visual.c:672)
==1479050==    by 0x4A979E8: ncvisual_render (visual.c:900)
==1479050==    by 0x231FB2: _DOCTEST_ANON_FUNC_2() (visual.cpp:589)
==1479050==    by 0x1AD88D: doctest::Context::run() (doctest.h:6414)
==1479050==    by 0x1516C7: main (main.cpp:165)
==1479050== 
==1479050== Conditional jump or move depends on uninitialised value(s)
==1479050==    at 0x4A66603: rgba_trans_p (internal.h:1636)
==1479050==    by 0x4A66603: rgba_trans_q (blit.c:38)
==1479050==    by 0x4A66603: tria_blit_ascii (blit.c:76)
==1479050==    by 0x4A539B4: rgba_blit_dispatch (internal.h:1683)
==1479050==    by 0x4A539B4: ffmpeg_blit (ffmpeg.c:512)
==1479050==    by 0x4A957ED: ncvisual_blit (visual.c:20)
==1479050==    by 0x4A96C31: ncvisual_render_cells (visual.c:672)
==1479050==    by 0x4A979E8: ncvisual_render (visual.c:900)
==1479050==    by 0x231FB2: _DOCTEST_ANON_FUNC_2() (visual.cpp:589)
==1479050==    by 0x1AD88D: doctest::Context::run() (doctest.h:6414)
==1479050==    by 0x1516C7: main (main.cpp:165)
==1479050== 
RENDERED NCVISUAL 1/1->5/5
dankamongmen commented 3 years ago

we get the same results from 64. i suspect that it was using 192 above due to the 3B pixel size, and 192 being the LCM of 64 and 3.

dankamongmen commented 3 years ago

we're going to need to reproduce this in ncvisual_from_bgra(), lamely

dankamongmen commented 3 years ago

why is this yuv shit involved at all when we're not even changing format? we're already rgba

dankamongmen commented 3 years ago

alright, the invalid reads have been addressed. now what's up with this conditionals on invalid data?

dankamongmen commented 3 years ago

if we're on line 141 of packed_vscale(), it's this:

  } else if (c->yuv2packed1 && lum_fsize == 1 && chr_fsize == 2 &&                                       
               chr_filter[2 * chrSliceY + 1] + chr_filter[2 * chrSliceY] == 4096 &&                        
               chr_filter[2 * chrSliceY + 1] <= 4096U) { // unscaled RGB                                   
        int chrAlpha = chr_filter[2 * chrSliceY + 1];                                                      
        inst->pfn.yuv2packed1(c, (const int16_t*)*src0, (const int16_t**)src1, (const int16_t**)src2,      
                                    (const int16_t*)(desc->alpha ? *src3 : NULL),  *dst, dstW, chrAlpha, sl

yuv2packed1 seems a weird thing to have in our SwsContext*...

dankamongmen commented 3 years ago

ahhh, know the warning about "full chroma h"?

[swscaler @ 0xebafa80] Forcing full internal H chroma due to odd output size
[swscaler @ 0xebafa80] Forcing full internal H chroma due to input having non subsampled chroma

well, down in ff_sws_init_output_funcs(), we have:

    if(c->flags & SWS_FULL_CHR_H_INT) {                                                                    
        switch (dstFormat) {                                                                               
            case AV_PIX_FMT_RGBA:                                                                          
#if CONFIG_SMALL                                                                                           
                *yuv2packedX = yuv2rgba32_full_X_c;                                                        
                *yuv2packed2 = yuv2rgba32_full_2_c;                                                        
                *yuv2packed1 = yuv2rgba32_full_1_c;                                

so no matter our source format(!) we're going to go through this. meanwhile, down in line 2102 we hit yuv2rgb_full_1_c_template(), which very much wants a y, and arrays of u, v, and a. we hit the bad use here:

            if (hasAlpha) {                                                                                
                A = (abuf0[i] + 64) >> 7;          <-------------------------------
                if (A & 0x100)                                                                             
                    A = av_clip_uint8(A);                                                                  
            }                                                                                              
dankamongmen commented 3 years ago
    if (isAnyRGB(dstFormat) && !(flags&SWS_FULL_CHR_H_INT)) {                                                                               
        if (dstW&1) {                                                                                                                       
            av_log(c, AV_LOG_DEBUG, "Forcing full internal H chroma due to odd output size\n");                                             
            flags |= SWS_FULL_CHR_H_INT;                                                                                                    
            c->flags = flags;                                                                                                               
        }                                                                                                                                   

        if (   c->chrSrcHSubSample == 0                                                                                                     
            && c->chrSrcVSubSample == 0                                                                                                     
            && c->dither != SWS_DITHER_BAYER //SWS_FULL_CHR_H_INT is currently not supported with SWS_DITHER_BAYER                          
            && !(c->flags & SWS_FAST_BILINEAR)                                                                                              
        ) {                                                                                                                                 
            av_log(c, AV_LOG_DEBUG, "Forcing full internal H chroma due to input having non subsampled chroma\n");                          
            flags |= SWS_FULL_CHR_H_INT;                                                                                                    
            c->flags = flags;                                                                                                               
        }                                                                                                                                   
    }                                                                                                                                       

so we're getting hit with this twice -- once because our output has an odd width, and once because we're lacking subsampled chroma, whatever the hell that is, i'm not a rich kid.

dankamongmen commented 3 years ago

just lost an hour to preparing patches for libav, which it turns out is abandoned, goddamnit curses you ten thousand times libav

dankamongmen commented 3 years ago

ok anyway YUV2RGBWRAPPER(yuv2, rgb_full, rgba32_full, AV_PIX_FMT_RGBA, 1). that looks to me like it wants 32 pixels at once (128B)? or maybe that it's 32bits per pixel. hrm.

so i'm almost thinking that what we might have to do is do a conversion to yuv (remember, that's the format according to the ncvisual we're loading from a file), because sws_scale() going to rgba from our config assumes yuv. maybe that's why the file-loaded ncvisual is being put in yuv format -- i don't think the PNG is encoding anything that way. hrmmmmmmmmmmm.

if that's the case, i'd think a larger handmade RGBA scaled out through this process would show some distortion, right? since the conversion is expecting YUV (we hypothesize)? so we ought be able to check for that, to confirm this conjecture.

dankamongmen commented 3 years ago

i mean all we have there now in terms of pixel content is 0xffffffff, and i would be entirely unsurprised if that's just a white pixel under both systems.

dankamongmen commented 3 years ago

hrmmm, indeed, this is not what i would expect us to blow up into

2021-05-22-201119_1072x1417_scrot

dankamongmen commented 3 years ago

though....this doesn't seem to be generating any valgrind warnigns? let me shrink it back down and see what happens...very mysterious...

dankamongmen commented 3 years ago

yeah, when we go from a single white pixel, we get the expected 20x20 white block, but we also get a bunch of valgrind warnings. the ffmpeg warnings have also changed

ncplane_new_internal:452:Created new 20x20 plane "cvis" @ 0x0
[swscaler @ 0xeb8f840] Forcing full internal H chroma due to input having non subsampled chroma
[swscaler @ 0xeb8f840] Optimized 2 tap filter code cannot be used
dankamongmen commented 3 years ago

ok, very interesting...i now have 4x4 going to 4x4 and 16x16, with no warnings. the difference was setting all the alphas to 0xff. very interesting!

dankamongmen commented 3 years ago

going to 1x1 still draws problems, though the output looks just fine....=[

dankamongmen commented 3 years ago

i mean, could this possibly be some SIMD operation being performed on uninitialized data, but the output isn't actually used? hrmmm

dankamongmen commented 3 years ago

1x16 gets no warnings...

dankamongmen commented 3 years ago

1x1 with stride of 64 does, though :(. this happens even if i memset the padding down to 0 in ncvisual_from_rgba().

dankamongmen commented 3 years ago

2021-05-22-204809_1072x1417_scrot

output looks great...

dankamongmen commented 3 years ago

ehhhhh...the hard warnings are gone. i think i'm closing this up for now.

dankamongmen commented 3 years ago

oh, might it be that we need to load with av_image_fill_pointers() or something like that? hrmm.

dankamongmen commented 3 years ago

if this was an simd thing, i think we'd see it with the 1x1 image loaded from disc, which we do not. let's leave this open until the warnings are resolved -- all of them.

dankamongmen commented 3 years ago

so, if we convert to RGBA after decoding a frame (from disk), we do generate all these warnings. so yeah, it's either

1) incredibly sloppy work from ffmpeg, but harmless, or 2) a bug in ffmpeg when scaling RGBA

dankamongmen commented 3 years ago

we're down to a single warning test, in Visual::Stretch:

==1618607== Conditional jump or move depends on uninitialised value(s)
==1618607==    at 0x6AE55EB: yuv2rgb_full_1_c_template (output.c:2102)
==1618607==    by 0x6AE55EB: yuv2rgba32_full_1_c (output.c:2142)
==1618607==    by 0x6B20C8C: packed_vscale (vscale.c:141)
==1618607==    by 0x6B0A580: swscale (swscale.c:491)
==1618607==    by 0x6B0B707: sws_scale (swscale.c:969)
==1618607==    by 0x4A5360E: ffmpeg_resize_internal (ffmpeg.c:498)
==1618607==    by 0x4A536D5: ffmpeg_blit (ffmpeg.c:539)
==1618607==    by 0x4A99D8B: ncvisual_blit (visual.c:77)
==1618607==    by 0x4A9B905: ncvisual_render_cells (visual.c:813)
==1618607==    by 0x4A9C6E8: ncvisual_render (visual.c:1043)
==1618607==    by 0x23D4C8: _DOCTEST_ANON_FUNC_2() (visual.cpp:161)
==1618607==    by 0x1A9A2D: doctest::Context::run() (doctest.h:6414)
==1618607==    by 0x154B17: main (main.cpp:166)
==1618607== 
==1618607== Conditional jump or move depends on uninitialised value(s)
==1618607==    at 0x6AE564C: yuv2rgb_write_full (output.c:1856)
==1618607==    by 0x6AE564C: yuv2rgb_full_1_c_template (output.c:2106)
==1618607==    by 0x6AE564C: yuv2rgba32_full_1_c (output.c:2142)
==1618607==    by 0x6B20C8C: packed_vscale (vscale.c:141)
==1618607==    by 0x6B0A580: swscale (swscale.c:491)
==1618607==    by 0x6B0B707: sws_scale (swscale.c:969)
==1618607==    by 0x4A5360E: ffmpeg_resize_internal (ffmpeg.c:498)
==1618607==    by 0x4A536D5: ffmpeg_blit (ffmpeg.c:539)
==1618607==    by 0x4A99D8B: ncvisual_blit (visual.c:77)
==1618607==    by 0x4A9B905: ncvisual_render_cells (visual.c:813)
==1618607==    by 0x4A9C6E8: ncvisual_render (visual.c:1043)
==1618607==    by 0x23D4C8: _DOCTEST_ANON_FUNC_2() (visual.cpp:161)
==1618607==    by 0x1A9A2D: doctest::Context::run() (doctest.h:6414)
==1618607==    by 0x154B17: main (main.cpp:166)
==1618607== 
==1618607== Conditional jump or move depends on uninitialised value(s)
==1618607==    at 0x4A6773E: rgba_trans_p (internal.h:1621)
==1618607==    by 0x4A6773E: rgba_trans_q (blit.c:60)
==1618607==    by 0x4A6773E: tria_blit_ascii (blit.c:99)
==1618607==    by 0x4A536FC: rgba_blit_dispatch (internal.h:1678)
==1618607==    by 0x4A536FC: ffmpeg_blit (ffmpeg.c:545)
==1618607==    by 0x4A99D8B: ncvisual_blit (visual.c:77)
==1618607==    by 0x4A9B905: ncvisual_render_cells (visual.c:813)
==1618607==    by 0x4A9C6E8: ncvisual_render (visual.c:1043)
==1618607==    by 0x23D4C8: _DOCTEST_ANON_FUNC_2() (visual.cpp:161)
==1618607==    by 0x1A9A2D: doctest::Context::run() (doctest.h:6414)
==1618607==    by 0x154B17: main (main.cpp:166)
==1618607== 
==1618607== Conditional jump or move depends on uninitialised value(s)
==1618607==    at 0x239D82: doctest::detail::Result doctest::detail::Expression_lhs<unsigned int const&>::operator==<int>(int const&) [clone .isra.0] (doctest.h:1197)
==1618607==    by 0x23D95C: _DOCTEST_ANON_FUNC_2() (visual.cpp:173)
==1618607==    by 0x1A9A2D: doctest::Context::run() (doctest.h:6414)
==1618607==    by 0x154B17: main (main.cpp:166)