Closed ghost closed 8 years ago
Because @opennota thought "Unplanned" meant we weren't going to fix it, I'll assume the worst that this is a bad regression and tag it Go 1.6.1. I don't know anything other than @ianlancetaylor is the person to evaluate this. I was simply giving it any Milestone so it showed up as looked at.
For the same input file, it reliably crashes, although it takes a different amount of loop iterations, and changing the code changes the amount of iterations.
https://ffmpeg.org/doxygen/2.8/structAVPacket.html is the definition of the AVPacket struct. The pointer checker is complaining about the priv
field, a void*
. The values of priv
, for my input file, look something like this:
0x400f6f
0x7ffdc93fd4e0
0x7ffdc93fd4e0
0x7ffdc93fd4e0
0x7a4f00
The last value is considered a valid Go pointer, as it points into the bss section.
I haven't looked into the meaning of the priv field or how ffmpeg comes up with the values. However, I do wonder if it demonstrates a more general problem with the way cgo checks for Go pointers. C can put arbitrary values in those fields (or simply not initialize them), and may not necessarily intend to ever access them as pointers.
FFmpeg developer and Gopher here (who uses CGO unfortunately too much).
Can you say which version of FFmpeg you tested this with? The code in av_read_frame
may do something akin to *pkt = *queued_packet
, which may have set the priv
filed to initialized data. That field has been removed, and has long been deprecated for a reason, and I'm not confident that all the internal packet functions properly initialized that field or zeroed the structure — you may be seeing unitialized data there. priv should be NULL in that example (it was mostly only set for e.g. Video4Linux2 capture device). I would love to try and repo/look into it, and save @ianlancetaylor some time if it's our fault. It should be easy to test.
Sorry if this is noise!
@dwbuiten I tested it with 2.6.3 and (just a moment ago) with 2.8.6. Both are crashing.
It looks as if my assumption was correct: priv
is uninitialized.
I ported your example to C, added a touch to priv
, and ran it under valgrind:
==42866== Conditional jump or move depends on uninitialised value(s)
==42866== at 0x400AC3: process_pkt (bug.c:7)
==42866== by 0x400CFB: main (bug.c:74)
==42866== Uninitialised value was created by a stack allocation
==42866== at 0x68D6FA8: read_frame_internal (utils.c:1317)
My guess is that the when the packet is statically allocated in the library, it uses some memory that once was used by Go, perhaps, and by chance ends up with a pointer-like value, causing the CGO analysis to catch it and panic.
I will submit for a fix to go into the older releases of FFmpeg. This is not a Go bug.
I added pkt.priv = nil
before the call to process_pkt
and the crashes are gone.
However, like @dominikh I also wonder if this enforcing of the pointer passing rules will be a source of random, even if rare, crashes.
@opennota, I'm having a similar problem with cgo and ffmpeg (C.sws_scale
), so I agree with your concerns.
@dwbuiten, thanks for helping.
Thanks for looking into this.
The cgo pointer checks should only going to complain about Go pointers (that is, pointers into Go memory) that appear in Go memory. I have not looked at the code, but it sounds like C++ is taking uninitialized memory and copying it into Go memory, and that some of that uninitialized memory has a pointer type. If that is what is happening, then you have a big problem, and it's a good thing that the cgo pointer check is catching it. The garbage collector is not going to work properly if it can see invalid values with pointer types.
@ianlancetaylor This is correct, since he/she is allocating the AVPacket
struct in Go. Nothing bad would actually happen in this particular case inside the lib, since priv
is unused, but I agree it's bad in general (again, I'll fix it upstream). A very good point about GC; I didn't think of that!
OK, closing as not a Go problem.
The implementation of the Cgo pointer passing rules seems to be broken. This Go/cgo program, using ffmpeg via cgo, panics on line 72 at seemingly random timestamps / loop iterations given a path to an .mp4 video file as a first argument. You may need to run it several times to get a panic - about 50% of the runs are successful.