golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.57k stars 17.61k forks source link

cmd/cgo: cgo is broken ("cgo argument has Go pointer to Go pointer") #14426

Closed ghost closed 8 years ago

ghost commented 8 years ago

The implementation of the Cgo pointer passing rules seems to be broken. This Go/cgo program, using ffmpeg via cgo, panics on line 72 at seemingly random timestamps / loop iterations given a path to an .mp4 video file as a first argument. You may need to run it several times to get a panic - about 50% of the runs are successful.

$ go version
go version go1.6 linux/amd64
bradfitz commented 8 years ago

Because @opennota thought "Unplanned" meant we weren't going to fix it, I'll assume the worst that this is a bad regression and tag it Go 1.6.1. I don't know anything other than @ianlancetaylor is the person to evaluate this. I was simply giving it any Milestone so it showed up as looked at.

dominikh commented 8 years ago

For the same input file, it reliably crashes, although it takes a different amount of loop iterations, and changing the code changes the amount of iterations.

https://ffmpeg.org/doxygen/2.8/structAVPacket.html is the definition of the AVPacket struct. The pointer checker is complaining about the priv field, a void*. The values of priv, for my input file, look something like this:

0x400f6f
0x7ffdc93fd4e0
0x7ffdc93fd4e0
0x7ffdc93fd4e0
0x7a4f00

The last value is considered a valid Go pointer, as it points into the bss section.

I haven't looked into the meaning of the priv field or how ffmpeg comes up with the values. However, I do wonder if it demonstrates a more general problem with the way cgo checks for Go pointers. C can put arbitrary values in those fields (or simply not initialize them), and may not necessarily intend to ever access them as pointers.

dwbuiten commented 8 years ago

FFmpeg developer and Gopher here (who uses CGO unfortunately too much).

Can you say which version of FFmpeg you tested this with? The code in av_read_frame may do something akin to *pkt = *queued_packet, which may have set the priv filed to initialized data. That field has been removed, and has long been deprecated for a reason, and I'm not confident that all the internal packet functions properly initialized that field or zeroed the structure — you may be seeing unitialized data there. priv should be NULL in that example (it was mostly only set for e.g. Video4Linux2 capture device). I would love to try and repo/look into it, and save @ianlancetaylor some time if it's our fault. It should be easy to test.

Sorry if this is noise!

ghost commented 8 years ago

@dwbuiten I tested it with 2.6.3 and (just a moment ago) with 2.8.6. Both are crashing.

dwbuiten commented 8 years ago

It looks as if my assumption was correct: priv is uninitialized.

I ported your example to C, added a touch to priv, and ran it under valgrind:

==42866== Conditional jump or move depends on uninitialised value(s)
==42866==    at 0x400AC3: process_pkt (bug.c:7)
==42866==    by 0x400CFB: main (bug.c:74)
==42866==  Uninitialised value was created by a stack allocation
==42866==    at 0x68D6FA8: read_frame_internal (utils.c:1317)

My guess is that the when the packet is statically allocated in the library, it uses some memory that once was used by Go, perhaps, and by chance ends up with a pointer-like value, causing the CGO analysis to catch it and panic.

I will submit for a fix to go into the older releases of FFmpeg. This is not a Go bug.

ghost commented 8 years ago

I added pkt.priv = nil before the call to process_pkt and the crashes are gone.

However, like @dominikh I also wonder if this enforcing of the pointer passing rules will be a source of random, even if rare, crashes.

stephenwithav commented 8 years ago

@opennota, I'm having a similar problem with cgo and ffmpeg (C.sws_scale), so I agree with your concerns.

bradfitz commented 8 years ago

@dwbuiten, thanks for helping.

ianlancetaylor commented 8 years ago

Thanks for looking into this.

The cgo pointer checks should only going to complain about Go pointers (that is, pointers into Go memory) that appear in Go memory. I have not looked at the code, but it sounds like C++ is taking uninitialized memory and copying it into Go memory, and that some of that uninitialized memory has a pointer type. If that is what is happening, then you have a big problem, and it's a good thing that the cgo pointer check is catching it. The garbage collector is not going to work properly if it can see invalid values with pointer types.

dwbuiten commented 8 years ago

@ianlancetaylor This is correct, since he/she is allocating the AVPacket struct in Go. Nothing bad would actually happen in this particular case inside the lib, since priv is unused, but I agree it's bad in general (again, I'll fix it upstream). A very good point about GC; I didn't think of that!

ianlancetaylor commented 8 years ago

OK, closing as not a Go problem.