Open wpregliasco opened 3 years ago
Hi there @wpregliasco I'm glad to know the material is useful for you =)
1) Who is sorting the packets/frames? 2) To render a frame, we need other frames to be rendered first; where are they stored (if they are stored at all)? If they are not, that seems very inefficient.
1) The decoding process takes care of that reordering. Think of the decoding code as a player: it needs to have fetched all the required frames to form the current one (if it's a B-frame that depends on a previous I/P and references a future P, it needs to fetch both and only then render the B).
2) All the frames are stored in the file. To see this compressed view of the file, you should read only the packets; you can simulate the same behavior using ffprobe:
# this should show you the compressed frames as they're stored
ffprobe -loglevel panic -select_streams v -show_entries "packet=pts,dts" small_bunny_1080p_60fps.mp4 | less
# or, if you want a parser-like view of the codec, you can use mediainfo
mediainfo --Details=1 small_bunny_1080p_60fps.mp4 | grep " - slice_type" | less
You must read the packets because once you decode them, the codec does the magic :)
You could change your hello world to show how the packets are stored in the file.
while (av_read_frame(pFormatContext, pPacket) >= 0)
{
  // if it's the video stream
  if (pPacket->stream_index == video_stream_index) {
    // https://ffmpeg.org/doxygen/trunk/structAVPacket.html
    logging("pts %" PRId64, pPacket->pts);
    logging("dts %" PRId64, pPacket->dts);
  }
  av_packet_unref(pPacket);
}
Wow! I finally understood. Just to clarify for others (and myself), this is my story: I modified the hello world example as you suggested, and the output is:
(...)
LOG: AVPacket->pts 0
LOG: ->dts -512
LOG: AVPacket->pts 1024
LOG: ->dts -256
LOG: AVPacket->pts 512
LOG: ->dts 0
LOG: Frame 1 (type=I, size=100339 bytes, format=0) pts 0 key_frame 1 [DTS 0]
LOG: AVPacket->pts 256
LOG: ->dts 256
LOG: Frame 2 (type=B, size=7484 bytes, format=0) pts 256 key_frame 0 [DTS 3]
LOG: AVPacket->pts 768
LOG: ->dts 512
LOG: Frame 3 (type=B, size=14764 bytes, format=0) pts 512 key_frame 0 [DTS 2]
LOG: AVPacket->pts 2048
LOG: ->dts 768
LOG: Frame 4 (type=B, size=7058 bytes, format=0) pts 768 key_frame 0 [DTS 4]
LOG: AVPacket->pts 1536
LOG: ->dts 1024
LOG: Frame 5 (type=P, size=37353 bytes, format=0) pts 1024 key_frame 0 [DTS 1]
LOG: AVPacket->pts 1280
LOG: ->dts 1280
LOG: Frame 6 (type=B, size=8678 bytes, format=0) pts 1280 key_frame 0 [DTS 7]
LOG: releasing all the resources
That we can arrange in the table (ts = timescale = 1/256):

| Packet dts*ts+2 | Packet pts*ts | Frame DTS | Frame pts*ts | Frame # type |
|---|---|---|---|---|
| 0 | 0 | - | - | |
| 1 | 4 | - | - | |
| 2 | 2 | 0 | 0 | 1I |
| 3 | 1 | 3 | 1 | 2B |
| 4 | 3 | 2 | 2 | 3B |
| 5 | 8 | 4 | 3 | 4B |
| 6 | 6 | 1 | 4 | 5P |
| 7 | 5 | 7 | 5 | 6B |
| . | | (6) | (6) | (7B) |
| . | | | | |
| . | | (5) | (8) | (9P) |
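For anyone who wants to reproduce the table columns, here is a minimal sketch of the arithmetic in C. It assumes the timestamps in the log always advance in steps of 256 ticks, and the function names `to_frame_units` and `decode_slot` are made up for this illustration (they are not part of libav).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption taken from the log above: one frame lasts 256 ticks (ts = 1/256). */
#define TICKS_PER_FRAME 256

/* Hypothetical helper: convert a raw pts/dts value to frame units (value * ts). */
int64_t to_frame_units(int64_t ticks)
{
    /* The log only contains exact multiples of 256, so reject anything else. */
    assert(ticks % TICKS_PER_FRAME == 0);
    return ticks / TICKS_PER_FRAME;
}

/* Hypothetical helper: the "dts*ts+2" column. The +2 shift only makes the
 * first dts in the log (-512) land on slot 0. */
int64_t decode_slot(int64_t dts)
{
    return to_frame_units(dts) + 2;
}
```

With the second packet from the log (pts 1024, dts -256), `decode_slot(-256)` gives 1 and `to_frame_units(1024)` gives 4, which is the second row of the table.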
When you send the first two packets to the codec, there is no output frame, and avcodec_receive_frame returns AVERROR(EAGAIN). From then on, for each packet we send to the codec we receive a frame in return, but it does not correspond to the packet just sent. The codec is buffering the information to generate the frames sorted by pts.
As a consequence, the codec needs the last two packets (in parentheses) to render the last frame, #6.
I have no more to add. Thanks, and thanks again for your time. Willy
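The buffering described above can be mimicked with a toy model in C. This is only a sketch under stated assumptions: it is not how libavcodec or any real decoder is implemented, it assumes a fixed reorder delay of two packets (chosen because the log shows two EAGAIN results before the first frame), and `ToyDecoder`, `toy_send_packet`, and `TOY_EAGAIN` are hypothetical names standing in for the real API.

```c
#include <assert.h>
#include <stdint.h>

#define REORDER_DELAY 2    /* assumption: packets held back before any output */
#define TOY_EAGAIN    (-1) /* hypothetical stand-in for AVERROR(EAGAIN) */

/* Toy decoder state: the pts values (in frame units) currently buffered. */
typedef struct {
    int64_t pts[REORDER_DELAY + 1];
    int count;
} ToyDecoder;

/* Feed one packet, identified only by its pts, in decode (dts) order.
 * Returns 0 and writes the lowest buffered pts to *out_pts once more than
 * REORDER_DELAY packets are buffered; otherwise returns TOY_EAGAIN. */
int toy_send_packet(ToyDecoder *dec, int64_t pts, int64_t *out_pts)
{
    assert(dec->count <= REORDER_DELAY);
    dec->pts[dec->count++] = pts;
    if (dec->count <= REORDER_DELAY)
        return TOY_EAGAIN; /* still filling the buffer, nothing to show yet */

    /* Pick the buffered entry with the smallest pts: next in display order. */
    int min = 0;
    for (int i = 1; i < dec->count; i++)
        if (dec->pts[i] < dec->pts[min])
            min = i;

    *out_pts = dec->pts[min];
    dec->pts[min] = dec->pts[--dec->count]; /* remove it from the buffer */
    return 0;
}
```

Feeding the packet pts values from the table in decode order (0, 4, 2, 1, 3, 8, 6, 5) gives TOY_EAGAIN twice and then frames with pts 0, 1, 2, 3, 4, 5, the same pattern as the log. The packets with pts 8 and 6 stay in the buffer, waiting for later packets.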
Thanks for sharing, @wpregliasco. You can also notice that AVFrame has an attribute that shows bitstream order. :)
Hi, thanks for your clear and deep explanations. I finally understood a lot of obscure items in FFmpeg!
Still I have a silly question from the hello world example:
The question is (actually I have two questions):
Thanks again for sharing your knowledge. Regards, Willy