Elyotna / linux

Hacking in a V4L2 M2M decoder for AMLogic SoCs

S905 4k issue #2

Open gcsuri opened 5 years ago

gcsuri commented 5 years ago

Hi Maxime,

I ran some decode tests on my boards. The Beelink MiniMX is a p200-based TV box with 1GB RAM. I'm using the 150balbes kernel 5.0.2+. I tried some jellyfish test videos and 4K UHD decode failed:

[ 175.250150] cma: cma_alloc: alloc failed, req-size: 1016 pages, ret: -12
[ 175.282979] cma: cma_alloc: alloc failed, req-size: 2032 pages, ret: -12
[ 175.284084] WARNING: CPU: 1 PID: 954 at mm/page_alloc.c:4529 alloc_pages_nodemask+0x8a0/0xc84
[ 175.292663] Modules linked in: xt_tcpudp ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat iptable_nat nf_nat_ipv4 nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv4 ip_set nfnetlink ebtable_filter ebtables iptable_filter brcmfmac brcmutil cfg80211 rfkill snd_soc_meson_aiu_spdif snd_soc_meson_aiu_i2s crct10dif_ce meson_vdec videobuf2_dma_contig v4l2_mem2mem videobuf2_memops videobuf2_v4l2 videobuf2_common meson_rng rng_core dw_hdmi_cec ao_cec videodev pwm_meson meson_ir snd_soc_meson_audio_core media scpi_hwmon adc_keys input_polldev lz4 lz4_compress zram sch_fq_codel ip_tables x_tables
[ 175.346740] CPU: 1 PID: 954 Comm: ffmpeg Not tainted 5.0.2+ #5
[ 175.352514] Hardware name: Beelink MiniMX gxbb TV Box (DT)
[ 175.357950] pstate: 20000005 (nzCv daif -PAN -UAO)
[ 175.362696] pc : alloc_pages_nodemask+0x8a0/0xc84
[ 175.367524] lr : dma_direct_alloc_pages+0xc4/0x204
[ 175.372437] sp : ffff0000104fb6b0
[ 175.375715] x29: ffff0000104fb6b0 x28: ffff800002de3800
[ 175.380976] x27: ffff000011ecd000 x26: 00000000000007f0
[ 175.386237] x25: ffff000011dd9000 x24: 0000000000000005
[ 175.391498] x23: 00000000007f0000 x22: 00000000ffffffff
[ 175.396759] x21: 0000000000000000 x20: ffff8000261b2c10
[ 175.402021] x19: 000000000000000b x18: ffff000011ecd000
[ 175.407282] x17: 0000000000000001 x16: 0000000000000000
[ 175.412543] x15: 0000000000000068 x14: 00000000007f0000
[ 175.417805] x13: ffff000011cfb408 x12: 2c4756252cc39200
[ 175.423066] x11: ffff0000104fb4a0 x10: 0000000000000174
[ 175.428327] x9 : ffff000011ee66a8 x8 : 32312d203a746572
[ 175.433588] x7 : 202c736567617020 x6 : ffff800026bbf7d0
[ 175.438849] x5 : 0000000000000000 x4 : 0000000000000000
[ 175.444111] x3 : 0000000000000000 x2 : ffff0000104fb7d0
[ 175.449372] x1 : ffff0000104fb7c0 x0 : 00000000006000c4
[ 175.454634] Call trace:
[ 175.457051] alloc_pages_nodemask+0x8a0/0xc84
[ 175.461535] dma_direct_alloc_pages+0xc4/0x204
[ 175.466107] arch_dma_alloc+0x5c/0x1a4
[ 175.469815] dma_direct_alloc+0x20/0x28
[ 175.473610] dma_alloc_attrs+0x78/0xf8
[ 175.477324] vb2_dc_alloc+0x6c/0x120 [videobuf2_dma_contig]
[ 175.482846] vb2_queue_alloc+0x1bc/0x5d0 [videobuf2_common]
[ 175.488535] vb2_core_reqbufs+0x23c/0x43c [videobuf2_common]
[ 175.494145] vb2_reqbufs+0x4c/0x5c [videobuf2_v4l2]
[ 175.498975] v4l2_m2m_reqbufs+0x30/0x5c [v4l2_mem2mem]
[ 175.504060] v4l2_m2m_ioctl_reqbufs+0x14/0x1c [v4l2_mem2mem]
[ 175.509691] v4l_reqbufs+0x4c/0x58 [videodev]
[ 175.513992] video_do_ioctl+0x1f4/0x3f0 [videodev]
[ 175.518908] video_usercopy+0x23c/0x4e0 [videodev]
[ 175.523652] video_ioctl2+0x14/0x1c [videodev]
[ 175.528051] v4l2_ioctl+0x3c/0x5c [videodev]
[ 175.532263] do_vfs_ioctl+0xb8/0x8d4
[ 175.535796] ksys_ioctl+0x78/0xa8
[ 175.539074] arm64_sys_ioctl+0x1c/0x28
[ 175.542958] el0_svc_common+0xb0/0x100
[ 175.546664] el0_svc_handler+0x70/0x88
[ 175.550373] el0_svc+0x8/0xc
[ 175.553218] ---[ end trace eb7fb4ee44870c16 ]---
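
For readers decoding the two `cma_alloc` lines in the log, here is a minimal Python sketch that converts the `req-size` page counts to bytes and looks up the error code. The 4 KiB page size is an assumption (arm64 kernels can be built with other page sizes), and `describe_cma_failure` is a hypothetical helper name, not part of any kernel tooling.

```python
import errno

# Assumption: the kernel uses 4 KiB pages.
PAGE_SIZE = 4096


def describe_cma_failure(pages, ret):
    """Return (size in bytes, errno name) for one cma_alloc failure line."""
    size_bytes = pages * PAGE_SIZE
    # The kernel reports negative errno values, so flip the sign for the lookup.
    name = errno.errorcode.get(-ret, "unknown")
    return size_bytes, name


# Page counts and return value copied from the log in this issue.
for pages in (1016, 2032):
    size_bytes, name = describe_cma_failure(pages, -12)
    print(f"{pages} pages = {size_bytes} bytes "
          f"({size_bytes / 2**20:.2f} MiB, {hex(size_bytes)}), error {name}")
```

Under that assumption the larger request is 0x7f0000 bytes, just under 8 MiB, which is the same value that appears in x23 and x14 of the register dump, and `-12` is `ENOMEM`.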

This video decodes successfully on my S905X box with the same kernel and the same environment.

best regards, Gabor

Elyotna commented 5 years ago

Hi gcsuri, it looks like you're running out of the CMA memory that is used to allocate video buffers. How much do you have? (Try running dmesg|grep -i cma.)

Are you trying to decode 10-bit video? There is currently a limitation where the decoder requires 2x the amount of RAM for 10-bit video: we decode the 10-bit video into one set of buffers, and then need a second set of buffers to store the 8-bit downsampled result (since we do not have a full v4l2->userspace->drm pipeline for 10-bit buffers). As such, decoding 10-bit 4K video with 1GB of RAM is difficult.

gcsuri commented 5 years ago

Hi Maxime,

thank you for your answer!

dmesg|grep -i cma :

[ 0.000000] OF: reserved mem: failed to allocate memory for node 'linux,cma'
[ 0.000000] cma: Reserved 256 MiB at 0x0000000028000000
[ 0.000000] Memory: 595600K/896000K available (10238K kernel code, 884K rwdata, 3732K rodata, 576K init, 632K bss, 38256K reserved, 262144K cma-reserved)
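
To pull the reserved CMA size out of output like the above automatically, a small Python sketch could look like this. The regular expression is written against the `cma: Reserved ... MiB` line shown in this comment only, so treat its exact wording as an assumption about this kernel version; `cma_reserved_mib` is a hypothetical helper name.

```python
import re

# Matches lines such as: "cma: Reserved 256 MiB at 0x0000000028000000"
CMA_RESERVED = re.compile(r"cma: Reserved (\d+) MiB at (0x[0-9a-fA-F]+)")


def cma_reserved_mib(dmesg_text):
    """Return the reserved CMA size in MiB, or None if no such line is found."""
    match = CMA_RESERVED.search(dmesg_text)
    return int(match.group(1)) if match else None


# Line copied from the dmesg output in this issue.
sample = "[ 0.000000] cma: Reserved 256 MiB at 0x0000000028000000"
print(cma_reserved_mib(sample))
```

The same function can be fed the full text of `dmesg`, for example captured with `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`.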

Yes, I tried 10-bit UHD video. I also tried H264 UHD, but I got:

[ 238.595350] meson-vdec c8820000.video-decoder: Unsupported video width: 3840
[ 238.596788] meson-vdec c8820000.video-decoder: Aborting decoding session!

best regards, Gabor

Elyotna commented 5 years ago

Hey Gabor,

Yeah, 256MiB is going to be a bit short. I'd say safe values are 384MiB for 8-bit UHD, 896MiB for 10-bit UHD. There will be work to lower these values in the future.
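
As a rough illustration of what those values mean for the board in this issue, the following Python sketch compares a reserved CMA size against the two suggested figures. `cma_check` and its result keys are hypothetical names, and the comparison against total memory is only a crude sanity check under the assumption that the CMA area has to fit inside the memory the kernel reports.

```python
# Suggested CMA sizes in MiB, taken from the comment above.
SUGGESTED_CMA_MIB = {8: 384, 10: 896}


def cma_check(reserved_mib, total_kib, bit_depth):
    """Compare a board's CMA reservation with the suggested value for a bit depth."""
    needed = SUGGESTED_CMA_MIB[bit_depth]
    total_mib = total_kib / 1024
    return {
        "needed_mib": needed,
        "enough_cma": reserved_mib >= needed,
        # Crude check: a reservation as large as all reported memory cannot work.
        "fits_in_ram": needed < total_mib,
    }


# 256 MiB reserved and 896000K total, both from the dmesg output in this issue.
print(cma_check(256, 896000, 8))
print(cma_check(256, 896000, 10))
```

With the numbers from this thread, 896000K is 875 MiB, so the 10-bit figure is larger than the memory the kernel sees on a 1GB board, while the 8-bit figure would need a larger reservation than the current 256 MiB.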

Another thing to note: my driver does not support H264 UHD for GXBB (S905). Amlogic had to go through various lengthy hacks to get it supported and I decided not to focus on it for now. If you want H264 UHD you'll have to settle for at least GXL (S905X).