FernetMenta / xbmc

Fork of XBMC Main PVR Development Repository.
http://xbmc.org

Refresh-rate switching happens late where variable frame rate #52

Open StrangeNoises opened 12 years ago

StrangeNoises commented 12 years ago

Mostly pasted from the offtopic post to the forum yesterday. :-) NB: This has now been reproduced with the version just updated from https://launchpad.net/~wsnipex/+archive/xbmc-xvba-testing, which internally reports itself as 12.0-ALPHA3 (GIT:Unknown) but which apt reports as:

Version: 2:12.0~git20120708.2243-c6fc742-0precise

Issue is with auto-switching of refresh rates: Some videos for which the refresh rate should be changed, do not trigger that refresh rate change until after a few seconds of playback. The pattern I've been able to find so far is that this only seems to affect videos marked as having a variable frame rate.

(Handbrake uses a variable frame rate as part of its High Profile preset; and has done for at least a couple of years. Even if the frame rate isn't actually going to vary.)

Thing is, at least according to mediainfo, a headline frame rate is provided anyway, so surely even if the video is marked as variable, playback should start with the given frame rate as a starting presumption? I think older versions - in particular up to 11.0 eden - did do this, as I'm only starting to see this problem now. I had been seeing it on only one fairly old encode before, where that headline frame rate was actually listed (oddly) as 24.998fps, whereas most PAL-type videos show exactly 25.000fps. So even on eden that video started, then changed refresh rate (actually just back to the same one, as it's the closest). But with the new build, it's doing it for all such variable-frame-rate recordings.

After that initial refresh rate change, playback continues normally for the duration. It's just that it seems to be waiting to see if it really needs to change refresh rate for such videos. Obviously this is pretty disruptive to the viewing experience.
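As a rough illustration of the "start from the headline frame rate" idea above, the sketch below picks an initial display refresh rate from the frame rate the container reports, before any frames have been measured. It is not XBMC's actual switching logic; the function name `pick_refresh_rate`, the tolerance value and the list of display modes are all assumptions made for the example.

```python
def pick_refresh_rate(video_fps, display_rates, tolerance=0.005):
    """Pick a display refresh rate for a video frame rate.

    Prefers the display rate closest to an integer multiple of the
    frame rate (within a relative tolerance, an assumed value), and
    falls back to the numerically nearest rate otherwise.
    """
    best = None  # (relative error, rate) of the best match so far
    for rate in display_rates:
        multiple = round(rate / video_fps)
        if multiple < 1:
            continue  # display rate is well below the frame rate
        error = abs(rate - multiple * video_fps) / rate
        if error <= tolerance and (best is None or error < best[0]):
            best = (error, rate)
    if best is not None:
        return best[1]
    return min(display_rates, key=lambda r: abs(r - video_fps))


# Hypothetical list of modes a TV might advertise.
RATES = [23.976, 24.0, 50.0, 59.94, 60.0]

print(pick_refresh_rate(25.000, RATES))  # exact PAL rate
print(pick_refresh_rate(24.998, RATES))  # the odd headline rate from the old encode
```

With a tolerance like this, a headline rate of 24.998fps lands on the same mode as 25.000fps, so a player that trusted the headline value at start-up would not need to switch again a few seconds in.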

edit: context:

System is Ubuntu 12.04 64-bit installed on 2010 mac mini server connected by HDMI to a Panasonic Viera TV.

The only deviations from the Ubuntu standard packages are the ppas: ubuntu-x-swat/x-updates (for nvidia 302.17) and wsnipex-xbmc-xvba-testing (for this build of xbmc)

Everything's up to date.

XBMC log of session is here: http://xbmclogs.com/show.php?id=4704

The relevant part (where the refresh rate changes) is at lines 320-331.

mediainfo on the track being played in the log: http://xbmclogs.com/show.php?id=4705

StrangeNoises commented 11 years ago

that atom is actually dual-core but iirc all atoms also have hyperthreading, whereas the core 2 duo doesn't; so despite your lower clock speed you probably will get better threading than me. Linux will see four cores and knows what to do with them. :-)

Can't believe this mac is so obsolete already! :-( (well, it's fine if i decide to be happy with vdpau-bob...)

StrangeNoises commented 11 years ago

Temporal/Spatial (half) is within its limits and looks pretty good for interlaced but not telecined sources (which show jerky motion). not sure it's an improvement on bob even on the others. It probably is for content that doesn't really need deinterlacing anyway (ie: bbc's propensity for broadcasting and shipping on bluray everything interlaced even though it's probably filmed on progressive.)

FernetMenta commented 11 years ago

It's not the hyperthreading. The following qvdpautest is from a Zotac ID11, ION2 with Atom D525:

qvdpautest 0.5.1
Intel(R) Atom(TM) CPU D510 @ 1.66GHz
NVIDIA GPU ION (GT218) at PCI:3:0:0 (GPU-0)

VDPAU API version : 1
VDPAU implementation : NVIDIA VDPAU Driver Shared Library 295.40 Thu Apr 5 22:02:06 PDT 2012

SURFACE GET BITS: 193.721 M/s
SURFACE PUT BITS: 161.867 M/s

MPEG DECODING (1920x1080): 68 frames/s
MPEG DECODING (1280x720): 161 frames/s
H264 DECODING (1920x1080): 66 frames/s
H264 DECODING (1280x720): 122 frames/s
VC1 DECODING (1440x1080): 83 frames/s
MPEG4 DECODING (1920x1080): 67 frames/s

MIXER WEAVE (1920x1080): 483 frames/s
MIXER BOB (1920x1080): 675 fields/s
MIXER TEMPORAL (1920x1080): 190 fields/s
MIXER TEMPORAL + IVTC (1920x1080): 121 fields/s
MIXER TEMPORAL + SKIP_CHROMA (1920x1080): 254 fields/s
MIXER TEMPORAL_SPATIAL (1920x1080): 65 fields/s
MIXER TEMPORAL_SPATIAL + IVTC (1920x1080): 53 fields/s
MIXER TEMPORAL_SPATIAL + SKIP_CHROMA (1920x1080): 72 fields/s
MIXER TEMPORAL_SPATIAL (720x576 video to 1920x1080 display): 230 fields/s
MIXER TEMPORAL_SPATIAL + HQSCALING (720x576 video to 1920x1080 display): 121 fields/s

MULTITHREADED MPEG DECODING (1920x1080): 70 frames/s
MULTITHREADED MIXER TEMPORAL (1920x1080): 165 fields/s

It plays 1080i@50 temporal/spatial without problems. Again the multi-threaded behavior is much better than on your GT320M. Either the nvidia driver is not optimized for the GT320M, or it's the chip itself, or it's the way Apple has wired it. Don't know. Even on my outdated mobo with GT9300 onboard graphics I can play 1080i temporal without problems.

I have a couple of test systems here sponsored by Zotac. My favorite is the ID80. I do most of my development on this machine and I recommend it to my buddies if they want to build an HTPC. I am going to replace my living room system with a custom build. Will use the Zotac D-2700-itx board, which is basically the same as the ID80 but has a pci-e slot I can use for a tv card.

StrangeNoises commented 11 years ago

i might go the mini-itx version too with that equivalent board; just looking it up now. not afraid of self-builds, so if i can make it cheaper without making it horrid that way i might. (zotac's own case isn't exactly a looker after all; not for someone used to macs...)

Will be playing with shopping baskets now for the rest of the day i expect ;-) but there's no great rush as vdpau-bob will keep me satisfied for the time being, until the quest for perfection outweighs budgetary considerations anyway. :-)

we're way offtopic for this issue now of course. :-)

FernetMenta commented 11 years ago

> we're way offtopic for this issue now of course. :-)

Well, this is my repo and nobody should blame us :)

BTW: how much memory have you configured for the GPU?

I am going to use a Streacom FC8 as a case. According to the spec the Zotac board does not fit but I hope I can make it fit somehow :)

StrangeNoises commented 11 years ago

i haven't explicitly set any memory config on the GPU. Not sure how (about to look at nvidia-settings). If it's a bios thing it probably won't apply on a mac.

nvidia-settings says it has 256MB.

FernetMenta commented 11 years ago

This might be the root cause. It should have 512MB. Don't you have a BIOS setting to configure GPU ram?

StrangeNoises commented 11 years ago

it's a mac. it has an apple-flavoured efi and a basic bios emulation mode. :-) The more memory it has the more it gives to the GPU, automatically, but this seems to top out at 256MB. (The machine has 8GB RAM total.) Am still googling for ways to take more control of that but not sure such ways exist. In between reading zotac id80 reviews and costing things. so far i've got a mobo+case combo that's only 80 pence cheaper than the cheapest id80 i've found... :-}

FernetMenta commented 11 years ago

http://support.apple.com/kb/SP585 NVIDIA GeForce 320M graphics processor with 256MB of DDR3 SDRAM shared with main memory

Well done Apple :( Designed for the wrong planet.

FernetMenta commented 11 years ago

weird:

Memory available to Mac OS X may vary depending on graphics needs. Minimum graphics memory usage is 256MB

would it increase VRAM if you increased RAM?

StrangeNoises commented 11 years ago

it does, but it has already, given this machine has 8GB; apparently 256MB is the maximum it goes to. I can't find anything that lets me override that.

FernetMenta commented 11 years ago

I would have done a qvdpautest on a machine with 256MB vram but I can't reduce it on my systems. I think if you can manage to get more vram available when running Linux it will boost performance.

StrangeNoises commented 11 years ago

i think i shall just admit that vdpau-bob is the best i'll get on this system - and as i haven't actually seen better, I won't feel that I'm missing out. And when my bank balance is a bit better (telling myself not to spend money I can avoid until I can end a month without an overdraft), I expect I'll get a Zotac ID80.

StrangeNoises commented 11 years ago

hmm. i have a GT218 in my "big box" core-i7x4 pc that I suddenly no longer use for video encoding. It'll be a bit of a chore to lug it downstairs, and it's a bit noisy for a living room (though not too bad considering it's got 6 mechanical hard drives in it; it's in a good quiet case). I had just presumed the GT218 wouldn't be sufficiently capable but you're suggesting it is (which makes the 320M not being more of a surprise), so I can try this out for zero cost. Then of course if Temporal/Spatial deinterlacing really is that much better i won't be able to go back to just bob, and I'll have to buy a more sensible living-room box. :-}

That machine is already running the quantal alpha but it doesn't really need to, so if the precise builds for xbmc-xvba-testing don't work out I'll just have to reinstall... Well, that's my evening sorted. Phew. :-)

i'll run the qvdpautest on it before trying to move it downstairs...

FernetMenta commented 11 years ago

Well, your mac mini has a core duo. Have you tried software decoding with yadif de-interlacer?

I have googled a bit but I have to admit that (U)EFI is greek to me. Nevertheless I do think it should be possible to hack it to get more vram.

StrangeNoises commented 11 years ago

hm. the version of qvdpautest you linked doesn't build on quantal. the up to date version builds but apparently hangs during its first test.

i'm fairly sure when i tried before the mini wasn't up to doing the decodes in software only; but it was long enough ago that i'm not sure now what i tried and software updates might have improved things since anyway.

trying to get xbmc working; it's being recalcitrant...

StrangeNoises commented 11 years ago

testing your build of xbmc (same as i've been using) on the "big pc" (core-i7x4 3.07GHz; ubuntu quantal alpha 64-bit (includes nvidia 302.17), GT218) - albeit just on a normal 1080p monitor fixed at 60Hz (dell u2311):

I seem to be getting exactly the same behaviour even though it's quite a different card. That is, Temporal marginally works on some sources (some 1080i@50), and suffers significant frame-drops on others (1080i@50 and 1080i@60), with the threshold between the two seemingly about the same; Temporal/Spatial just loses lots of frames on everything, same as the GT320M.

Of course this machine has shedloads of cpu power, so turning VDPAU off completely works pretty well too; with the default deinterlacing method seemingly working well, though not sure what that is. I suspect it's bob, as it seems to just work on everything. :-)

"Deinterlace" (whatever that is; user interface doesn't specify) works at 1080i@50 but loses frames at 1080i@60. When I look at the deinterlacing options, "yadif" isn't there. "bob" is of course, and works as well as it does anywhere. "weave" is present, but I still end up with plenty of visible combing artifacts so it's presumably not doing what I hoped.

(I also note during playback with these deinterlacing options enabled, only one CPU core seems to be at work. Oh actually it's the same during vdpau playback too so I guess that's not playback-related.)

I'd probably have to downgrade this machine back to precise to eliminate any quantal oddities and run qvdpautest; but I'm not going to take that plunge today.

FernetMenta commented 11 years ago

I think "deinterlace" is yadif but I'm not completely sure. I haven't done much with software decoding. How is the quality with this option?

How much video ram does that GT218 have? 512 or less?

Do you boot Linux on the mac in efi or bios mode? Does NVidia driver work in efi mode? Sooner or later that efi stuff will hit us all.

StrangeNoises commented 11 years ago

the binary of qvdpautest built on the mac runs on the quantal pc:

qvdpautest 0.5.1
Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz
NVIDIA GPU GeForce 210 (GT218) at PCI:3:0:0 (GPU-0)

VDPAU API version : 1
VDPAU implementation : NVIDIA VDPAU Driver Shared Library 302.17 Tue Jun 12 16:27:11 PDT 2012

SURFACE GET BITS: 1144.14 M/s
SURFACE PUT BITS: 1186.05 M/s

MPEG DECODING (1920x1080): 72 frames/s
MPEG DECODING (1280x720): 162 frames/s
H264 DECODING (1920x1080): 64 frames/s
H264 DECODING (1280x720): 130 frames/s
VC1 DECODING (1440x1080): 83 frames/s
MPEG4 DECODING (1920x1080): 72 frames/s

MIXER WEAVE (1920x1080): 288 frames/s
MIXER BOB (1920x1080): 477 fields/s
MIXER TEMPORAL (1920x1080): 135 fields/s
MIXER TEMPORAL + IVTC (1920x1080): 88 fields/s
MIXER TEMPORAL + SKIP_CHROMA (1920x1080): 179 fields/s
MIXER TEMPORAL_SPATIAL (1920x1080): 60 fields/s
MIXER TEMPORAL_SPATIAL + IVTC (1920x1080): 46 fields/s
MIXER TEMPORAL_SPATIAL + SKIP_CHROMA (1920x1080): 67 fields/s
MIXER TEMPORAL_SPATIAL (720x576 video to 1920x1080 display): 211 fields/s
MIXER TEMPORAL_SPATIAL + HQSCALING (720x576 video to 1920x1080 display): 127 fields/s

MULTITHREADED MPEG DECODING (1920x1080): 62 frames/s
MULTITHREADED MIXER TEMPORAL (1920x1080): 92 fields/s

The decoding figures are slightly better but the mixer and multithreaded tests are still significantly worse, even on this beast of a machine. That said it looks like it ought to be able to do temporal, though temporal/spatial at 60 fields/s is touch-and-go.
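To make the "touch-and-go" judgement concrete, here is a small sketch that divides the mixer throughput from the two qvdpautest runs in this thread by the field rate each source demands. The figures are copied from the outputs above; treating 1080i@50 and 1080i@60 as exactly 50 and 60 fields/s, and reading the single-threaded benchmark numbers as a proxy for real playback, are simplifying assumptions.

```python
# Mixer throughput in fields/s, copied from the two qvdpautest runs in this thread.
MIXER_FIELDS_PER_SEC = {
    "ION2 (ID11)": {"temporal": 190, "temporal_spatial": 65},
    "GeForce 210 (GT218)": {"temporal": 135, "temporal_spatial": 60},
}

# Field rates the sources demand (assumed to be exactly 50 and 60).
SOURCES = {"1080i@50": 50, "1080i@60": 60}


def headroom(throughput, fields_per_sec):
    """Ratio of measured mixer throughput to the rate the source needs.

    A value at or below 1.0 means the mixer cannot keep up even in the
    benchmark, before decoding and rendering compete for the GPU.
    """
    return throughput / fields_per_sec


for gpu, modes in MIXER_FIELDS_PER_SEC.items():
    for mode, throughput in modes.items():
        for source, needed in SOURCES.items():
            ratio = headroom(throughput, needed)
            print(f"{gpu:22} {mode:17} {source}: {ratio:.2f}x")
```

On these numbers temporal has more than twice the required throughput on both cards, while temporal/spatial sits between 1.0x and 1.3x, which leaves very little margin once decoding runs at the same time.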

FernetMenta commented 11 years ago

Hmm, looking at these numbers I would expect temporal to work with all media. Does this card have 512MB?

I will install the ppa version on my ION2 tomorrow and run some tests.

StrangeNoises commented 11 years ago

Making my eyes bleed (metaphorically) staring closely at the monitor the way i won't at a tv; i think "deinterlace" might be marginally better. Not so much as to seem worth the effort. :-)

Using that method on the 1080i@60 sources gets me lots of frame drops. What I see during this time is that one CPU core is working harder while the rest are idle, so the process presumably isn't well multithreaded. But even then it's not exactly thrashing its heart out. It varies a lot but I don't see it exceeding 90%, and the frame drops don't seem correlated at all to when it peaks.

I just had a thought: The 1080i@60 sources I speak of are telecined at source: There was a period when the BBC, who make all their material at 1080i@50, were mastering blurays with the videos telecined up to 1080i@60, presumably for an international market. They don't seem to do it any more, thankfully, at least for the ones they sell here, but they made some otherwise very nice bluray products during that period. Doctor Who season 4 specials, David Tennant's Hamlet, Ganges, Cranford, and some others, that I'm motivated to have play well.

I've never been able to adequately detelecine them so eventually I just decided to preserve the interlacing and let them play as intended, at which - with bob at least - they play well. But I wonder if that might be giving grief to some of the more advanced deinterlacing methods.

Video RAM in the GT218: 1024MB.

Linux on Mac is booted in BIOS mode. The steps to make it boot in EFI mode are tricky and hard to back out of if it goes wrong, and it was such a nightmare getting Linux onto that machine in the first place (mostly in being able to boot an installer) that I don't want to risk having to do it again!

StrangeNoises commented 11 years ago

back downstairs and with vdpau off the mac mini definitely doesn't have the oomph. serious frame loss on bob-deinterlace of 1080i@50 correlated with a CPU core hitting 100% a lot. The other core's usually pretty quiet so that might be fixed with more multithreading in the code maybe... ;-) But even so we're still just talking about bob, which is a solved problem with vdpau. "Deinterlace" is worse in the same way (more consistently losing frames, more consistently with one core stuck at 100%).

FernetMenta commented 11 years ago

ffmpeg supports multi-threaded decoding but we have not activated it yet. It breaks the hw decoders.

I have no 1080i@60 video which is telecined using 3:2 pulldown but AFAIK this should play as 23.97. vdpau can do IVTC when enabled in advanced settings but this will eat additional resources. It just tries to detect if 2 fields are taken from the same frame. If so, it disables de-interlacing. This setting only works with temporal or temporal/spatial.
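The idea behind 3:2 pulldown and its inverse can be sketched in a few lines. The example below is only a toy model: it labels each field with the film frame it came from, whereas a real IVTC implementation (including the one in the VDPAU driver, whose internals are not described here) has to infer that from the picture content. The function names are made up for illustration.

```python
from itertools import cycle


def telecine_32(frames):
    """Apply 3:2 pulldown: emit source frames as 3, 2, 3, 2, ... fields,
    then pair consecutive fields into interlaced video frames."""
    fields = []
    for frame, repeat in zip(frames, cycle((3, 2))):
        fields.extend([frame] * repeat)
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


def can_weave(video_frame):
    """True when both fields come from the same source frame, so the
    frame is effectively progressive and needs no de-interlacing."""
    top, bottom = video_frame
    return top == bottom


video = telecine_32("ABCD")  # four film frames become five video frames
print(video)
print([can_weave(f) for f in video])
```

Four film frames become five video frames, three of which are clean and two of which mix fields from neighbouring film frames. Those mixed frames are where the de-interlacer still has to do real work.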

Will try to record a sample e.g. a football game for you to test this week.

My ID11 (ION2) is very close to its limits when set to temporal/spatial. Sometimes when bringing up the codec or info screen it starts dropping frames. I am still confused by your qvdpautest of the GT218.

StrangeNoises commented 11 years ago

ah you misunderstand; the 1080i@60 video isn't telecined up from 23.97 using 3:2 pulldown. this would be easy to deal with; i'd detelecine at encode stage; as most of these are also VC1 and thus can't be played in the raw anyway. It's telecined up from 25fps content that may or may not originally have been interlaced in its own right. It's hideous. So far the only way I've been able to play them back without a sickening jerky motion is to just preserve the interlacing as is and play with bob-deinterlace. Any other method has to work at least as well as that. :-) So far Temporal sorta-justabout works for me on at least some 1080i@50 content but fails badly at these telecined 1080i@60.
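For a feel of why 25fps material stretched to 60 fields/s is awkward, the sketch below computes how many output fields each source frame would occupy if every output field simply showed the source frame current at its display time. This is an assumption for illustration only: nothing in this thread says which cadence those discs actually use, and it does not cover the case where the original was itself interlaced. `cadence` is a hypothetical helper.

```python
from collections import Counter


def cadence(source_fps, field_rate, n_frames):
    """Number of output fields occupied by each of the first n_frames
    source frames, using integer arithmetic throughout."""
    n_fields = n_frames * field_rate // source_fps
    counts = Counter(k * source_fps // field_rate for k in range(n_fields))
    return [counts[i] for i in range(n_frames)]


print(cadence(24, 60, 4))  # film to 60 fields/s: the familiar 3:2 pattern
print(cadence(25, 60, 5))  # 25fps to 60 fields/s: a longer, uneven pattern
```

Under this model five source frames are spread over twelve fields, so the repeat pattern only comes round every five frames rather than every two, and a detector tuned for plain 3:2 would not match it.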

I can upload an example or two...

FernetMenta commented 11 years ago

> I can upload an example or two...

Please do

> VC1 and thus can't be played in the raw anyway

We do support VC1, don't we?

StrangeNoises commented 11 years ago

not if it's interlaced. :-) All this is all about interlaced stuff. I have a load of VC1-progressive movies that play perfectly, yes.

It's coming. I see it now tries and fails badly rather than refuses to try, but as things stand I have to encode VC1-interlaced content using RipBot264 on Windows (for access to the windows media codecs - it's essentially a front-end to avisynth and friends and x264). But I do so preserving the interlacing of the original. When ffmpeg can deal with VC1-interlaced by itself, I'll be ditching the encodes in favour of the raws, but I imagine this issue will still exist because the interlacing/telecining is the same.

Examples are uploading. My uplink is slow so it's taking a while (currently saying ETA 2 hours 13). I'll put a link here when it's up. It's four 2-minute clips from the start of the videos I've been testing with all this while.

fritsch commented 11 years ago

@VC-1: We should support interlaced here. At least, we do the mapping: pic_descriptor->sps_info.vc1.interlace = v->interlace;

StrangeNoises commented 11 years ago

unless the required changes to make it work have been added to the build in the last few days i assure you it doesn't work. :-) VC1-interlaced support in ffmpeg has been a bugbear for years. It looks like there has been progress recently, and as I said, the above mapping is probably why it even tries, but it doesn't work. Sound works, but there's massive corruption on screen. The same with other ffmpeg-dependent applications, eg: VLC and Handbrake which will now often crash when trying to play VC1-interlaced instead of just refusing to touch it; but sometimes work just well enough for you to be able to see what's playing, while not actually playing it in a watchable state.

I can add a raw VC1-interlaced example too if you like; one of the ones from which one of the other examples i'm already uploading is derived.

StrangeNoises commented 11 years ago

confirmed; current build in this branch (Jul 13 2012) doesn't play them properly, but does try. Looks better with VDPAU off (on a sufficiently powerful machine) but only manages about 8fps on a core-i7x4@3.07GHz and I think it's confused about fields. With VDPAU on, the screen is mostly black with glitches and blocks appearing top-left.

fritsch commented 11 years ago

One example to reproduce it would be enough. I think that it does not get the mapping right. Sometimes it is not enough to fetch the data out of the VC1-Context, rather one has to look through the mpeg context.

Ouhh - sorry. I am off topic, I thought about XVBA, sorry.

StrangeNoises commented 11 years ago

it's all right, this issue went way off its own topic some time ago. FernetMenta seems happy with it. :-)

But yeah, this is an ffmpeg decoding issue i wasn't expecting to get addressed here. :-)

fritsch commented 11 years ago

Hehe, it is nice to follow this thread, as it reveals a lot of knowledge concerning the different nvidia chips. I only read with one eye about the VC-1 Codec and did not remember the hwaccel anymore - so you got an answer for XVBA.

Thx for the testfiles anyways - I will try them on AMD xvba.

FernetMenta commented 11 years ago

I checked ffmpeg repo. There's been some patches regarding vc1 de-interlacing recently. I can try to backport them as soon as I have a sample.

StrangeNoises commented 11 years ago

Uploading is progressing at a stately but fairly steady 113K/s :-|

StrangeNoises commented 11 years ago

uploaded: http://strangenoises.org/~rachel/xbmctesting/deinterlace-testing/ - hopefully filenames are self-explanatory without being too long.

fritsch commented 11 years ago

@StrangeNoises: Thx for the samples. On XVBA all work fine - with the XVBA bob deinterlacer. But VC-1 seems not to be recognized as Interlaced. I think every frame is taken as a progressive one, but only displayed every second. It looks a bit strange.

I will step into it and have a look.

StrangeNoises commented 11 years ago

they're all fine with the VDPAU-Bob deinterlacer too (except vc1-interlaced). We've been looking at getting Temporal and Temporal/Spatial deinterlacing going too, which I'm given to understand is superior. :-)

I expect the vc1-interlaced test is waiting for ffmpeg's support for it as an input format to be completed. Of course, if they think it is completed, give them that to chew on. :-) (It definitely is VC1, and it definitely is interlaced, and it definitely is a raw rip from a published blu-ray disk - and all the others I've tried show the same problems.) Meanwhile the DirectShowSource AVISynth plugin on Windows can read it, which is how I got the encoded version.

fritsch commented 11 years ago

@StrangeNoises: Yeah it is superior. It takes spatial and temporal frame information into account to restore one picture. Bob just sucks in comparison :-). I am recompiling my tree now to see what flags are set in the VC1 structs - if this is the problem. Interlaced VC-1: one does not get this every day.

fritsch commented 11 years ago

Mmh. Seems a bit difficult for me. I tried all patches concerning VC1 till April 2012. They apply cleanly if you cherry-pick them in the right order. But apparently this does not fix the issue. I am not sure about it.

FernetMenta commented 11 years ago

@StrangeNoises Thanks for the samples. I had a brief look into it while packing things together, will be traveling tomorrow and Wednesday. It does not even work for software decoding. Looks like the standard is not interpreted correctly.

@fritsch Time for you to get out of bed, you have to solve a problem :)

fritsch commented 11 years ago

@FernetMenta: Hehe, just thought to keep it to the ffmpeg people. If you try it in VLC you get a funny error:

[vc1 @ 0x7f979cc20860] Interlaced frames/fields support is incomplete
[vc1 @ 0x7f979cc20860] concealing 0 DC, 0 AC, 0 MV errors
[vc1 @ 0x7f979cc20860] concealing 0 DC, 0 AC, 0 MV errors
[vc1 @ 0x7f979cc20860] concealing 0 DC, 0 AC, 0 MV errors

followed by coredump.

So I think the support in ffmpeg is not there yet. http://ffmpeg.org/trac/ffmpeg/ticket/275 - But some patches are on the way, not complete yet.

Edit: Same with mplayer and totem - core dumped. I rather stay in bed. xbmc at least does not crash, that is fine.

StrangeNoises commented 11 years ago

(i did say that; i wasn't looking for a vc1-interlaced fix here; it's just a bit of a digression...) :-)

fritsch commented 11 years ago

@StrangeNoises: Yeah, that is clear to me - but I was looking for something to play with. There is not much left that can be improved in the XVBA code ... as new specs are missing and fernetmenta is much too fast at solving bugs.

FernetMenta commented 11 years ago

I think the problem with vc-1 is, that fields are encoded as pictures (true interlaced). Hence we need to divide surface height by 2 and don't expect both fields in a pic/frame.

FernetMenta commented 11 years ago

Apart from the vc1 sample all others play with temporal/spatial on my ID80. This one 1080i-50-raw-dvbs-001 has a clean refresh rate set and pullup correction is able to detect fps. For the others you need to set fpsdetect to 2 (advanced settings) to get fps detected for patterns > 1. But this step was not required on my system to have them play smoothly.

StrangeNoises commented 11 years ago

about to disappear out for the evening, but before i go: reminder you were also going to test it on your gt218 system. I am prepared to pay for a GT520 (should be equiv to what's in the ID80) for my 'big box' and bring that downstairs - as a cheaper short-term solution than buying an id80 - but given the disparity in the GT218 qvdpautest results (above) just want to make sure there's not something else going on that's degrading the performance of my gfx cards below what we'd expect (in which case a GT520 probably wouldn't help).

StrangeNoises commented 11 years ago

(for instance i tend to default to installing 64-bit linux on machines that are capable of it - but maybe 32-bit is better?)

FernetMenta commented 11 years ago

I thought 32bit is history. At least I have given up installing/testing 32bit Linux.

StrangeNoises commented 11 years ago

ok, we can eliminate that then. :-) i was just wondering given xbmcbuntu images are 32-bit iirc. and we are dealing with a closed binary blob of code. but if we're all on 64-bit and you get better playback than i do on a machine with the same gfx (gt218) (and otherwise in fact much slower machine than mine) then there's something going on...

FernetMenta commented 11 years ago

ION2 is close to its limits playing 1080i50 temporal/spatial. It won't cope with 1080i@60. But sure, I will do the tests and I owe you a sample of 1080i@50 with fast moving scenes.