Closed GoogleCodeExporter closed 8 years ago
I haven't found the reason for the difference. But after some performance
tuning, even the 10-bit 1080p video muxed with the unsorted versions above
plays smoothly on my system. Here's a *preview* version. I'll try to make a new
version this weekend (not sure if I can).
Original comment by YuZhuoHu...@gmail.com
on 28 Oct 2011 at 3:20
Attachments:
Original comment by YuZhuoHu...@gmail.com
on 28 Oct 2011 at 3:36
Unfortunately, the xy_vsfilter_test_20111028 preview version doesn't seem to
help at all with the unsorted script problem.
The MKVs with the sorted scripts play smoothly in madVR, with minimal
slow-down.
The MKVs with the unsorted scripts have 100+ dropped frames in madVR, and are
unplayable.
Another thing I forgot to mention is that this problem is much more noticeable
with madVR. With VMR9 it still happens, but for whatever reason VMR9
sometimes renders at a lower frame rate, in slow motion, rather than dropping
frames.
It's possible this is an AMD Athlon 64 architecture-specific problem, if you
can't reproduce it on your Intel (I'll test on my secondary Intel i5 system
later). There was another AMD-only slowdown issue like this with VSFilter 2.39
in the past, involving multiple lines, blur, and be. Nobody ever figured out
what the problem code was, but somehow enabling all compiler optimizations
fixed it. This seems completely unrelated to that past issue, but it's very
possible there is still some lingering code in VSFilter that runs great on
Intel and horribly on AMD.
Maybe I should upload the original MKV samples. What file hosting sites are
good for you in China?
Original comment by cyber.sp...@gmail.com
on 28 Oct 2011 at 7:03
The samples attached to Issue 37 crash with xy_vsfilter_test_20111028.
For example,
http://xy-vsfilter.googlecode.com/issues/attachment?aid=370007001&name=MH-01-OP.mkv&token=288e09a1b5f35204a17c0fbc5c9f52f4
crashes after 22 seconds.
Original comment by cyber.sp...@gmail.com
on 28 Oct 2011 at 7:17
Hotmail's SkyDrive is the only one I know that probably both you and I can use.
Original comment by YuZhuoHu...@gmail.com
on 29 Oct 2011 at 12:03
Try the "8x8 fast" sub-pixel positioning option and see if it helps with
performance.
Original comment by YuZhuoHu...@gmail.com
on 29 Oct 2011 at 12:30
I emailed you a SkyDrive link.
Below are some benchmarks with AVSMeter using DirectShowSource. While not
representative of real playback fps because of the overhead, they do give you a
nice picture of how severe the slowdown is.
CCS OP2 Sorted   8x8fast  200 frames | min fps 7.11 | avg fps 13.53 |
CCS OP2 Unsorted 8x8fast  200 frames | min fps 0.69 (~10x slower) | avg fps 1.35 (~10x slower) |
CCS OP2 Sorted   8x8      200 frames | min fps 7.15 | avg fps 13.54 |
CCS OP2 Unsorted 8x8      200 frames | min fps 0.69 (~10x slower) | avg fps 1.45 (~10x slower) |
DiVB    Sorted   8x8fast 1000 frames | min fps 7.07 | avg fps 16.24 |
DiVB    Unsorted 8x8fast 1000 frames | min fps 1.55 (~4.5x slower) | avg fps 9.20 (~1.75x slower) |
DiVB    Sorted   8x8     1000 frames | min fps 7.11 | avg fps 16.30 |
DiVB    Unsorted 8x8     1000 frames | min fps 1.54 (~4.5x slower) | avg fps 9.24 (~1.75x slower) |
The "8x8 fast" option seems to make no difference in performance compared to
"8x8". If anything, the normal "8x8" may be ever so slightly faster than "8x8
fast".
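The slowdown factors quoted above are simply the sorted fps divided by the
unsorted fps. A small Python sketch that recomputes them from the figures in
this comment (the dictionary layout and the function names are only for
illustration, not part of AVSMeter):

```python
# (min fps, avg fps) pairs copied from the AVSMeter results above.
RESULTS = {
    ("CCS OP2", "8x8fast"): {"sorted": (7.11, 13.53), "unsorted": (0.69, 1.35)},
    ("CCS OP2", "8x8"): {"sorted": (7.15, 13.54), "unsorted": (0.69, 1.45)},
    ("DiVB", "8x8fast"): {"sorted": (7.07, 16.24), "unsorted": (1.55, 9.20)},
    ("DiVB", "8x8"): {"sorted": (7.11, 16.30), "unsorted": (1.54, 9.24)},
}


def slowdown(sorted_fps, unsorted_fps):
    """How many times slower the unsorted script is than the sorted one."""
    return sorted_fps / unsorted_fps


def summarize(results):
    """Return (sample, mode, min-fps slowdown, avg-fps slowdown) rows."""
    rows = []
    for (sample, mode), fps in results.items():
        min_ratio = slowdown(fps["sorted"][0], fps["unsorted"][0])
        avg_ratio = slowdown(fps["sorted"][1], fps["unsorted"][1])
        rows.append((sample, mode, min_ratio, avg_ratio))
    return rows


for sample, mode, min_ratio, avg_ratio in summarize(RESULTS):
    print(f"{sample:8s} {mode:8s} min {min_ratio:5.2f}x  avg {avg_ratio:5.2f}x")
```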
Original comment by cyber.sp...@gmail.com
on 29 Oct 2011 at 2:34
Try xy_vsfilter_test_20111030.7z.
The crash in the preview version is fixed too.
As for this issue, the slowdown from the sorted versions to the unsorted
versions is caused by a bug in the script parser. Unsorted versions trigger the
bug, and the consequence is just like duplicating many lines in the sorted
versions. Combine the scripts with any video, not necessarily 1080p, and watch
the OSD information while playing: there is a huge difference between the
sorted and unsorted versions in Cache LV1 query_count, which corresponds to the
number of alphablending operations done. To get the OSD information, go to
properties->misc and check "Show OSD statistics".
The difference between the smooth playback I saw and the unplayable result you
got with the preview version may be related to CPU architecture.
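For anyone who wants to produce a sorted version of a script for this kind of
comparison, here is a minimal Python sketch. It is not part of xy-VSFilter, and
it assumes the usual ASS layout in which Start is the second comma-separated
field of each Dialogue line, with timestamps written as H:MM:SS.cc. The
function names are made up for illustration.

```python
import re

# ASS timestamps look like 0:01:02.50 (hours:minutes:seconds.centiseconds).
TIME_RE = re.compile(r"^(\d+):(\d{2}):(\d{2})\.(\d{2})$")


def parse_ass_time(text):
    """Convert an ASS timestamp to centiseconds."""
    match = TIME_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not an ASS timestamp: {text!r}")
    hours, minutes, seconds, centis = map(int, match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 100 + centis


def start_time(line):
    """Start time of a 'Dialogue: Layer,Start,End,...' line."""
    fields = line.split(":", 1)[1].split(",", 2)
    return parse_ass_time(fields[1])


def sort_events(lines):
    """Sort Dialogue lines by start time; every other line stays in place.

    sorted() is stable, so events with the same start time keep their
    original relative order.
    """
    slots = [i for i, line in enumerate(lines) if line.startswith("Dialogue:")]
    events = sorted((lines[i] for i in slots), key=start_time)
    result = list(lines)
    for slot, event in zip(slots, events):
        result[slot] = event
    return result
```

Read the script with `open(path, encoding="utf-8-sig").read().splitlines()`,
pass the list to `sort_events`, and write the result back out to a new file.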
Original comment by YuZhuoHu...@gmail.com
on 30 Oct 2011 at 1:44
There is still a measurable slowdown with the CCS OP2 unsorted sample using
xy_vsfilter_test_20111030. The good news is your speed-up in that build appears
to have completely compensated for the smaller DiVB unsorted slowdown
(benchmark results were near-identical).
CCS OP2 Sorted   8x8_normal 1940 frames | min fps 44.21 | avg fps 52.33 |
CCS OP2 Sorted   8x8_fast   1940 frames | min fps 42.56 | avg fps 51.28 |
CCS OP2 Unsorted 8x8_normal 1940 frames | min fps 30.53 (~1.45x slower) | avg fps 37.39 (~1.4x slower) |
CCS OP2 Unsorted 8x8_fast   1940 frames | min fps 29.45 (~1.5x slower) | avg fps 36.86 (~1.42x slower) |
Both samples are very playable now, but the remaining slowdown is a bit of a
mystery. At least with the CCS OP2 sample, the normal 8x8 subpixel positioning
continues to be slightly faster than 8x8fast...
Original comment by cyber.sp...@gmail.com
on 30 Oct 2011 at 1:49
I forgot to say that the 8x8_fast option uses an additional cache whose info is
not yet shown in the OSD. The cache sits between Cache LV1 and the subsequent
alphablending operation. With the 8x8_fast option, Cache LV1's query_count no
longer equals the number of alphablending operations.
Also, the bilinear interpolation that 8x8_fast uses is not yet SSE2-optimized.
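To illustrate what bilinear interpolation means for sub-pixel positioning, here
is a plain Python sketch that shifts an alpha mask by a fraction of a pixel. It
is not xy-VSFilter's code: it assumes that "8x8" means offsets in steps of 1/8
pixel on each axis, and the function name and list-of-lists mask format are
made up for illustration.

```python
def shift_bilinear(mask, dx8, dy8):
    """Shift a 2D alpha mask right/down by (dx8/8, dy8/8) of a pixel.

    mask is a list of rows of numbers; dx8 and dy8 are integers in 0..7.
    The result is one pixel wider and taller so no coverage is cut off.
    """
    if not (0 <= dx8 < 8 and 0 <= dy8 < 8):
        raise ValueError("sub-pixel offsets must be in the range 0..7")
    fx, fy = dx8 / 8.0, dy8 / 8.0
    height = len(mask)
    width = len(mask[0]) if height else 0

    def pixel(x, y):
        # Everything outside the source mask counts as fully transparent.
        if 0 <= x < width and 0 <= y < height:
            return mask[y][x]
        return 0

    out = [[0.0] * (width + 1) for _ in range(height + 1)]
    for y in range(height + 1):
        for x in range(width + 1):
            # Each output pixel blends the four source pixels it overlaps.
            out[y][x] = (
                (1 - fx) * (1 - fy) * pixel(x, y)
                + fx * (1 - fy) * pixel(x - 1, y)
                + (1 - fx) * fy * pixel(x, y - 1)
                + fx * fy * pixel(x - 1, y - 1)
            )
    return out
```

A scalar loop like this does four multiplies per output pixel, which is the
kind of work that SIMD instructions such as SSE2 can process several pixels at
a time.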
Original comment by YuZhuoHu...@gmail.com
on 30 Oct 2011 at 2:20
Original comment by cyber.sp...@gmail.com
on 16 Dec 2011 at 7:47
Original issue reported on code.google.com by
cyber.sp...@gmail.com
on 19 Oct 2011 at 9:31
Attachments: