Closed GoogleCodeExporter closed 9 years ago
I have read the thread, and the idea of xy-VSFilter interacting with madVR is
great. I'm really looking forward to it, but I'll need some time to
investigate.
Original comment by YuZhuoHu...@gmail.com
on 31 Oct 2011 at 3:21
First you'll need to wait on madshi (madVR developer) to see if he ultimately
decides he wants to modify madVR for VSFilter or not. He seems a bit torn
between using VSFilter to suit his needs (requiring modifications of both
VSFilter and madVR), or going to a major coding effort attempting to port
Libass for use in madVR.
Also see:
http://www.cccp-project.net/forums/index.php?topic=5776.msg37106#msg37106
madVR adds a dummy output + input pin for VSFilter.
madVR sends blank frame RGB32 + Alpha to VSFilter.
VSFilter renders subtitles on the Alpha channel of the RGB32 frame.
VSFilter sends the frame with rendered subtitles to madVR.
madVR blends the blank frame with subtitles into the video frame with the GPU.
That was madshi's initial idea when he brought it up to gommorah
(threaded-vsfilter developer), and of course subject to change based on
feasibility to implement. What do you think would be the most sensible way to
render subtitles at an arbitrary resolution/framerate specified by madVR and
then output the subtitles to madVR for GPU blending into the actual video
frame? You may want to look into how the CSRI interface uses TextSub (VSFilter
Avisynth function) to render subtitles and see if you can re-purpose that
approach into a new custom interface for something like this.
Are you able to make a forum account and thread on Doom9
http://forum.doom9.org/ so people can discuss and provide feedback on
xy-VSFilter?
Original comment by cyber.sp...@gmail.com
on 1 Nov 2011 at 12:59
Currently, the idea
madVR adds dummy output => VSFilter => madVR,
will be easier for me since I don't have to do much work.
I can access forum.doom9.org, lucky!
Original comment by YuZhuoHu...@gmail.com
on 1 Nov 2011 at 3:49
The idea I had with the dummy output pin -> VSFilter -> madVR would probably
work ok, but it would require some ugly DirectShow graph rerouting logic,
eventually even needing specific Media Player support (not sure). Also
currently VSFilter does not fill the alpha channel of the rendered video frame.
So if madVR sends a "blank" video frame to VSFilter to render on, I'd have no
idea how to blend each pixel. So I'd have to ask VSFilter to render each frame
twice, once with a white and once with a black video frame. Then I could
probably calculate the alpha values used by VSFilter. As you can imagine this
would require a lot of work on my side and might potentially cost a lot of CPU
performance.
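For illustration, here is a minimal Python sketch of the two-pass trick described above. It assumes VSFilter uses plain "source over" blending (out = alpha * colour + (1 - alpha) * background) on 8-bit channels, which the thread does not confirm. The function name `recover_rgba` is made up for this example.

```python
def recover_rgba(on_black, on_white):
    """Recover (r, g, b, a) of one subtitle pixel from two renders.

    on_black / on_white are (r, g, b) tuples in 0..255, produced by blending
    the same subtitle pixel over a black and over a white background.
    Assumption: out = alpha * colour + (1 - alpha) * background.
    """
    # The white render exceeds the black one by (1 - alpha) * 255 per channel.
    diffs = [w - b for b, w in zip(on_black, on_white)]
    alpha = 1.0 - (sum(diffs) / len(diffs)) / 255.0
    if alpha <= 0.0:
        return (0, 0, 0, 0)  # fully transparent, the colour is undefined
    # The black render equals alpha * colour, so divide the alpha back out.
    colour = tuple(min(255, round(c / alpha)) for c in on_black)
    return colour + (round(alpha * 255),)
```

The cost mentioned above is visible here: every frame has to be rendered twice, and every pixel needs a division afterwards.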
It would be much nicer if xy-VSFilter could get special support for madVR. I'm
thinking of something like this:
(1) VSFilterForMadVR would have only one input pin for subtitle data and no
output pin.
(2) VSFilterForMadVR would then render the subtitles on a 32bit RGBA bitmap,
with the alpha channel properly filled.
(3) VSFilterForMadVR would then send that RGBA bitmap to madVR via a private
communication channel.
That would be relatively easy for me to implement. Not sure how difficult it
would be for you. Some things to think about:
(a) private communication might have to include madVR telling VSFilter some
things about the video (video rectangle etc) because VSFilter doesn't have the
information from the video input pin, anymore.
(b) Not sure how many frames per second VSFilter would create when it doesn't
get the "timing clock" through the video frame input pin. Maybe it would make
sense for madVR to callback VSFilter to ask for an RGBA bitmap for every
rendered video frame instead of VSFilter doing the rendering with a clock of
its own?
(c) Instead of using a private communication channel, VSFilter could also offer
an RGBA output pin. But then problem (b) remains. Furthermore I'd have to add
support for madVR to accept a secondary video input pin with RGBA for blending.
I'm planning to do that at some point in the future, anyway, but it'd be a lot
of work, so it won't come soon. The easier and quicker way for me would be to
have a private communication channel.
I'm open for any other alternative ideas or any wishes you might have for
custom madVR interfaces.
Thoughts?
P.S: One small improvement you could make which would make my original idea
easier to implement would be if VSFilter would make sure that if it gets an
RGBA video frame to render on, that the alpha channel is properly filled with
the subtitle alpha. That way madVR wouldn't have to do weird tricks to figure
out how to blend the subtitles.
Original comment by mad...@gmail.com
on 2 Nov 2011 at 10:59
First
Quote:Maybe it would make sense for madVR to callback VSFilter to ask for an
RGBA bitmap for every rendered video frame instead of VSFilter doing the
rendering with a clock of its own?
Agree.
I think the best option would be VSFilterForMadVR: having only one input pin,
no output pin, and providing a callback for madVR. Although it is not the
easiest for me.
Original comment by YuZhuoHu...@gmail.com
on 2 Nov 2011 at 12:31
Re Comment 4:
I guess maybe you want to ask (via a callback) for a sequence of bitmaps,
instead of one bitmap, for every rendered video frame.
The sequence will look like this:
{(dst_rect_1, argb_data_1),(dst_rect_2, argb_data_2),(dst_rect_3, argb_data_3),...,(dst_rect_n, argb_data_n)} (the sequence may get very long, but an upper bound can be set)
And the bitmaps will be alphablended onto the video frame one by one.
I'm not sure if doing that on the GPU causes much trouble for you, but the
benefit is that madVR can easily get support from other subtitle renderers via
the same callback, e.g. libass, which outputs a similar format.
VSFilter currently has to do the above alpha blending on the CPU to generate a
final subpic. That's one of the reasons for its slowness. With such an
interface, I could hand some (or all) of the alpha blending jobs to madVR while
leaving some (or none) on the CPU, depending on the script.
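As a rough sketch of what "alphablended onto the video frame one by one" means for such a sequence, the following Python example blends a list of (dst_rect, argb_data) items in order. The data layout (rows of tuples, straight non-premultiplied alpha) and the name `blend_sequence` are assumptions made for this example only.

```python
def blend_sequence(frame, bitmaps):
    """Alpha-blend a list of (dst_rect, argb_data) items onto frame, in order.

    frame     : list of rows, each row a list of [r, g, b] lists (0..255)
    dst_rect  : (x, y, width, height) in frame coordinates
    argb_data : list of rows of (a, r, g, b) tuples, straight alpha
    """
    for (x0, y0, w, h), argb in bitmaps:
        for dy in range(h):
            for dx in range(w):
                a, r, g, b = argb[dy][dx]
                dst = frame[y0 + dy][x0 + dx]
                for i, src in enumerate((r, g, b)):
                    # Integer "source over" with rounding:
                    # out = (src * a + dst * (255 - a)) / 255
                    dst[i] = (src * a + dst[i] * (255 - a) + 127) // 255
    return frame
```

A video renderer would run the same per-pixel formula on the GPU. The order of the list matters, because later items are drawn on top of earlier ones.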
Original comment by YuZhuoHu...@gmail.com
on 2 Nov 2011 at 4:01
That sequence of bitmaps, is that the internal format VSFilter is working with?
I could support it, I guess, but I'm not sure if it's the best solution to
upload all those bitmaps separately to the GPU, from a performance point of
view. How many bitmaps does a sequence typically have for one video frame? It's
not one bitmap per character, is it?
Original comment by mad...@gmail.com
on 3 Nov 2011 at 6:00
Original comment by cyber.sp...@gmail.com
on 10 Nov 2011 at 11:35
You mean for every change in the script VSFilter generates a whole new bitmap?
Sounds scary. But I don't believe it creates one for every character. Probably
it creates one bitmap per element. So if you have one sentence - it's one
bitmap, but if you have custom rotation applied to every character in that
sentence or every character has a different color - these will be different
bitmaps. Is that so?
<< Attention, below is a pure guesswork. >>
So *if* the answer to my question above is *yes*, I think it won't be too slow
for madVR. It will be around 5 bitmaps per sequence? It should be possible to
limit bitmap uploading when heavily styled parts are detected, such as
openings/endings or insert songs, where we can have hundreds of bitmaps =)) And
as for blending these bitmaps on the GPU, that should be blazing fast.
Original comment by yakits...@gmail.com
on 16 Nov 2011 at 12:50
==================================================
Really sorry for my sudden break on this discussion last time. Though I cannot
truly return to develop this project yet, I'll continue this discussion.
==================================================
==============
Quote:
madshi:
That sequence of bitmaps, is that the internal format VSFilter is working with?
==============
Not exactly. The internal format VSFilter is working with is more like a sequence
of single color bitmaps with alpha mask:
{(dst_rect_1, single_rgb_1, alpha_channel_data_1),(dst_rect_2, single_rgb_2, alpha_channel_data_2),(dst_rect_3, single_rgb_3, alpha_channel_data_3),...,(dst_rect_n, single_rgb_n, alpha_channel_data_n)}
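To show how this internal format relates to the (dst_rect, argb_data) format proposed earlier, here is a small Python sketch that expands one single-colour item into per-pixel ARGB data. The function name and the list-of-rows layout are assumptions for illustration, not VSFilter's actual data structures.

```python
def expand_to_argb(single_rgb, alpha_mask):
    """Turn one (single_rgb, alpha_channel_data) pair into ARGB pixel rows.

    single_rgb : (r, g, b) shared by every pixel of the bitmap
    alpha_mask : list of rows of alpha values (0..255), one per pixel
    """
    r, g, b = single_rgb
    return [[(a, r, g, b) for a in row] for row in alpha_mask]
```

The single-colour form needs one byte per pixel plus one colour, while the expanded form needs four bytes per pixel.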
==============
Quote:
madshi:
How many bitmaps does a sequence typically have for one video frame? It's not one bitmap per character, is it?
Quote:
yakitsume:
Probably it creates one bitmap per element. So if you have one sentence - it's one bitmap, but if you have custom rotation applied to every character in that sentence or every character has a different color - these will be different bitmaps. Is that so?
==============
For most simple scenes, there will be 3 single color bitmaps for one video
frame: one shadow bitmap, one outline bitmap and one body bitmap. But in the
worst case, e.g. a complex opening/ending script in which every character/word
has a different style, there will be up to 3 bitmaps per character/word.
==============
Quote:
yakitsume:
You mean for every change in the script VSFilter generates a whole new bitmap? Sounds scary.
==============
Yes, to a certain degree. Not all things need to be re-generated with a decent
cache system.
To create a subpic, VSFilter does 3 jobs:
1) parse the corresponding script.
2) create a sequence of single color bitmaps.
3) alpha blend the above sequence to create a subpic.
Currently, in xy-VSFilter, steps 1) and 2) have cache support, so they don't
need to be fully re-executed while animating. But step 3) does not have any
cache support yet. All alpha blend operations are redone to create a new subpic
(no matter how small the difference between the new subpic and the previously
created one).
==================================================
I've assumed:
1) It's not efficient to upload a lot of small bitmaps to the GPU. (So the length of that bitmap sequence should be limited.)
2) The alpha blend operation is associative, i.e. if we denote alpha blending bitmap sub1 onto bitmap frame1 as
frame1 AB sub1,
then
(frame1 AB sub1) AB sub2 = frame1 AB (sub1 AB sub2).
I feel that this can be done (efficiently) only if I use pre-multiplied alpha.
(That way the length of the bitmap sequence can be limited, because small items
can be combined on the CPU first.)
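The associativity assumption can be checked with a short Python sketch. It uses the standard "over" operator on pre-multiplied (r, g, b, a) pixels with float channels in 0..1. The function name `over` and the sample pixel values are chosen for this example.

```python
def over(top, bottom):
    """'Over' operator for pre-multiplied (r, g, b, a) pixels, floats in 0..1."""
    k = 1.0 - top[3]
    return tuple(t + k * b for t, b in zip(top, bottom))


frame1 = (0.5, 0.25, 1.0, 1.0)   # opaque video pixel
sub1 = (0.5, 0.0, 0.0, 0.5)      # red at 50% alpha, pre-multiplied
sub2 = (0.0, 0.25, 0.0, 0.25)    # green at 25% alpha, pre-multiplied

# (frame1 AB sub1) AB sub2: blend onto the frame one by one
one_by_one = over(sub2, over(sub1, frame1))
# frame1 AB (sub1 AB sub2): combine the subtitles first, then blend once
combined_first = over(over(sub2, sub1), frame1)
```

Both results come out as (0.5625, 0.34375, 0.375, 1.0), so combining sub1 and sub2 on the CPU first does not change the final picture. With straight (non-premultiplied) colours, combining two bitmaps needs an extra division by the resulting alpha, which is why pre-multiplied alpha is the convenient choice here.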
==================================================
More details on the interface:
MadVR doesn't have to ask for subpic data for every rendered video frame. A begin time and an end time can be packed together with the subpic data. They denote the time span during which the subpic should be shown. During that time span, madVR can reuse the subpic data. For most simple scenes, the time span will be several seconds. For animated scenes, the end time will be set to begin time + 1, which means that madVR has to ask for new subpic data for the next video frame. There's a problem though: the subpic data may be invalidated, e.g. when the user modifies the script file during playback, or when new segment data sent through the subtitle pin causes an invalidation. To solve it, an extra callback like IsSubpicInvalidated will be needed. It takes the current video frame reference time and a subpic created time as input, and simply returns true/false, telling the caller whether the subpic created at that created time for that reference time has been invalidated.
In conclusion, a begin time, an end time and a created time can be packed together with the subpic data. MadVR keeps a record of the subpic data internally. For every video frame, madVR first checks whether it can reuse the subpic data (i.e. whether the video frame reference time falls between the begin time and the end time of the subpic data, and whether the subpic data has not been invalidated). If not, it asks for new subpic data.
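The reuse logic described above could look roughly like the following Python sketch. Everything except the idea of begin/end/created times and an IsSubpicInvalidated-style callback is invented for illustration: the class names, the fake renderer with fixed 1000-unit subtitle lines, and the logical clock used for the created time.

```python
from dataclasses import dataclass


@dataclass
class SubPic:
    begin: int    # first reference time the subpic is valid for
    end: int      # first reference time it is no longer valid for
    created: int  # when the subtitle renderer produced it
    data: object  # bitmap payload, opaque in this sketch


class FakeSubtitleRenderer:
    """Hypothetical stand-in for the subtitle renderer side of the callbacks."""

    def __init__(self):
        self.clock = 0
        self.invalidated_at = -1
        self.renders = 0

    def request_subpic(self, ref_time):
        self.renders += 1
        self.clock += 1
        begin = ref_time - ref_time % 1000  # pretend each line lasts 1000 units
        return SubPic(begin, begin + 1000, self.clock, f"subpic@{begin}")

    def is_subpic_invalidated(self, ref_time, created):
        return created <= self.invalidated_at

    def invalidate(self):
        # e.g. the user edited the script file during playback
        self.invalidated_at = self.clock


class VideoRendererCache:
    """Video renderer side: keeps the last subpic and reuses it when allowed."""

    def __init__(self, renderer):
        self.renderer = renderer
        self.current = None

    def subpic_for(self, ref_time):
        pic = self.current
        reusable = (
            pic is not None
            and pic.begin <= ref_time < pic.end
            and not self.renderer.is_subpic_invalidated(ref_time, pic.created)
        )
        if not reusable:
            self.current = self.renderer.request_subpic(ref_time)
        return self.current
```

With this shape, a static subtitle line costs one request for its whole duration, while an animated line (end = begin + 1) falls through to `request_subpic` on every frame.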
==================================================
Original comment by YuZhuoHu...@gmail.com
on 16 Nov 2011 at 7:20
Ok, I can see we need to decide on 2 things:
(A) Who should do the alpha blending? xy-vsfilter = CPU? Or madVR = GPU?
(B) Should madVR ask for a specific frame? Or should xy-vsfilter deliver a
frame with a start/stop time?
Let me discuss (A) first: I'm not sure myself whether letting the GPU do all
the alpha blending work would cost much GPU performance. Maybe not. It might
not be much of a problem at all. So my thinking is that we should design an
interface which allows xy-vsfilter to send a series of RGBA bitmaps to madVR
together with coordinates, and let madVR do all the alpha blending work. In the
end this interface would still allow us to go both ways. You could still do the
alpha blending in xy-vsfilter and send only one big subpic. The interface would
not stop you from doing that. You could even add an option to xy-vsfilter to
let the user decide where to do the alpha blending work (CPU/GPU). If we find
out that the GPU performance does suffer, users with a weak GPU but strong CPU
might prefer to let xy-vsfilter do the work. While users with a strong GPU but
weaker CPU would probably prefer to let madVR do the work. Or maybe we find out
that it's not a problem at all even for the weakest GPUs, then we don't even
need an option.
Now about (B): I'm wondering whether your idea would work well. I'm not
intimately familiar with all the various ASS commands, but there's probably
some kind of "fade out" command, I would guess? Let's say in a movie the ASS
script asks you to fade out the subtitle over the time of 1 second. With a
24fps movie, this would result in you creating 24 different subtitle pictures
(identical bitmap data, but changing alpha channel). With a 60fps movie, you
would even create 60 of them. So basically I guess that with some of the
animation/fade effects, the number of different subpics probably depends
directly on the movie frame rate, doesn't it? Because of that I think it might
be better if madVR asks for the subpic for a specific video frame instead of
the other way round. *However*, I would strongly suggest that we allow
xy-vsfilter to say "reuse last frame's subpic" instead of sending new
data. That should save quite a bit of performance. What do you think?
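A small Python sketch of the frame-rate argument: it counts how many distinct subpics a one-second linear fade produces when the video renderer asks once per video frame, and how a static subtitle collapses to a single subpic through "reuse last frame's subpic". The linear fade and both function names are assumptions for illustration, not actual ASS fade semantics.

```python
def fade_alpha(t, start, duration):
    """Linear fade-out: 255 at start, 0 at start + duration (seconds)."""
    progress = min(max((t - start) / duration, 0.0), 1.0)
    return round(255 * (1.0 - progress))


def subpics_needed(fps, seconds, alpha_at):
    """Count frames whose subpic differs from the previous frame's subpic."""
    new_subpics, last = 0, None
    for n in range(int(fps * seconds)):
        alpha = alpha_at(n / fps)
        if alpha != last:  # otherwise: "reuse last frame's subpic"
            new_subpics += 1
            last = alpha
    return new_subpics


fade_24 = subpics_needed(24, 1, lambda t: fade_alpha(t, 0.0, 1.0))
fade_60 = subpics_needed(60, 1, lambda t: fade_alpha(t, 0.0, 1.0))
static_24 = subpics_needed(24, 1, lambda t: 255)
```

Under these assumptions the fade needs 24 subpics at 24fps and 60 at 60fps, while the static case needs only one.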
There's a new thing (C) we need to discuss, too:
Currently due to the way vsfilter works, the subtitles are usually drawn on the
video before aspect ratio correction and before upscaling. With SD content that
means that subtitles have to be upscaled quite a lot, making them rather blurry
looking. Using a private communication channel between xy-vsfilter and madVR
would allow us to render the subtitles in the final output resolution, which
should improve subtitle rendering quality *A LOT* for SD content. I've been
told that this might be problematic in terms of aspect ratio and 3D rotations
etc. But I think this should all be solvable somehow if we take it into account
right from the start. What do you think?
Finally, maybe we should ask JanWillems to join this discussion? He's the one
working on the MPC-HC renderers. He might be interested in this, as well.
Original comment by mad...@gmail.com
on 18 Nov 2011 at 9:19
I fully agree with all the suggestions.
==========
Quote:
I've been told that this might be problematic in terms of aspect ratio and 3D
rotations etc.
==========
I don't see any possible issue on my side, as long as all I have to do is
generate subpic data at a specified size for a specific video frame. But if
there actually is one, please let me know.
I'd like to have JanWillems join this discussion too if he is interested in it.
Original comment by YuZhuoHu...@gmail.com
on 18 Nov 2011 at 1:09
I've just been notified of this thread. I'll happily share my thoughts about
subtitle rendering. Let me introduce myself first.
I'm JanWillem32, currently involved with developing the internal set of
renderers for the MPC-HC project. I've been involved with fonts/vector
graphics, artwork, texturing, modeling, coding and stuff for 3D game and
raytracing for years now as a hobby. About two years ago, I specialized in
all-round DirectX 10 rendering. Last year December, I was "recruited" for the
MPC-HC project, after I complained about messed-up rendering stages for the
internal renderer. After that, I've also been working on the subtitle renderer
parts. That has led me here, it seems.
I'm indeed very interested in making changes to the original or making a
new(ish) subtitle renderer. I'm very dissatisfied with the current one. The
code is really old, so many parts are outdated. I'm actually more dissatisfied
myself with the rendering quality and texturing techniques than with
performance, which seems to be the main deal with this project.
From the early days of rendering techniques, many font and image renderers used
about the same method as "we" use now for the subtitle renderer: ready an empty
bitmap to paint on, fill it line-by-line by a renderer or copy from another
texture, ready the GDI or DirectDraw and make it paint the stuff on screen.
A trivial matter was to get the video adapter to blend such things and present
that on screen. Even a decade later, when done right, alpha blending
techniques, from complex spatial additive alpha blending with lighting
techniques on particle sprites, to the usual blending of icons on the Windows
desktop with Aero enabled, are really light on the GPU. (I hope that answers
the question as to what device should add the subtitles to the video render
product.)
The next item was the amount of textures, I believe? For the DirectX 8.1
project it was decided that it would probably be best to limit the amount of
textures in the pool to no more than 250 at a time, and put the smaller ones
together on 256×256 pixel textures (as far as I can remember from back then).
For modern rendering, we just make textures and use them. Paying attention to
the amount of available video and system memory is important, but we can simply
cache less ahead of time when memory runs low.
Then we come to the point of texture management. In rendering we often re-use
the same texture over and over again (transforming it a bit for each target is
easy, too). Leaving instancing techniques aside for now, textures are
loaded with reference counting, associated per object to be rendered (in the
near future, as we pre-cache data). For subtitles, a similar method is easy to
implement;
subpic 1: {ABC}, the textures A, B and C hold 1 reference and the subpic object
is sent to the video renderer to be drawn on screen when called
subpic 2: {BCD}, the textures B and C get AddRef() called on them, their
reference count is now 2, D is new, so it holds one reference and the subpic
object is sent to the video renderer to be drawn on screen when called
... and so on...
When a subpic invalidates, meaning that its end time is below the
current time of the video renderer, a check is made to make sure the
next subpic is ready or at least had the chance to call AddRef() on textures,
and then Release() is called on all textures of the subpic to be invalidated. In the
case of subpic 1, that would change the reference count of texture A to 0
(deleting it), B and C will keep one reference by subpic 2 and are preserved.
This system would spare the PCI bus from having to constantly transfer huge
textures to video memory. As a good start, updating the library for the text
interpreters would be a good thing. These currently signal the subtitle
renderer that the content is constantly animated, lowering the lifespan of
text-based subtitles to 100 ns each (ISubPic, IsAnimated()).
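The reference-counting scheme with subpic 1 {ABC} and subpic 2 {BCD} can be simulated in a few lines of Python. The `TexturePool` class is purely illustrative. A real renderer would rely on the COM AddRef()/Release() calls of the texture objects themselves.

```python
class TexturePool:
    """Toy model of reference-counted textures shared between subpics."""

    def __init__(self):
        self.refs = {}

    def acquire(self, names):
        # Corresponds to creating a texture or calling AddRef() on an existing one
        for name in names:
            self.refs[name] = self.refs.get(name, 0) + 1

    def release(self, names):
        # Corresponds to Release(); a texture is deleted when its count hits 0
        for name in names:
            self.refs[name] -= 1
            if self.refs[name] == 0:
                del self.refs[name]


pool = TexturePool()
pool.acquire("ABC")                # subpic 1 is created
pool.acquire("BCD")                # subpic 2 is readied before subpic 1 expires
after_both = dict(pool.refs)
pool.release("ABC")                # subpic 1 invalidates
after_release = dict(pool.refs)
```

After the release, only B, C and D remain with one reference each, so only texture D ever had to be uploaded for subpic 2.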
Before I start on ranting about how very much I dislike image renderers with
not-even-a-lowly-8-bit-integer color precision, no gamma and colorspace
correctness, and integer coordinate systems, I'd like to know if the people
over here are on the same wavelength as me. The only way I'm going to invest
time in the subtitle renderer, is when no compatibility is required with the
older versions, fundamental changes are made to the rendering techniques and
the code is kept under GPL. I'm not interested in merely rendering speed gains,
I've already refused to work with some previous "developers" for that reason.
I'll happily share the little bit of edited code I already have, defend my
arguments about why things should change and discuss things further here. We
may even get to the point of trying out the DirectX 11 font/vector renderer in
due time.
I've already changed the workings of the subtitle renderer to pass textures to
blend to the internal video renderer I'm working on in my tester builds, it
should be a trivial matter to copy the same code to the MadVR
allocator-presenter. (Using AlphaBlt() to do it, is just messy.) I can offer
that much for now, at least.
Original comment by janwille...@hotmail.com
on 19 Nov 2011 at 12:40
"The only way I'm going to invest time in the subtitle renderer, is when no
compatibility is required with the older versions, fundamental changes are made
to the rendering techniques and the code is kept under GPL. I'm not interested
in merely rendering speed gains, I've already refused to work with some
previous "developers" for that reason."
@JanWillem32
Can you make clear what you consider 'compatibility with older versions'? Are
you talking about VSFilter.dll, the subtitle rendering in MPC-HC, or both? We
have no desire to maintain any backwards compatibility with the MPC-HC internal
subtitle renderer if that's what you are referring to. The plan was to start from
scratch and code an entirely new/improved/higher-quality subtitle interface
between VSFilter.dll and video renderers which choose to support it. The goal
was to implement this new interface within VSFilter.dll as an external filter,
which is a bit different than the work you've been doing in MPC-HC. Is this a
problem?
For VSFilter.dll, on the other hand, there is a limit to how much you can change
how ssa/ass scripts are rendered without breaking millions of existing subtitle
scripts. Bitmap based subtitles like VOB and PGS can be improved and changed to
your heart's content. Similar to Libass, quality improvements are welcome, but
significant changes from how VSFilter behaves should either be avoided or
VSFilter compatibility toggles added whenever possible.
The coder of xy-VSFilter (I'm not a coder, only assisting with support &
management for xy-VSFilter) would only be focusing on implementing things under
the control of xy-VSFilter. It would be you and madshi doing all coding related
to handling of the subtitles after they are passed off to your video renderers.
How and in what way xy-VSFilter interacts with the video renderers would be a
joint collaboration effort. To a certain extent you would have free rein to do
whatever you want after subtitles are out of the hands of xy-VSFilter, though
it would be nice to have standardized behavior.
I'm just throwing this out there before the main coder replies, to clear up any
confusion. Up to this point the xy-VSFilter project has only involved revamping
the external VSFilter dll, while the work you've been doing has only involved
revamping the MPC-HC internal subtitle and video renderers. What form this new
implementation takes (extend xy-VSFilter, replace the MPC-HC internal subtitle
renderer, or both) is still up for debate, but up to this point the coder of
xy-VSFilter has completely ignored the MPC-HC implementation.
Last but not least, it will still likely be a few months before any work on
this gets underway. First things first, the rest of the planned features need
to be implemented, xy-VSFilter (which is based on VSFilter 2.39) brought up to
date with any important changes in 2.40, fix all the known issues, and then
release a stable version.
Original comment by cyber.sp...@gmail.com
on 19 Nov 2011 at 6:18
Thanks for such an elaborate answer.
A new and improved set of interfaces from a subtitle renderer is indeed what
I'm looking for. The current interfaces and outputs don't properly serve any
decent video renderer at all. I mostly wanted to point out that I'm not willing
to work on a subtitle renderer that can paint on the raw video surfaces
directly.
I'm all for staying true to the ssa/ass script standards. (Although I would
love to see improvements for the specifications, but that's not my job.)
I'm annoyed by the many basic flaws in the subtitle renderer.
The bitmap based subtitles are indeed a nice example of that. These are
converted from their native storage formats, to what is assumed to be the same
as the output RGB type of the video. The conversion is typically done in a
rather inefficient manner, using either a R4G4B4A4 or R8G8B8A8 DirectX texture
at the complete screen size to paint on. Using a DirectX texture type closer to
the original storage format and of the same size as in the original subtitle
stream would save a lot of processing and memory copying. During the texture
blend operation on the GPU, the original alpha and color values are transformed
on the shadercore anyway, adding two or three more dot operations to perform
color conversion to match the destination format is perfectly fine there. (A
dot operation takes only one assembly instruction on the GPU. A CPU can
actually do that too, using the SSE4 DPPS instruction, although without using
arbitrary swizzling of registers, which is often employed on the GPU's
shadercore.) The subtitle renderer only needs to specify what format the
incoming texture is in. Setting a pixel shader to convert the color format of
the buffer and then alpha blend that to the render target is easy (+possibly
resizing the texture during that process).
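To make the "two or three more dot operations" concrete, here is a Python sketch of a colour conversion written as one dot product per output channel. It assumes full-range BT.601 YCbCr (the JFIF coefficients) purely as an example. Real bitmap subtitle streams may need a different matrix or range, and the names below are made up for this sketch.

```python
def dot(a, b):
    """Dot product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))


# One row per output channel (R, G, B); assumed full-range BT.601 coefficients.
YCBCR_TO_RGB = (
    (1.0, 0.0, 1.402),
    (1.0, -0.344136, -0.714136),
    (1.0, 1.772, 0.0),
)


def ycbcr_to_rgb(y, cb, cr):
    """Convert one 8-bit YCbCr pixel to 8-bit RGB with three dot products."""
    v = (y, cb - 128, cr - 128)
    return tuple(min(255, max(0, round(dot(row, v)))) for row in YCBCR_TO_RGB)
```

In a pixel shader each row would map to a single dot instruction, executed right before the alpha blend.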
At least the bitmap based subpictures are flagged as never animated, and are
retained for several frames.
I'll nag about the flaws in the text renderer some other time...
At least it seems we'll be able to work together on a revamped subtitle
renderer. If it will serve a proper quality output to video renderers, have
proper extensibility for additional libraries and documentation for future work
on it, I'll gladly help. I don't mind if it takes a while to start, nor to
finish. I would just like things to change for the better.
Original comment by janwille...@hotmail.com
on 19 Nov 2011 at 10:18
stop hacking on vsfilter, jesus christ
you need to write something new from scratch, the vsfilter codebase is beyond
all hope of salvation
Furthermore, if you actually want people to use a new subtitle renderer, thou
shalt not break existing typesetting. You must parse and render ASS exactly the
same as VSFilter does, almost down to a bug-for-bug compatibility level.
Original comment by kalle.bl...@gmail.com
on 19 Nov 2011 at 3:04
"Furthermore, if you actually want people to use a new subtitle renderer, thou
shalt not break existing typesetting. You must parse and render ASS exactly the
same as VSFilter does, almost down to a bug-for-bug compatibility level."
TheFluff, there is no new subtitle renderer. This is the same VSFilter 2.39.x
you've always known, just faster with a few extra features tacked on. The goal
of xy-VSFilter thus far has always been bug-for-bug compatibility with zero
regressions from VSFilter 2.39 and I don't see that changing.
The goal of this Issue #40 is not to create an entirely new subtitle renderer,
but to create a replacement for the horrible ISubRender interface which has
been known to break typesetting because it's NOT bug-to-bug compatible with
VSFilter. The MPC-HC internal subtitle renderer needs to die, and implementing
a new better interface between VSFilter and video renderers is a strong step
forward towards that goal.
That said, I'd really like to see someone revive Kumaji and finalize the AS5
spec so VSFilter itself can die. Since there appears to be little interest in
either, we have to make do with what we have. Unfortunately, fansubbers have
become more daring with soft-subbing over the past few years, and we've since
seen an increased frequency of karaoke and typesetting which at times doesn't
even play in real-time on an overclocked Core i7. The need for a faster
VSFilter with bug-to-bug compatibility with VSFilter 2.39 is why xy-VSFilter
was born.
Original comment by cyber.sp...@gmail.com
on 19 Nov 2011 at 4:47
TheFluff: "you need to write something new from scratch, the vsfilter codebase
is beyond all hope of salvation"
Of course we need to, but here is the problem - there is no one who can do this.
Kumaji was a great idea but it's too far from reality. So how many years
should we wait until there is someone who can bring a brand new renderer? 5? 10?
While we are waiting for this to happen, it is a good idea to improve what we
have now instead of making ourselves suffer 10 more years.
Original comment by yakits...@gmail.com
on 19 Nov 2011 at 9:44
@Jan, there are 2 totally separate issues:
(1) Working on a subtitle renderer.
(2) Creating a new interface through which *any* subtitle renderer (e.g.
xy-vsfilter) and *any* video renderer (e.g. madVR or EVR-Custom) can exchange
data and information.
We're currently talking about (2). The original purpose of the new interface
was to allow "xy-vsfilter" and "madVR" to talk to each other through a private
interface. Something similar to "ISubRender" (which you probably know), but we
want something better than ISubRender. But then we thought it would make sense
to invite you, too, since you're the "MPC-HC EVR guy". We're hoping you would
be willing to participate in creating the new interface. And that you will add
support for it to the MPC-HC VMR/EVR renderers.
It seems that you understood our invitation to join here as a suggestion to
join the xy-vsfilter project as a developer. That was not our original
intention, but now that you mention that, I think YuZhuoHuang probably would be
glad to get help! Of course I can't speak for him, though. He's already done
both performance and quality tweaks. E.g. he added P010 (10bit) input/output
support. I'm not sure what his final target is for xy-vsfilter. Maybe you are
aiming higher than he is? Don't know, maybe the two of you should discuss this
via email or chat or something.
For now I hope you'll join the discussion about what a new "subtitle renderer
<-> video renderer" interface should ideally look like. Let me sum up a few
things, and then you can check whether you agree or disagree:
(a) subtitles should be transported as RGBA bitmaps, no D3D involved
(b) subtitles should not be rendered/blended onto the video images by the
subtitle renderer, that's the video renderer's job
(c) the video renderer should ask the subtitle renderer for one big RGBA bitmap
(or for a series of smaller RGBA bitmaps) for every video frame; the subtitle
renderer should reply with one big (or a series of smaller) RGBA bitmap(s); the
subtitle renderer can also reply with "same as last frame" to save
resources/performance
(d) the video renderer can request the subtitles to be rendered directly to the
target resolution (after upscaling) to improve quality
If implemented this way, the xy-vsfilter DirectShow filter would have exactly
one input pin for subtitle data. No further input/output pins. And it would
automatically work with DXVA etc with all video renderers which support the new
interface.
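Points (a) to (d) could be sketched as the following Python interface. This is only a shape for discussion, not a proposal for the real COM interface: the names `SubtitleProvider`, `request_frame` and `SAME_AS_LAST_FRAME`, and the toy provider that draws a single white box, are all invented for this example.

```python
from abc import ABC, abstractmethod

SAME_AS_LAST_FRAME = object()  # sentinel reply, see point (c)


class SubtitleProvider(ABC):
    @abstractmethod
    def request_frame(self, ref_time, target_width, target_height):
        """Return SAME_AS_LAST_FRAME or a list of ((x, y, w, h), rgba_bytes)."""


class StaticLineProvider(SubtitleProvider):
    """Toy provider: one opaque white box, rendered at the target size (d)."""

    def __init__(self):
        self._last_key = None

    def request_frame(self, ref_time, target_width, target_height):
        key = (target_width, target_height)
        if key == self._last_key:
            return SAME_AS_LAST_FRAME
        self._last_key = key
        w, h = target_width // 2, target_height // 10
        rect = (target_width // 4, target_height - 2 * h, w, h)
        return [(rect, bytes([255, 255, 255, 255]) * (w * h))]


class VideoRenderer:
    """Asks for subtitles once per video frame (c) and blends them itself (b)."""

    def __init__(self, provider, width, height):
        self.provider, self.size = provider, (width, height)
        self.bitmaps = []

    def present(self, ref_time):
        reply = self.provider.request_frame(ref_time, *self.size)
        if reply is not SAME_AS_LAST_FRAME:
            self.bitmaps = reply  # a real renderer would upload to the GPU here
        return self.bitmaps
```

The provider never sees the video frames, only a timestamp and a target size, which matches the idea of a filter with a single subtitle input pin.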
Any comments?
Original comment by mad...@gmail.com
on 19 Nov 2011 at 10:19
@Jan I have little knowledge of D3D. It's a bit hard for me to fully understand
all the technical points you're talking about. But Madshi has summed things up
perfectly. We can discuss things other than the "subtitle renderer <-> video
renderer" interface via email, or open another thread/issue.
For the interface, I agree with all 4 points Madshi listed. Now seeing
Jan's comments, I have a few questions:
"(a) subtitles should be transported as RGBA bitmaps, no D3D involved"
Should the interface leave an opportunity for the subtitle renderer to involve D3D?
Or will that only bring unnecessary complexity for both sides?
"(c) the video renderer should ask the subtitle renderer for one big RGBA
bitmap (or for a series of smaller RGBA bitmaps) for every video frame; the
subtitle renderer should reply with one big (or a series of smaller) RGBA
bitmap(s); the subtitle renderer can also reply with "same as last frame" to
save resources/performance"
We can do more (?). Suppose parts of the subtitle renderer's reply are the same
as for the last frame, e.g. the subtitle renderer replied with a series of
bitmaps {A B C} for the last frame and a series of bitmaps {C B D} for the
current frame. If B/C in the first series and B/C in the second are guaranteed
to point to the same objects, or an easy and quick equality comparison between
two bitmaps is provided, it won't be hard to detect the difference between two
replies. Then the reply option "same as last frame" won't be needed in the
interface.
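A minimal sketch of that comparison, assuming each bitmap carries a numeric ID that stays the same as long as its pixels are unchanged. BitmapId, FrameDiff and diffFrames are hypothetical names, not part of any existing interface.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Assumption: an ID identifies one bitmap's pixel content across frames.
using BitmapId = std::uint64_t;

struct FrameDiff {
    std::vector<BitmapId> toUpload;  // present now, absent in the last frame
    std::vector<BitmapId> toDrop;    // present in the last frame, absent now
};

// Compare the ID lists of two consecutive replies. The order inside a reply
// does not matter, so {A B C} -> {C B D} only uploads D and drops A.
FrameDiff diffFrames(const std::vector<BitmapId>& previous,
                     const std::vector<BitmapId>& current) {
    const std::unordered_set<BitmapId> prev(previous.begin(), previous.end());
    const std::unordered_set<BitmapId> cur(current.begin(), current.end());
    // IDs are assumed to be unique within one reply.
    assert(prev.size() == previous.size() && cur.size() == current.size());

    FrameDiff diff;
    for (BitmapId id : current) {
        if (prev.count(id) == 0) diff.toUpload.push_back(id);
    }
    for (BitmapId id : previous) {
        if (cur.count(id) == 0) diff.toDrop.push_back(id);
    }
    return diff;
}
```

When both lists come back empty, the frame is identical to the previous one, which is why a separate "same as last frame" reply would become redundant.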
Original comment by YuZhuoHu...@gmail.com
on 20 Nov 2011 at 2:32
libass is actually pretty good these days, all it really needs is a dshow
filter and a win32 font picking backend to avoid fontconfig, why don't you guys
get crackin' on that instead
Original comment by kalle.bl...@gmail.com
on 20 Nov 2011 at 5:44
There is no reason why we couldn't have both. Finishing up work on xy-VSFilter
should still be completed so it can replace VSFilter 2.39/2.40 as the de facto
VSFilter build, but you do make a good point that any new
subtitle_filter->video_renderer interface we create should be flexible enough
to be adapted to libass or any other subtitle renderer which is created in the
future.
That said, developer interest in porting libass to directshow has certainly
increased over the past year. YuZhuoHuang (xy-VSFilter dev), Gommorah
(threaded-VSFilter dev), Madshi (madVR dev), Nevcairiel (LAV-Filters dev), and
Lachs0r (mplayer2 win32 builds) have all expressed interest in porting libass at
some point in time. If we get yourself, jfs, and the other Aegisub devs
on-board, the massive undertaking of creating a directshow-based libass may
actually be feasible as a joint effort. So far, all the devs mentioned have
only considered porting libass independently for their own purposes, so
*someone* would need to put forth quite a bit of effort to gather everyone
together and convince them to collaborate on a porting effort.
Original comment by cyber.sp...@gmail.com
on 20 Nov 2011 at 6:28
> Should the interface leave an opportunity for
> subtitle renderer to involve D3D?
I see no advantage in doing that. The video renderer is in the best position to
decide if, when, how, in which format and in which thread to upload the
subtitles to the GPU. E.g. imagine the video renderer uses OpenGL instead of
D3D.
> We can do more (?). When some of the subtitle renderer's
> reply are the same as last frame, e.g. the subtitle
> renderer replied a series of bitmaps {A B C} for last
> frame and a series of bitmaps {C B D} for current frame,
> B/C in the first series and B/C in the second are guaranteed
> to point to the same objects, or an easy and quick equality
> comparison method between two bitmaps is provided, so that
> it won't be hard to detect difference between two replies.
> Then the reply option "same as last frame" in the interface
> won't be needed.
Hmmmm... Does it often happen in real life that some parts of the subtitles
stay identical while others change, from one video frame to the next? If it
does, then yes, we should allow the subtitle renderer to pass on this
information somehow.
> libass is actually pretty good these days, all it really
> needs is a dshow filter and a win32 font picking backend
> to avoid fontconfig, why don't you guys get crackin' on
> that instead
AFAIK, libass supports ASS subtitles, nothing else. vsfilter supports pretty
much every subtitle format out there (both text and bitmap based). Furthermore
some ASS subtitles depend on the bugs in vsfilter to show perfectly. As a
result my opinion is that we should ideally have both, vsfilter and libass. If
YuZhuoHuang continues to improve xy-vsfilter, that should be a pretty good
thing, IMHO. Getting xy-vsfilter improved and working with the new interface
we're discussing should be much easier and quicker than developing a completely
new subtitle renderer with support for all those funny subtitle formats out
there. In the long run, maybe vsfilter will be replaced by a completely new
subtitle renderer. But that will take time, and it shouldn't stop us from
improving xy-vsfilter in the meantime. And the interface we're discussing
should help for both.
@YuZhuoHuang, I've found this page:
http://code.google.com/p/libass/wiki/IssuesAndDifferences
It says: "\blur is scaled like border width if ScaledBorderAndShadow is on.
VSFilter does not do any scaling. Likewise, the viewing distance for rotations
is scaled. The goal is to get the same rendering result independent of
rendering resolution."
So it seems having xy-vsfilter render to the upscaled target resolution may
cause problems, after all, with blurring and rotations. But maybe you can find a
way to correct that, somehow?
Original comment by mad...@gmail.com
on 20 Nov 2011 at 9:00
"It says: "\blur is scaled like border width if ScaledBorderAndShadow is on.
VSFilter does not do any scaling. Likewise, the viewing distance for rotations
is scaled. The goal is to get the same rendering result independent of
rendering resolution.""
That is something which should not be fixed globally, otherwise it would break
scripts which depend on this VSFilter behavior like
http://code.google.com/p/libass/issues/detail?id=6 with \blur.
What this likely means is part of this new interface would involve making
xy-VSFilter aware of the video resolution and aspect ratio in addition to the
target resolution. First the Script_Resolution -> Video_Resolution blur &
rotations would need to be calculated unscaled, and then from there
Video_Resolution -> Target_Resolution for everything would need to be
calculated as scaled (simulating resizing but rasterizing at higher resolution).
Does that sound feasible, YuZhuoHuang? It may almost make sense to just have
xy-VSFilter pass bitmaps with blur only and have the video renderer scale them
to target resolution with its default interpolation, then blend the scaled blur
bitmaps with the subtitle bitmaps which are already passed at target
resolution? Getting blur to look identical when rasterized at a higher
resolution is tricky, I suspect, and may force us to interpolate blurs. Thoughts?
Original comment by cyber.sp...@gmail.com
on 20 Nov 2011 at 10:21
> libass is actually pretty good these days, all it really
> needs is a dshow filter and a win32 font picking backend
> to avoid fontconfig, why don't you guys get crackin' on
> that instead
Creating a libass dshow filter that works like VSFilter requires quite a lot of
work, and it would definitely inherit the basic flaws, in both performance and
quality, of the way VSFilter works. We won't have a libass filter that works as
well as libass does in mplayer. But it would be much easier to make a
libass-based subtitle renderer supporting the interface we're now discussing.
> imagine the video renderer uses OpenGL instead of D3D.
Got it.
> Does it often happen in real life that some parts of the
> subtitles stay identical while others change, from one
> video frame to the next?
Very often. E.g. a moving text or a simple karaoke effect.
> "\blur is scaled like border width if ScaledBorderAndShadow
> is on. VSFilter does not do any scaling. Likewise, the
> viewing distance for rotations is scaled. The goal is to get
> the same rendering result independent of rendering resolution."
I think even though there are some tags whose behavior (in VSFilter) depends on
the resolution, the subtitle renderer can use extra resolution information for
interpreting such tags while outputting at another resolution. So the problem
is solvable. This extra resolution information can be
1. either the actual resolution of the video (set by the video renderer in the
initial step), in which case the subtitle renderer can act exactly the same as
VSFilter;
2. or the resolution the subtitle renderer reads from the subtitle script, in
which case the output of the subtitle renderer is totally independent of the
video.
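A small sketch of the two options, under the assumption that a resolution-dependent value such as a \blur strength is defined in pixels of the chosen reference resolution and then scaled linearly to the output resolution. Heights, ReferenceResolution and blurAtTarget are hypothetical names, and the linear scaling is only an illustration, not how VSFilter or libass actually compute blur.

```cpp
#include <cassert>

// Which resolution a resolution-dependent tag is interpreted against.
enum class ReferenceResolution {
    Video,   // option 1: actual video resolution (VSFilter-like behavior)
    Script   // option 2: resolution declared in the subtitle script
};

struct Heights {
    int script;  // height declared in the subtitle script
    int video;   // height of the decoded video
    int target;  // height the subtitles are rasterized at
};

// Scale a value defined at the reference resolution to the target resolution.
double blurAtTarget(double blur, const Heights& h, ReferenceResolution ref) {
    assert(h.script > 0 && h.video > 0 && h.target > 0);
    const int reference =
        (ref == ReferenceResolution::Video) ? h.video : h.script;
    return blur * static_cast<double>(h.target) / reference;
}
```

For a 480-line script on 720p video rendered at 1080p, a blur of 2 becomes 3 with option 1 and 4.5 with option 2, so the two options give visibly different results whenever script and video resolution differ.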
Original comment by YuZhuoHu...@gmail.com
on 20 Nov 2011 at 12:09
Here's an experimental dshow filter of libass:
https://github.com/Arnavion/libassDShow
Original comment by astrat...@gmail.com
on 20 Nov 2011 at 2:10
taro, it is nice to see attempts to make a DS filter out of libass, but I
believe it is not clear enough how it is useful for the current project. If you
think that libassDShow also needs to support the interface that is discussed
here, then you should probably make that proposition to the libassDShow author.
Original comment by yakits...@gmail.com
on 20 Nov 2011 at 3:27
> (a) subtitles should be transported as RGBA bitmaps, no D3D involved
> (b) subtitles should not be rendered/blended onto the video images by the
subtitle renderer, that's the video renderer's job
> (c) the video renderer should ask the subtitle renderer for one big RGBA
bitmap (or for a series of smaller RGBA bitmaps) for every video frame; the
subtitle renderer should reply with one big (or a series of smaller) RGBA
bitmap(s); the subtitle renderer can also reply with "same as last frame" to
save resources/performance
> (d) the video renderer can request the subtitles to be rendered directly to
the target resolution (after upscaling) to improve quality
Sounds like a plan to me (which i also would implement in LAV Video or wherever
i end up rendering subs).
The interface seems easy enough to use, producing RGBA from both text and
bitmap subs is usually an easy task. I'm not sure the "series of smaller
bitmaps" is really needed, it's not too much to ask the subtitle renderer to
merge them into one RGBA image - but if you're going for the ultimate
interface, sure why not.
Original comment by h.lepp...@gmail.com
on 23 Nov 2011 at 9:42
[deleted comment]
Nev, you see, "series of smaller bitmaps" is just how vsfilter works
internally. Merging the images is not a problem in itself, it's just a slow
approach.
Of course some other renderers may work differently or may not be so slow at
doing that, but vsfilter support can't just be dropped.
Original comment by yakits...@gmail.com
on 23 Nov 2011 at 10:00
Libass also works like that, however I don't think the performance of that
operation is that slow. Something needs to merge them to one big image and
upload it to the gpu. Where that merge happens is not important, and could
easily be done in the sub renderer itself.
Original comment by h.lepp...@gmail.com
on 23 Nov 2011 at 10:25
@astrataro
The project libassDShow is still far from usable. It hardly helps anyone.
And it will definitely be better and easier to support the interface we are
discussing.
@Nev
Fansubbers often place one subtitle at the top and one at the bottom. For such
scripts, if the interface allows a series of bitmaps, passing only the dirty
area will be possible. Some CPU->GPU communication can be saved (I guess) in
comparison to passing a big bitmap that contains both the top subtitle and the
bottom subtitle.
> (c) the video renderer should ask the subtitle renderer for one big RGBA
bitmap (or for a series of smaller RGBA bitmaps) for every video frame; the
subtitle renderer should reply with one big (or a series of smaller) RGBA
bitmap(s); the subtitle renderer can also reply with "same as last frame" to
save resources/performance
Add one rule: the video renderer can set a max limit on the number of smaller
RGBA bitmaps returned for one frame. (So the video renderer has the choice to
force the subtitle renderer to reply with only one big RGBA bitmap.)
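A sketch of how such a limit could be honored on the subtitle renderer side: if a frame has more bitmaps than allowed, merge them into one bitmap covering their bounding rectangle. RgbaBitmap, mergeBitmaps and limitBitmapCount are hypothetical names. As a simplification, overlapping bitmaps are not alpha-blended here (later ones simply overwrite earlier ones), and merging always collapses to a single bitmap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct RgbaBitmap {
    int x, y, width, height;           // placement and size in pixels
    std::vector<std::uint8_t> pixels;  // tightly packed, width * height * 4
};

// Merge all parts into one bitmap covering their bounding rectangle.
// Areas not covered by any part stay fully transparent (all zero).
RgbaBitmap mergeBitmaps(const std::vector<RgbaBitmap>& parts) {
    assert(!parts.empty());
    int left = parts[0].x, top = parts[0].y;
    int right = parts[0].x + parts[0].width;
    int bottom = parts[0].y + parts[0].height;
    for (const RgbaBitmap& p : parts) {
        left = std::min(left, p.x);
        top = std::min(top, p.y);
        right = std::max(right, p.x + p.width);
        bottom = std::max(bottom, p.y + p.height);
    }
    RgbaBitmap out{left, top, right - left, bottom - top, {}};
    out.pixels.assign(static_cast<std::size_t>(out.width) * out.height * 4, 0);
    for (const RgbaBitmap& p : parts) {
        const std::size_t rowBytes = static_cast<std::size_t>(p.width) * 4;
        for (int row = 0; row < p.height; ++row) {
            const std::size_t src = static_cast<std::size_t>(row) * rowBytes;
            const std::size_t dst =
                (static_cast<std::size_t>(p.y - top + row) * out.width +
                 static_cast<std::size_t>(p.x - left)) * 4;
            std::copy(p.pixels.data() + src, p.pixels.data() + src + rowBytes,
                      out.pixels.data() + dst);
        }
    }
    return out;
}

// Return the parts unchanged if they fit the limit, else one merged bitmap.
std::vector<RgbaBitmap> limitBitmapCount(std::vector<RgbaBitmap> parts,
                                         std::size_t maxCount) {
    assert(maxCount >= 1);
    if (parts.size() <= maxCount) return parts;
    return {mergeBitmaps(parts)};
}
```

Note the cost for the top-plus-bottom case mentioned above: the merged bitmap spans almost the whole frame even though most of it is transparent, which is the upload overhead that a series of smaller bitmaps avoids.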
Another 2 problems:
1) If I want to pre-buffer for future video frames, I'll need their
presentation times. Probably I'll use a fixed framerate and calculate the
timestamps of future video frames myself. It *works* even if the framerate
differs from the actual video framerate. When the video renderer asks for a
subpic, an algorithm similar to nearest neighbor can be employed to search the
pre-buffered sub-picture queue, and the sub-picture nearest to the one the video
renderer is asking for will be returned. But of course it'd be better to use the
same framerate as the actual video framerate. So the video renderer should pass
a framerate to the subtitle renderer.
2) If the subtitle renderer works as a directshow filter, then how can it
autoload when there are only external subtitles but no embedded subtitles?
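For problem 1), calculating the timestamps of future frames from a fixed framerate could look roughly like this. It assumes DirectShow-style timestamps in 100 ns units and a framerate given as a fraction; FrameTime and predictFrames are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// DirectShow REFERENCE_TIME counts 100 ns units: 10,000,000 per second.
constexpr std::int64_t kUnitsPerSecond = 10000000;

struct FrameTime {
    std::int64_t start;
    std::int64_t stop;
};

// Predict start/stop times of the next `count` frames for a constant
// framerate of fpsNum / fpsDen (e.g. 24000 / 1001).
std::vector<FrameTime> predictFrames(std::int64_t firstStart,
                                     int fpsNum, int fpsDen, int count) {
    assert(fpsNum > 0 && fpsDen > 0 && count >= 0);
    std::vector<FrameTime> frames;
    for (int i = 0; i < count; ++i) {
        // Derive each time from the frame index, so rounding errors do not
        // accumulate from frame to frame.
        const std::int64_t start =
            firstStart + kUnitsPerSecond * fpsDen * i / fpsNum;
        const std::int64_t stop =
            firstStart + kUnitsPerSecond * fpsDen * (i + 1) / fpsNum;
        frames.push_back({start, stop});
    }
    return frames;
}
```

This only predicts correctly for constant framerate content; with variable framerate the predicted times will drift away from what the video renderer actually asks for.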
Original comment by YuZhuoHu...@gmail.com
on 24 Nov 2011 at 12:50
Can you describe what purpose pre-buffer has exactly? Is it meant to make sure
that the CPU is never idle? FWIW, madVR already pre-buffers many frames itself.
Wouldn't that be good enough already? Or does it still make sense in your
opinion to pre-buffer inside of xy-vsfilter, too?
No idea about 2). I don't know how autoloading works.
Original comment by mad...@gmail.com
on 24 Nov 2011 at 1:44
Ah, i didn't think of things like having two separate dirty regions.
I was more thinking about how libass works. It gives you a long list of 8-bit
alpha maps, each with a 32-bit color, and it's your job to combine these into a
finished RGBA image anyway. I'm not against the whole "multiple small RGBA
images" thing, i just couldn't think of a real use case.
At 1), that occurred to me as well.
Pre-buffering can be useful, especially if your PC is not super fast. There are
always times without speech, when you could already render subtitles which are
then all used in the next dialog.
2).. That's an old problem. IMHO, it should be the player's responsibility to
detect external subs and load a subtitle renderer for them.
Original comment by h.lepp...@gmail.com
on 24 Nov 2011 at 2:09
The purpose would be to increase performance efficiency within VSFilter beyond
just increasing CPU utilization. Remember that VSFilter is single-threaded,
which means slowdowns bring everything to a halt, and when that happens the
madVR queues have limited usefulness. Even stock VSFilter gets something like a
10x-20x speed-up with pre-buffering enabled on my Core i5 compared to no
pre-buffering. If xy-VSFilter could get even a fraction of that speed-up just
by splitting some pre-buffering code and related operations into another
thread(s), it would be rather amazing.
"Quote madshi:
IMHO you should change the VSFilter design so that you have one secondary
thread (or multiple secondary threads) to do all the rendering work. These
secondary render threads would store their rendering results in an internal
rendering queue. Your main thread which is calling "CBaseOutputPin::Deliver()"
would then do nothing but fetch already rendered frames from the internal queue
and deliver them to the video renderer. The secondary thread(s) would then
render as fast as they can, until the internal buffer queue is full. When the
queue is full, your secondary render thread(s) should go to sleep. When
"CBaseOutputPin::Deliver()" returns, your main thread can delete the delivered
frame from the queue, and then wakeup the secondary thread to fill up the empty
spot in the internal queue by rendering the next frame."
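The design in the quote above is a bounded producer/consumer queue. A minimal sketch, assuming a single render thread (with several render threads, frames could arrive out of order and would need sorting); RenderedFrame, RenderQueue and renderLoop are hypothetical names, and the actual subtitle rendering is left out.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>

struct RenderedFrame {
    std::int64_t start;  // frame timestamp; the rendered pixels are omitted
};

class RenderQueue {
public:
    explicit RenderQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Called by the render thread; sleeps while the queue is full.
    void push(RenderedFrame frame) {
        std::unique_lock<std::mutex> lock(mutex_);
        notFull_.wait(lock, [this] { return frames_.size() < capacity_; });
        frames_.push_back(frame);
        notEmpty_.notify_one();
    }

    // Called by the delivering thread; sleeps while the queue is empty.
    RenderedFrame pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        notEmpty_.wait(lock, [this] { return !frames_.empty(); });
        const RenderedFrame frame = frames_.front();
        frames_.pop_front();
        notFull_.notify_one();  // wake the render thread to refill the slot
        return frame;
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lock(mutex_);
        return frames_.size();
    }

private:
    const std::size_t capacity_;
    std::deque<RenderedFrame> frames_;
    std::mutex mutex_;
    std::condition_variable notFull_;
    std::condition_variable notEmpty_;
};

// Body of the secondary render thread: render `count` frames ahead,
// blocking in push() whenever the queue is full.
void renderLoop(RenderQueue& queue, std::int64_t firstStart,
                std::int64_t frameDuration, int count) {
    for (int i = 0; i < count; ++i) {
        queue.push(RenderedFrame{firstStart + i * frameDuration});
    }
}
```

In a real filter, renderLoop would run on its own std::thread, while the thread calling Deliver() only calls pop() and hands the result on.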
I don't believe YuZhuoHuang has yet decided what form a pre-buffer in
xy-VSFilter will take, as it will likely be completely different than stock
VSFilter. madshi, has your opinion (quoted above) about the optimal way to
create an internal VSFilter pre-buffer, in order to remove bottlenecks when
interacting with madVR's decoder buffer, changed in any way since then?
Original comment by cyber.sp...@gmail.com
on 24 Nov 2011 at 2:30
The main problem with pre-buffering remains - it can only function if you know
what frames will be coming next.
With CFR material, that's trivial - you just need to be told the frame rate of
the movie, and you can compute the frames ahead. VFR content is a whole
different problem. Sadly those anime folks are also those that sometimes use
VFR material.
The good old worker thread design that madshi outlines above is still good,
cannot go wrong with it, really. But you do need to know what to actually
pre-render.
Original comment by h.lepp...@gmail.com
on 24 Nov 2011 at 2:35
> Can you describe what purpose pre-buffer has exactly?
> Is it meant to make sure that the CPU is never idle?
> FWIW, madVR already pre-buffers many frames itself.
> Wouldn't that be good enough already? Or does it still
> make sense in your opinion to pre-buffer inside of
> xy-vsfilter, too?
I know madVR has a buffer queue, and a subtitle renderer for madVR would indeed
not need to pre-buffer. But a common video renderer may not have such a
mechanism. The main purpose of pre-buffering (if I am to do that) is to
prevent a sudden heavy script from causing *lag*, not to make full use of the
CPU. I'd prefer to reduce CPU usage rather than increase it to make the subtitle
renderer faster. After thinking about your question, I got a feeling that my
purpose can be fulfilled in another way, instead of pre-buffering the subpic.
Anyway, maybe someone else interested in implementing a subtitle renderer that
supports this interface would like to pre-buffer?
> No idea about 2). I don't know how autoloading works.
VSFilter has a video input pin and an output pin, so it can always
connect to the filter graph and check whether there is any subtitle
(internal/external) to decide whether it should autoload. Now with the video
input pin and output pin removed, when there are no internal subtitles, the
splitter would not have any subtitle output pin, and the subtitle renderer
cannot connect to the graph.
Original comment by YuZhuoHu...@gmail.com
on 24 Nov 2011 at 2:40
Why not just leave an Input & Output Pin to pass-through decoded video to the
Video Renderer untouched (while subtitles would still use this new interface
via callback)? Wouldn't that resolve the pre-buffering, auto-loading, VFR, and
various other problems?
Original comment by cyber.sp...@gmail.com
on 24 Nov 2011 at 2:54
> Why not just leave an Input & Output Pin to pass-through decoded video
Will it break DXVA?
Original comment by YuZhuoHu...@gmail.com
on 24 Nov 2011 at 3:29
Nothing can sit between the decoder and the renderer in DXVA.
It's sadly also not that trivial to just have video pass-through, because the
renderer dictates how the video frame should be set up (stride), so either you
need to support adjusting the image stride, or somehow forward such
requirements to the decoder itself. Both options are sadly not trivial.
Original comment by h.lepp...@gmail.com
on 24 Nov 2011 at 3:39
Then how about combining a portion of the dummy pin method with the callback
method?
Add a subtitle input pin to the Video Renderer and have it handle autoloading?
+
Add a dummy video output pin from the Video Renderer to a subtitle renderer, to
inform it of timestamp information making pre-buffering and VFR possible?
+
This may also be a good time to make use of the extendable Open Media Format
http://sourceforge.net/projects/openmediaformat/ to pass static information to
the subtitle filter via the dummy output pin?
Though, this method would only make sense if the frame-rate/timestamp and
auto-loading issues couldn't be more easily solved by other means.
Original comment by cyber.sp...@gmail.com
on 24 Nov 2011 at 10:47
> Nothing can sit between the decoder and the renderer in DXVA.
>
> Its sadly also not that trivial to just have video pass-through,
> because the renderer dictates how the video frame should be setup
> (stride), so either you need to support adjusting the image stride,
> or somehow forward such requirements to the decoder itself. Both
> options are sadly not trivial.
Got it.
> 2).. Thats an old problem. IMHO, it should be the players
> responsibility to detect external subs and load a subtitle renderer
> for them.
Hmmmm... I agree that someone else, I don't mind if it is a player or MadVR
though, should load the subtitle renderer first. But better leave the detection
work to the subtitle renderer since it knows what it can deal with?
Original comment by YuZhuoHu...@gmail.com
on 25 Nov 2011 at 12:08
For compatibility's sake, I think it would be best to avoid offloading any
responsibility for supporting the new interface to the player, if at all
possible.
If madVR adds support for this new interface, it should just work in any player
which supports madVR without any additional coding hurdles.
Original comment by cyber.sp...@gmail.com
on 25 Nov 2011 at 2:30
(1) Pre-buffering:
As Hendrik said, the main problem is that if you want to pre-buffer, you need
to know which start/stop times future video frames will have. What happens if
xy-vsfilter prebuffers for a specific future frame start/stop time and then the
video renderer unexpectedly asks for subtitles for a video frame that is right
between pre-buffered start/stop times? I'm not sure how this could be handled.
Anyway, is there any need (or any use) for adding explicit pre-buffering
support to the interface? As far as I can see, if xy-vsfilter prebuffers
internally, the video renderer doesn't even have to know that prebuffering is
used, or does it? Of course we could add some prebuffer control to the
interface, but I'm not sure what purpose that would have exactly? I mean what
could the video renderer do? Maybe it could turn prebuffering on/off, but
that's all that comes to my mind.
Any more thoughts on prebuffering, anyone?
(2) auto-loading
Maybe this could be a task performed by LAV Splitter? I mean, if LAV Splitter
detects external subtitles, it could just load them and behave as if they were
part of the video file, too. Same with external audio tracks, btw. I've been
wishing for a splitter which can auto load external audio and subtitle tracks
for a long time. Ideally I'd like to store all my audio/video tracks demuxed
and let the splitter pick them up automatically. But well, that's a different
topic and Hendrik and I had a short discussion about this some months ago, IIRC.
Please don't require madVR to have a dummy output pin. Instead madVR could just
load and add xy-vsfilter to the graph manually. That would be *MUCH* easier and
should have the same effect. But still, I'd find it nicer to have the splitter
do the work of making external audio/subtitle tracks available.
BTW, is there any way to get email notifications if someone adds a comment
here? I've searched but didn't find anything.
Original comment by mad...@gmail.com
on 26 Nov 2011 at 11:17
> BTW, is there any way to get email notifications if someone adds a comment
here? I've searched but didn't find anything.
"Star" the issue (click on the Star next to its name on the top)
Re: Auto-loading
It is a possibility that the splitter tries to detect external subs, however
it's still a functional difference to how it works now, which you seemed to
want to avoid in general? (i.e. it wouldn't work with any other splitter)
Original comment by h.lepp...@gmail.com
on 26 Nov 2011 at 11:22
Auto-loading: I see 2 viable approaches:
(1) The new interface we're discussing makes sense only if the video renderer
supports it. So it would be no problem to require every video renderer which
supports the new interface to manually load xy-vsfilter. Ok, where to find the
xy-vsfilter dll file? There are various ways how we could solve that. E.g. we
could define a registry key which lists the user's choice for the auto-loaded
subtitle renderer. Or alternatively the video renderer could just remember the
subtitle renderer which was used "last time". And if no subtitle renderer is
found in a graph, the video renderer could then manually load the "last time"
subtitle renderer.
(2) LAV Splitter supporting external subtitle tracks. Yeah, you're right, this
would obviously not work when using other splitters. It would be good enough
for my needs, though, so I'd be fine with it.
Maybe (1) is better? I still generally do like (2), though.
---------
I've created a first detailed interface suggestion and uploaded it here:
http://madshi.net/SubRenderIntf.h
Comments very welcome. I'm not sure about a couple of things. Here are some
comments:
(a) I wasn't sure who should initiate the connection (sub renderer or video
renderer). I decided on the video renderer because the video renderer is
usually the last filter in the chain which gets a pin connection. Only at that
point in time the video renderer has enough information to establish the
connection.
(b) I've used strings for the "option" parameter in all the ISubRenderOptions
methods. I know it's a matter of taste. If you guys prefer a DWORD enum or a
GUID instead that'd be fine with me, too.
(c) I'm not sure about interfaces and reference counts. Please double check the
comments I've added to "ISubRenderProvider.Connect" and
"ISubRenderServices.RenderFrame". Does it make sense to you this way? Or should
we do the reference counting differently? I'm not really sure...
(d) I've used an extra interface for every rendered subtitle frame. Not sure if
that makes sense. Maybe it makes things just more complicated than necessary.
We could also return the bitmaps as a simple array via
"ISubRenderServices.RenderFrame", if you prefer it that way. Thoughts?
Just a first suggestion. Any comments / change suggestions welcome!
Original comment by mad...@gmail.com
on 26 Nov 2011 at 1:16
> What happens if xy-vsfilter prebuffers for a specific future
> frame start/stop time and then the video renderer unexpectedly
> asks for subtitles for a video frame that is right between
> pre-buffered start/stop times?
If subpics for start/stop times [t0,t2) and [t2,t4) have been prebuffered, but
subpic [t1,t3) is asked for, the subpic whose start/stop period includes
(t1+t3)/2 will be returned, e.g. if t2 <= (t1+t3)/2 < t4, the subpic for [t2,t4)
will be returned. So if the subtitle renderer works at a framerate (very)
different from actual playback, animated effects, e.g. moving/rotation/fading
in/fading out, won't be smooth.
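The lookup rule just described, as a minimal sketch. SubPic and findSubPic are hypothetical names, and a linear search is used for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct SubPic {
    std::int64_t start;  // inclusive
    std::int64_t stop;   // exclusive
};

// Return the index of the pre-buffered subpic whose [start, stop) period
// contains the midpoint of the requested period, or -1 if there is none.
int findSubPic(const std::vector<SubPic>& queue,
               std::int64_t reqStart, std::int64_t reqStop) {
    assert(reqStart <= reqStop);
    const std::int64_t mid = reqStart + (reqStop - reqStart) / 2;
    for (std::size_t i = 0; i < queue.size(); ++i) {
        if (queue[i].start <= mid && mid < queue[i].stop) {
            return static_cast<int>(i);
        }
    }
    return -1;  // nothing pre-buffered for this time: render on demand
}
```

With subpics [0,20) and [20,40) in the queue, a request for [10,30) has its midpoint at 20 and therefore gets the second subpic.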
> the video renderer doesn't even have to know that prebuffering
> is used, or does it?
No, it doesn't. I don't think the video renderer needs to worry about it.
Auto-loading:
> (1) The new interface we're discussing makes sense only if the video
> renderer supports it. So it would be no problem to require every
> video renderer which supports the new interface to manually load
> xy-vsfilter.
> (2) LAV Splitter supporting external subtitle tracks.
I think (1) is better, for compatibility.
Detailed interface:
----------
> who should initiate the connection
I vote the video renderer.
----------
In interface ISubRenderOptions:
I want video file name to search for corresponding external subtitles. Should I
get it from ISubRenderOptions?
----------
In ISubRenderFrame:
> // The ID can stay identical if only the position (x, y) changes.
> ...
> STDMETHOD(GetBitmap)(int index, ULONGLONG *id, RECT *placement, LPVOID
*pixels, int *pitch);
1) I'm not sure if random access is necessary, but I can go with it.
2) The ID can stay identical if only the *placement* changes?
3) And the callee should guarantee not to modify pixels, if using
LPCVOID *pixels
instead of
LPVOID *pixels
makes sense?
----------
In ISubRenderFrame:
> STDMETHOD(GetCombinedBitmap)(RECT *placement, LPVOID* pixels, int *pitch);
I'd prefer a
STDMETHOD(SetMaxBitmapCountPerFrame)(DWORD count) = 0;
in ISubRenderServices. Consumers which don't want a series (or a long series)
of smaller bitmaps can set the upper limit to what they want. And this setting
can be expected to remain unchanged for a long time once it is set. Knowing
the setting before RenderFrame calls may help me decide what to cache or
prebuffer.
----------
Original comment by YuZhuoHu...@gmail.com
on 27 Nov 2011 at 7:16
> who should initiate the connection
I vote the subtitle renderer. It's easier for the sub renderer to support
multiple interfaces (i.e. madVR's new interface, EVR's old interface, or falling
back to drawing onto the plain image) if it doesn't have to "wait" to see
whether a renderer offers an interface. Instead i can be sure whether there is
an interface, or not.
-------------------
What really bugs me about the interface is the way memory management is done
for the options.
Handing out memory pointers with a rule that they should stay valid is rather
obscure, imho. Instead, i would follow the MS interfaces, and just let the
callee allocate the data using a pre-determined function (e.g. CoTaskMemAlloc),
and make the caller responsible for freeing it with the matching function (e.g.
CoTaskMemFree). At least for the options i would prefer it this way.
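A sketch of that ownership rule. To keep it buildable outside Windows, std::malloc/std::free stand in for CoTaskMemAlloc/CoTaskMemFree; optionAlloc, optionFree and getStringOption are hypothetical names, and the single "name" option is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Stand-ins for CoTaskMemAlloc / CoTaskMemFree (assumption for portability).
void* optionAlloc(std::size_t bytes) { return std::malloc(bytes); }
void optionFree(void* memory) { std::free(memory); }

// Callee side: hand out a freshly allocated copy of the option value.
// On success the caller owns *value and must release it with optionFree().
bool getStringOption(const char* name, char** value) {
    assert(name != nullptr && value != nullptr);
    *value = nullptr;
    if (std::strcmp(name, "name") != 0) {
        return false;  // unknown option
    }
    static const char kValue[] = "example renderer";
    char* copy = static_cast<char*>(optionAlloc(sizeof(kValue)));
    if (copy == nullptr) {
        return false;  // out of memory
    }
    std::memcpy(copy, kValue, sizeof(kValue));  // includes the terminating NUL
    *value = copy;
    return true;
}
```

The caller never has to reason about how long the callee keeps its internal buffers alive; it simply frees what it was given once it is done.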
For the ISubRenderFrame, i guess it's ok'ish to hand out fixed pointers, because
it's an object which actually holds the subtitle data, and those functions are
just "getters" to expose the internal data - you would have freshly allocated
them already anyway. I would however adjust the comment, and instead say
something like this:
// The memory pointed to by the "pixels" variable is only valid
// until the next call of GetBitmap, GetCombinedBitmap, or Release
Otherwise, i guess its ok
Original comment by h.lepp...@gmail.com
on 27 Nov 2011 at 7:35
Fix typo:
> 3) And the callee should guarantee not to modify pixels, if using
Should be
"3) And the caller should guarantee not to modify pixels, if using"
Original comment by YuZhuoHu...@gmail.com
on 27 Nov 2011 at 8:14
Original issue reported on code.google.com by
yakits...@gmail.com
on 30 Oct 2011 at 6:08