Custom interface for rendering subtitles on an RGBA texture(s) | (XySubFilter for madVR) [Part 1]

GoogleCodeExporter commented 8 years ago

This is a kind of feature request. For the details, please see the link below, 
since I don't feel like pasting the whole conversation here.
http://forum.doom9.org/showthread.php?p=1535165#post1535165

Original issue reported on code.google.com by yakits...@gmail.com on 30 Oct 2011 at 6:08

Merged into: #91

GoogleCodeExporter commented 8 years ago

Furthermore, I found something like this:

DXPMSAMPLE — A structure with red, green, blue, and alpha components, each 
having 8 bits. The color channels are premultiplied by the alpha component for 
this type.

http://msdn.microsoft.com/en-us/library/aa753543%28v=vs.85%29.aspx

Would it help with using pre-multiplied alpha on the GPU?

Original comment by YuZhuoHu...@gmail.com on 29 Nov 2011 at 1:27

GoogleCodeExporter commented 8 years ago


I fixed my code, it's only one division (per pixel), and not one per iteration. 
For precision, however, it should be calculated in floating point or at least 
in 16-bit integer, and not in 8-bit integer, of course.

 RGBbig = Abig = 0
 foreach(subpic)
   RGBbig = (RGBbig * (1-Asub) + RGBsub * Asub)
   Abig = 1 - ((1-Abig) * (1-Asub))
 end
 RGBbig = RGBbig / Abig
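
As a rough illustration of the loop above, here is a floating-point Python sketch with alpha in 0..1. The function names and the (r, g, b, a) tuple layout are made up for this example and are not part of any interface discussed here.

```python
def combine(subpics):
    """Merge straight-alpha (r, g, b, a) layers, bottom first, into one straight-alpha pixel."""
    rgb = [0.0, 0.0, 0.0]   # accumulated colour, implicitly premultiplied
    alpha = 0.0             # accumulated coverage
    for r, g, b, a in subpics:
        rgb = [c * (1.0 - a) + s * a for c, s in zip(rgb, (r, g, b))]
        alpha = 1.0 - (1.0 - alpha) * (1.0 - a)
    if alpha > 0.0:
        rgb = [c / alpha for c in rgb]   # the single division per pixel
    return (*rgb, alpha)


def blend(dst, src):
    """Straight-alpha blend of one (r, g, b, a) pixel onto an opaque (r, g, b) pixel."""
    a = src[3]
    return tuple(d * (1.0 - a) + s * a for d, s in zip(dst, src[:3]))
```

Blending the combined pixel onto the video once should give the same result as blending each layer in turn, up to floating-point rounding.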

> I do no serious test, but I am quite positive that do A1*B1+A2 using MMX/SSE2 
would be faster than A1*B1+A2*B2.

A1*B1+A2*B2 is one SSE2 instruction, and processes up to 4 16-bit pixels at the 
same time. Implementing A1*B1+A2 would probably end up being A1*B1+A2*1, just 
so that the same instruction could be used.

I'm not sure why you insist on pre-multiplied alpha, i'm still not convinced of 
any advantages, if anything i see a complexity there which could just be 
avoided. Handing around normal ARGB32 bitmaps is easier to understand and 
handle. I still have my use-case of blending it onto a YUV video, where i would 
have to un-multiply the alpha before conversion, or it might end up being 
rather weird.

> So the reason why using a maxNumBitmaps option not combineBitmaps, is we 
won't lose anything by using maxNumBitmaps.

We also don't gain anything. Why would a renderer support rendering 100 small 
bitmaps, but not 101? I just don't see the advantage. Either you can process 
multiple small bitmaps, or you cannot. If you want to make the Alpha topic 
easier, just remove the ability for the sub-renderer to merge the images 
completely, and always let the consumer do that - that way there are no 
misunderstandings.

Original comment by h.lepp...@gmail.com on 29 Nov 2011 at 1:39

GoogleCodeExporter commented 8 years ago

> If you want to make the Alpha topic easier, just
> remove the ability for the sub-renderer to merge
> the images completely, and always let the consumer
> do that - that way there are no misunderstandings.

That would be fine with me, too.

Original comment by mad...@gmail.com on 29 Nov 2011 at 1:49

GoogleCodeExporter commented 8 years ago

Re: pre-multiplying alpha

Alpha in the pixels is 0-255, not 0-1, so when you multiply an 8-bit value with 
an 8-bit alpha, you will also need to add a division (or a right shift) to go 
back to 8 bits while pre-multiplying.
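
A minimal sketch of that step for a single 8-bit channel, showing both the division and the right-shift variant. The function names are hypothetical, and the rounding choice in the first one is just one reasonable option.

```python
def premultiply_exact(c, a):
    """Premultiply an 8-bit channel by an 8-bit alpha, rounding to nearest (divides by 255)."""
    return (c * a + 127) // 255


def premultiply_shift(c, a):
    """Cheaper approximation: scale alpha to 1..256 and shift instead of dividing."""
    return (c * (a + 1)) >> 8
```

Both variants keep fully opaque pixels unchanged and map alpha 0 to 0. They can differ by one for values in between.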

Original comment by h.lepp...@gmail.com on 29 Nov 2011 at 1:50

GoogleCodeExporter commented 8 years ago

More on the maxNumBitmaps option:
I fear that sometimes the series of bitmaps may become extremely long, so that 
the alpha blending should be done on the CPU. So even if the consumer allows an 
unlimited series of bitmaps, I would like to put an upper limit on it 
internally in the subtitle renderer.
The attached subtitle is an example. The particle effect is created using an 
ASS script, not a hardsub. Every small dot would be a bitmap in the series. I 
believe libass also outputs one bitmap per small dot, if libass handles it 
right. A video renderer won't like so many small items (I guess?). In such a 
situation, having the subtitle renderer combine the small dots first wouldn't 
be a bad idea, would it?

Original comment by YuZhuoHu...@gmail.com on 29 Nov 2011 at 1:57

Attachments:

particle_effects.png

GoogleCodeExporter commented 8 years ago

> Alpha in the pixels is 0-255, not 0-1, 
> so when you multiply a 8-bit value with 
> a 8-bit alpha, you will also need to 
> add a division (or a right shift) to go 
> back to 8 bits while pre-multiplying.

In practice, 
   (RGB1*(256-A)+RGB2*(A+1))>>8
is used to avoid the division.
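
For reference, the same expression as a small Python function for one 8-bit channel. The name blend_shift is made up for this sketch.

```python
def blend_shift(dst, src, a):
    """Blend 8-bit src over 8-bit dst with 8-bit straight alpha a, using a shift instead of /255."""
    return (dst * (256 - a) + src * (a + 1)) >> 8
```

With a = 0 the result is exactly dst, with a = 255 it is exactly src, and the result never leaves the 0..255 range.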

Original comment by YuZhuoHu...@gmail.com on 29 Nov 2011 at 2:01

GoogleCodeExporter commented 8 years ago

> that way there are no misunderstandings.

The misunderstanding here is really due to the nature of alpha blending with 
different alpha formats, not the fault of the option. Of course, if you want to 
do alpha blending right, you need to learn how to do it right.

Original comment by YuZhuoHu...@gmail.com on 29 Nov 2011 at 2:12

GoogleCodeExporter commented 8 years ago

That particle effect is nasty! You're right, having hundreds of 1 pixel sized 
RGBA bitmaps would not perform well with a GPU.

Hmmmmm... Maybe we shouldn't put a limit on the number of bitmaps, but rather 
on the size? So the subtitle renderer would combine all bitmaps that are 
smaller than e.g. 2x4 pixels and nearer in position than 100 pixels? Not sure.

Original comment by mad...@gmail.com on 29 Nov 2011 at 2:13

GoogleCodeExporter commented 8 years ago

> Implementing A1*B1+A2 would probably end up being A1*B1+A2*1, 
> just so that the same instruction could be used.

For A1*B1+A2, first use _mm_mullo_epi16 to do the multiplication, then use 
_mm_adds_epu16 to do the addition. With two intrinsics you mix 8 pixels, while 
the corresponding _mm_madd_epi16 only lets you mix 4 pixels. And in order to 
get 255-alpha using SSE, you'll need more instructions.

The right way to do it is:
 RGBbig = 0;
 Abig = 1;
 foreach(subpic)
   ARGBbig = (ARGBbig * (1-Asub) + RGBsub * Asub)
 end
 Abig = 1-Abig
 RGBbig = RGBbig / Abig

OK. You can say the only benefit we get from pre-multiplied alpha here is that 
these two steps
 Abig = 1-Abig
 RGBbig = RGBbig / Abig
are saved. But why would we design some complex AddRef/Release scheme just to 
save one or two copy operations?

Original comment by YuZhuoHu...@gmail.com on 29 Nov 2011 at 2:35

GoogleCodeExporter commented 8 years ago

> But why we design some complex addref release just to save one or two copy 
operations?

The AddRef/Release is just the right thing to do when you exchange complex 
objects between filters, i don't expect to save anything with it. The Windows 
SDK has quite a lot of helper classes to manage IUnknown objects, so it's really 
not "complex" at all.

I still stand by my opinion that pre-multiplied is not really offering any great 
advantages, and in the end would only require the subtitle renderer to support 
both formats, because there are operations which will require a pure RGBA image.
If you think it's a good idea, i won't mind if it's there, as long as the 
consumer has the option to specify which format it wants.

Original comment by h.lepp...@gmail.com on 29 Nov 2011 at 2:46

GoogleCodeExporter commented 8 years ago

My personal opinion is:

It would be ok with me if we offer an option for the consumer to specify that 
he wants pre-multiplied alpha/RGB. I would make it optional, though, not 
mandatory, for the subtitle renderer. I would also really like standard RGBA to 
be the default format because these days everybody expects RGBA to *not* be 
pre-multiplied. I don't know any Direct3D9 texture/surface format which 
supports pre-multiplied alpha. DXPMSAMPLE is stone age (DirectX 6.1).

Original comment by mad...@gmail.com on 29 Nov 2011 at 2:49

GoogleCodeExporter commented 8 years ago

> Maybe we shouldn't put a limit on 
> the number of bitmaps, but rather 
> on the size? So combining all 
> bitmaps that are smaller than e.g.
> 2x4 pixels and which are nearer 
> in position than 100 pixels would 
> be combined by the subtitle renderer?

I'm OK with this.

> If you think its a good idea, i 
> won't mind if its there, as long as 
> the consumer has the option to specify 
> which format it wants. 

Of course.

Original comment by YuZhuoHu...@gmail.com on 29 Nov 2011 at 3:07

GoogleCodeExporter commented 8 years ago

Here's the latest interface with all the changes that were suggested:

http://madshi.net/SubRenderIntf.h

Please check if you find anything you don't like. A few hints:

(1) The subtitle renderer is now responsible for making the connection. The 
header explains in detail how the subtitle renderer should do that. 
@YuZhuoHuang, can you live with it this way? I've changed it because the merit 
system would have been difficult to implement the other way round.

(2) I've renamed the "option"s to "field"s. I'm not sure about the best name 
for this. Some are really options, some are information fields. Does anybody 
have a better name for it?

(3) Please check the mandatory and optional "fields". Are you ok with the 
decisions I made there?

(4) Due to having the subtitle renderer initiate the connection now, I've been 
able to get along with one less interface. Please check if it makes sense to 
you the way it is now.

--------

Auto-loading: We need to find a definite solution for this. Which of the 
suggested approaches do you like most? Or maybe: Which do you dislike the least?

-------

Combining small RGBA bitmaps: Is it ok if we leave the decision on which ultra 
small RGBA bitmaps to combine into bigger bitmaps to the subtitle renderer? Or 
should we define specific rules for that? Or do we need to allow control of 
this through the interface? FWIW, if the consumer is CPU based, too, it might 
not want/need the subtitle renderer to do any combination on its own. But for 
madVR, I wouldn't really want to implement SSE blending routines myself. I 
would like to rely on the subtitle renderer to provide me with subtitle 
bitmaps which are not so small that they will drag down GPU performance.

Thoughts?

Original comment by mad...@gmail.com on 29 Nov 2011 at 4:10

GoogleCodeExporter commented 8 years ago

[deleted comment]

GoogleCodeExporter commented 8 years ago

> Combining small RGBA bitmaps: Is it ok, ...

I think we must continue the premultiplied alpha discussion first. Admittedly, 
I didn't take the alpha format seriously at the beginning. I just instinctively 
disliked this conversion 
   RGBbig = RGBbig / Abig
(introduced when using source alpha). But after more thought, I dislike it even 
more. Not only for performance (it is really slow, especially because it must 
be done on every RGB channel and there is no corresponding SSE intrinsic for 
it), but also for quality: the error introduced by it can be really large.

1. The right way to do alphablending:
> RGBbig = 0;
> Abig = 1;
> foreach(subpic)
>    ARGBbig = (ARGBbig * (1-Asub) + RGBsub * Asub)
> end
At this point, you already have a pre-multiplied alpha ARGBbig. Even if you 
don't like it, that is just the way it works.
Then it is converted to source alpha:
> Abig = 1-Abig
> RGBbig = RGBbig / Abig

An integer translation:
> RGBbig = 0;
> Abig = 255;
> foreach(subpic)
>    ARGBbig = (ARGBbig * (256-Asub) + RGBsub * (Asub+1))>>8
> end
> Abig = 255-Abig
> RGBbig = RGBbig / (256-Abig)

2. Assumed:
(video) pix1 = R:0 G:0 B:0 (A:0)
(sub1 ) pix2 = R:0 G:0 B:0 A:0
(sub2 ) pix3 = R:250 G:250 B:250 A:254
(using source alpha)

A normal alphablending 
    pix1 = (pix1 * (256-Apix2) + RGBpix2 * (Apix2+1))>>8    
    pix1 = (pix1 * (256-Apix3) + RGBpix3 * (Apix3+1))>>8
gives
    pix1 = R:249 G:249 B:249,
almost pure white.

But if sub1 and sub2 need to be combined first, following the above procedure:
> RGBbig = 0;
> Abig = 255;
> foreach(subpic)
>    ARGBbig = (ARGBbig * (256-Asub) + RGBsub * (Asub+1))>>8
> end
we get ARGBbig = R:249 G:249 B:249 A:1. If using pre-multiplied alpha, we'll 
stop here and return ARGBbig to consumer. By doing
    pix1 = (pix1*Abig)>>8 + RGBbig (if Abig!=255)
or
    pix1 = (pix1*(Abig+1))>>8 + RGBbig
consumer gets 
    pix1 = R:249 G:249 B:249,
exactly what we want.
If using source alpha, here follows the evil conversion:
> Abig = 255-Abig
> RGBbig = RGBbig / (256-Abig)
After that,
    ARGBbig = R:0 G:0 B:0 A:254 
By doing
    pix1 = (pix1 * (256-Abig) + ARGBbig * (Abig +1))>>8 
consumer gets
    pix1 = R:0 G:0 B:0,
pure black. The key point is that with integer precision (X/Y)*Y may not equal 
X; it may introduce a maximum error of Y-1.
That is, in a particle effect, using source alpha may erase all the flying dots!
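
The numbers in this example can be reproduced with a short single-channel Python sketch (R, G and B are equal here, so one channel is enough). All function names are made up. The division is taken as written in this comment, reading the divisor as 256 minus the accumulated transparency before it is inverted, which is the reading that yields the values quoted above.

```python
def blend(dst, src, a):
    """Straight-alpha blend with the (256-a)/(a+1) shift trick."""
    return (dst * (256 - a) + src * (a + 1)) >> 8


def combine(subs):
    """Accumulate (value, alpha) layers into premultiplied colour plus remaining transparency."""
    colour, transparency = 0, 255
    for value, alpha in subs:
        colour = (colour * (256 - alpha) + value * (alpha + 1)) >> 8
        transparency = (transparency * (256 - alpha)) >> 8
    return colour, transparency


def blend_premultiplied(dst, colour, transparency):
    """Consumer side for premultiplied input: scale the video, then add the colour."""
    return ((dst * (transparency + 1)) >> 8) + colour


def to_straight_truncating(colour, transparency):
    """The conversion as written in this comment: a truncating 8-bit division."""
    return colour // (256 - transparency), 255 - transparency


subs = [(0, 0), (250, 254)]                           # sub1, sub2 as (value, alpha)
direct = blend(blend(0, *subs[0]), *subs[1])          # layer by layer onto black video
colour, transparency = combine(subs)                  # combined, premultiplied
premultiplied = blend_premultiplied(0, colour, transparency)
straight, opacity = to_straight_truncating(colour, transparency)
via_straight = blend(0, straight, opacity)            # combined, converted, then blended
```

Here direct and premultiplied both come out as 249, while via_straight collapses to 0 because the truncating division throws the colour away.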
----------------------------------------
More on Premultiplied alpha and alphablending (I didn't read it carefully, just 
hope it may help)
http://home.comcast.net/~tom_forsyth/blog.wiki.html#[[Premultiplied%20alpha]]

Anyway, madshi, I really hope you can dig deeper and find out whether there is 
any way to support premultiplied alpha. At least give me an option to let madVR 
use pre-multiplied alpha, or else it'll become really hard (if not impossible) 
for me to handle the above error correctly. And if we could use premultiplied 
alpha only and forbid the source alpha format, that would be best, since it is 
the source alpha format that introduces all this trouble.
----------------------------------------

Original comment by YuZhuoHu...@gmail.com on 30 Nov 2011 at 1:04

GoogleCodeExporter commented 8 years ago

> I don't know any Direct3D9 texture/surface format which supports 
> pre-multiplied alpha. DXPMSAMPLE is stone age (DirectX 6.1).

Even so, is there a particular reason it couldn't be used? Some googling seems 
to suggest that all texture/surface formats with alpha can use either 
premultiplied or non-premultiplied alpha. It's not explicitly specified, so you 
just need to be aware of which format the alpha is stored in and process it 
correctly. That's something which is easily solved by just adding a variable 
for the alpha format type to the interface, isn't it?

Direct3D 9 (D3D9Types.h) still seems to support alpha blending using 
premultiplied alpha:

    // Linear alpha blend with pre-multiplied arg1 input: Arg1 + Arg2*(1-Alpha)
    D3DTOP_BLENDTEXTUREALPHAPM  = 15, // texture alpha
    D3DTOP_BLENDCURRENTALPHA    = 16, // by alpha of current color
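
To see what the formula in that header comment does, here is a per-channel Python sketch in floating point (values in 0..1). It assumes Arg1 is the premultiplied subtitle colour and Arg2 the video pixel. The function names are invented for this example and are not Direct3D calls.

```python
def blend_texture_alpha_pm(arg1, arg2, alpha):
    """Arg1 + Arg2 * (1 - Alpha), per colour channel, all values in 0..1."""
    return arg1 + arg2 * (1.0 - alpha)


def blend_straight(src, dst, alpha):
    """Conventional source-alpha blend: src * Alpha + dst * (1 - Alpha)."""
    return src * alpha + dst * (1.0 - alpha)
```

If Arg1 already holds src * Alpha, both functions produce the same value, so the premultiplied operation is the ordinary blend with the multiplication moved to the producer.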

Original comment by cyber.sp...@gmail.com on 30 Nov 2011 at 4:44

GoogleCodeExporter commented 8 years ago

----------------
> Abig = 255-Abig
> RGBbig = RGBbig / (256-Abig)
I found that I was wrong about this again. All calculations that follow it were 
wrong, too. Sorry for the confusion.
The correct integer translation should be:
> Abig = 255-Abig
> RGBbig = (RGBbig<<8) / (256-Abig)
After the conversion:
    ARGBbig = R:249 G:249 B:249 A:254 
Do a normal alphablending
    pix1 = (pix1 * (256-Abig) + ARGBbig * (Abig +1))>>8 
Get
    pix1 = R:248 G:248 B:248,
The error is not that unacceptable, and can maybe still be reduced by adding an 
extra rounding step when doing the conversion, like:
> RGBbig = ((RGBbig<<9) / (256-Abig) + 1)>>1
----------------
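The corrected conversion can be checked with the same numbers (premultiplied value 249, opacity 254 after inversion). This Python sketch assumes the divisor is opacity + 1, i.e. 256 minus the transparency before it is inverted, since that reading gives the 249 and 248 quoted above. The function names and the clamp to 255 are additions for the example.

```python
def blend(dst, src, a):
    """Straight-alpha blend with the (256-a)/(a+1) shift trick."""
    return (dst * (256 - a) + src * (a + 1)) >> 8


def to_straight(colour, opacity):
    """Un-premultiply with 8 extra bits of precision, truncating."""
    return min(255, (colour << 8) // (opacity + 1))


def to_straight_rounded(colour, opacity):
    """Same, with one more bit that is used to round to nearest."""
    return min(255, (((colour << 9) // (opacity + 1)) + 1) >> 1)


truncated = to_straight(249, 254)          # 249
rounded = to_straight_rounded(249, 254)    # 250
on_black_truncated = blend(0, truncated, 254)   # 248
on_black_rounded = blend(0, rounded, 254)       # 249
```

With the extra rounding step the blended result matches the 249 obtained by blending the layers one by one.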
But anyway, I still hope madshi can find a way to support pre-multiplied 
alpha (or even support a series that mixes pre-multiplied alpha and source 
alpha, if you don't like the rounding error introduced by pre-multiplied). 
Pre-multiplied alpha really makes life easier (and better? 
http://www.youtube.com/watch?v=dU9AXzCabiM).
----------------

Original comment by YuZhuoHu...@gmail.com on 30 Nov 2011 at 5:48

GoogleCodeExporter commented 8 years ago

If you just store the intermediates in 16-bit instead of limiting yourself to 
8-bit, the calculation error goes away (or rather, becomes so small that it can 
be ignored).
This is also true for using pre-multiplied alpha - no matter what you do, you 
should store the intermediate alpha blending results in 16-bit integer instead 
of 8-bit, or on subsequent blend operations you will lose data, quite a lot of 
it, too.
You would then after the whole blend operation just convert everything back to 
8-bit, possibly even with dithering for the highest quality.
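
A small Python sketch of that effect for one channel, comparing an 8-bit running value with one that carries 8 extra fractional bits (so it still fits in 16 bits). The layer values, the function names and the use of an exact division by 255 are choices made for this example only.

```python
def combine(layers, frac_bits):
    """Accumulate (colour, alpha) layers; the running value carries frac_bits extra bits."""
    acc = 0
    for colour, alpha in layers:
        acc = (acc * (255 - alpha) + (colour << frac_bits) * alpha) // 255
    if frac_bits:
        acc = (acc + (1 << (frac_bits - 1))) >> frac_bits   # round back to 8 bits
    return acc


def combine_float(layers):
    """Floating-point reference for the same accumulation."""
    acc = 0.0
    for colour, alpha in layers:
        acc = (acc * (255 - alpha) + colour * alpha) / 255.0
    return acc


layers = [(100, 100), (200, 50), (50, 30)]   # three translucent layers, bottom first
result_8bit = combine(layers, 0)             # 67
result_16bit = combine(layers, 8)            # 68
reference = combine_float(layers)            # about 68.30
```

For these three layers the 8-bit accumulator is already off by more than one step, while the 16-bit one lands on the nearest 8-bit value.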

One thing is for sure, though: I will need un-multiplied RGBA, and the consumer 
should have full control over what format it gets. I would also agree with 
madshi that unmolested RGBA should be the default format, because it's just more 
common (but in the end, i don't care about the default format, because i can 
choose which i need).

Original comment by h.lepp...@gmail.com on 30 Nov 2011 at 6:46

GoogleCodeExporter commented 8 years ago

Something else i noticed, shouldn't it be 255-Abig instead of 256? Using 256 
could account for the minor error you still have in there.

Original comment by h.lepp...@gmail.com on 30 Nov 2011 at 6:54

GoogleCodeExporter commented 8 years ago

Ok i updated my formula, simplified it a bit, and adjusted it for 0-255 Alpha 
instead of 0-1

 RGBbig = 0
 Abig = 255
 foreach(subpic)
   RGBbig = (RGBbig * (255-Asub) + RGBsub * Asub) >> 3
   Abig = Abig * 255-Asub
 end
 Abig = 255 - Abig
 if (Abig > 0)
   RGBbig = (RGBbig << 3) / Abig

Using this, i do get perfectly accurate results in your calculation example.
A subtitle with alpha = 0 is a bad example, though, because it does not change 
RGBbig/Abig at all - so it's really only one blend operation.

For more complex examples, like actually blending 3 different layers of 
subtitles on top of each other (all with alpha > 0), you will get calculation 
errors, however those can be avoided by simply increasing the bit depth of 
RGBbig/Abig to 16-bit.

Original comment by h.lepp...@gmail.com on 30 Nov 2011 at 7:11

GoogleCodeExporter commented 8 years ago

The shift by 3 should obviously be a shift by 8 <.< I need more coffee

Original comment by h.lepp...@gmail.com on 30 Nov 2011 at 7:12

GoogleCodeExporter commented 8 years ago

> shouldn't it be 255-Abig instead of 256
No. It is right. Because 0 stands for transparent and 255 stands for opaque. 
(RGB*255)>>256 = RGB - 1 (if RGB>0)
will introduce some big trouble. E.g. the text background is mostly fully 
transparent, so that -1 error will generate a visible box around the text.
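
The off-by-one is easy to check in Python for a fully transparent subtitle pixel (alpha 0), reading the shift as a shift by 8. The two function names are made up for this sketch.

```python
def blend_255(dst, src, a):
    """Weights 255-a and a, then shift by 8: loses one step even at a = 0."""
    return (dst * (255 - a) + src * a) >> 8


def blend_256(dst, src, a):
    """Weights 256-a and a+1, then shift by 8: a = 0 leaves dst untouched."""
    return (dst * (256 - a) + src * (a + 1)) >> 8
```

With the 255 weights every non-zero video value under a transparent pixel drops by one, which is what would show up as a box around the text.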

Original comment by YuZhuoHu...@gmail.com on 30 Nov 2011 at 7:24

GoogleCodeExporter commented 8 years ago

[deleted comment]

GoogleCodeExporter commented 8 years ago

> If you just store the intermediates in 16-bit 

It can be done in floating point, too. -_-!

>   RGBbig = RGBbig / Abig
>   pix1 = ... + RGBbig * Abig
I don't think it is that hard to see that source alpha is forcing you to do 
something nonsensical.

> In LAV Video, it would also be quite 
> possible that i would need to blit 
> onto a YUV image, which means i would 
> need to convert the RGBA to YUVA - if 
> its still plain RGBA, i can just send
> it through my existing setup for 
> conversion to YUV.
Converting RGBA data with premultiplied alpha into YUVA differs very little (if 
at all) from a normal RGBA -> YUVA conversion. VSFilter's old code does it 
right. xy-VSFilter even outputs YUVA with premultiplied alpha directly. It all 
works fine.
By the way, premultiplied alpha is nothing new and is really widely used (in 
Flash/Photoshop).
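
A small sketch of why the conversion barely changes: the luma part of an RGB -> YUV matrix is a plain weighted sum, so scaling the RGB input by alpha scales the output by the same alpha. The BT.601 weights below are only an example choice, the function name is made up, and any constant offset a real conversion adds to the chroma planes would need separate handling.

```python
KR, KG, KB = 0.299, 0.587, 0.114   # BT.601 luma weights, used here only as an example


def luma(r, g, b):
    """Weighted sum of the colour channels; linear in its inputs."""
    return KR * r + KG * g + KB * b
```

Because luma() is linear, luma(a*r, a*g, a*b) equals a * luma(r, g, b), i.e. converting premultiplied RGB gives premultiplied Y directly.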

Original comment by YuZhuoHu...@gmail.com on 30 Nov 2011 at 7:42

GoogleCodeExporter commented 8 years ago

Hmmmmm... After reading this article I'm not sure what to think:

http://home.comcast.net/~tom_forsyth/blog.wiki.html#[[Premultiplied%20alpha]]

I don't care much about the bilinear scaling problems because the way we 
designed the interface, I expect the subtitle renderer to provide me with 
subtitles which are already scaled to the target output resolution, so I don't 
have to rescale them. However, the "Compositing translucent layers" part of the 
article seems to indicate that using premultiplied alpha is actually more 
correct, mathematically, when combining multiple transparent layers, before 
rendering them on the final image. Ok, that's not what I'm doing, either, but 
it is what vsfilter, libass and probably also LAV Video Decoder will have to 
do. @Hendrik, can you please double check that part of the article and confirm?

I didn't know D3DTOP_BLENDTEXTUREALPHAPM. It defines the operation as:

S(RGB) = Arg1 + Arg2 * (1 - alpha)

Is this what would be needed for using premultiplied alpha on the GPU? It seems 
so to me, but it's too early in the morning for me to be sure.

I would really hate to get a mixture of source alpha and premultiplied alpha. I 
guess I could live with either one (provided D3DTOP_BLENDTEXTUREALPHAPM is 
alright, please confirm), but all subtitle bitmaps should share the same alpha 
type.

Original comment by mad...@gmail.com on 30 Nov 2011 at 8:01

GoogleCodeExporter commented 8 years ago

> S(RGB) = Arg1 + Arg2 * (1 - alpha)

It is ok.

Original comment by YuZhuoHu...@gmail.com on 30 Nov 2011 at 8:09

GoogleCodeExporter commented 8 years ago

> Because 0 stands for transparent and 255 stands for opaque. (RGB*255)>>256 = 
RGB - 1 (if RGB>0)

Exactly, the value range is 0-255, i don't get where any of the 256's come 
from, it's not even an 8-bit number anymore, and would overflow the number range.

See comment 120, i used 255 everywhere, and the result is exactly right.
If the Alpha is 0, the inverted value (255-Alpha) should be 255, and not 256 - 
the same for the other extreme, when the Alpha is fully opaque (255), the 
inverted value should be 0, and not 1, otherwise stuff bleeds through.

> @Hendrik, can you please double check that part of the article and confirm?

The article seems to make sense, i guess.
After finally managing to read the article (somehow it didn't show any text on 
the company PC, had to read it on my phone <.<), using pre-multiplied seems 
like the correct thing to do, assuming it's easy enough to use for alpha 
blending in hardware.

Original comment by h.lepp...@gmail.com on 30 Nov 2011 at 8:31

GoogleCodeExporter commented 8 years ago

> i don't get where any of the 256's come from
256 is a typo.

(RGB*(255-Alpha)+Alpha*what_ever)>>8 
= (RGB*255)>>8
= (RGB*256 - RGB)>>8,
if Alpha=0.
Assuming you're using 8 bits, then if RGB>0
(RGB*256 - RGB)>>8 = RGB - 1
If you're not,
(RGB*256 - RGB)>>8 = RGB - RGB_hibit_then_8 - (RGB_lo_8bit>0)
which is even harder to adjust back to RGB.
So if you're using 16 bits, it should at least be done this way:
(RGB*(0xFFFF-Alpha*256)+Alpha*256*what_ever)>>16 = (RGB*0x10000 - RGB)>>16
Then you get
RGB - 1, if RGB>0,
which is easier to adjust back to RGB.

While if Alpha=255, there is another error in the above:
(RGB*(255-Alpha)+Alpha*what_ever)>>8 
= (what_ever*255)>>8
= what_ever - 1, if what_ever>0.

Original comment by YuZhuoHu...@gmail.com on 30 Nov 2011 at 8:49

GoogleCodeExporter commented 8 years ago

Weird, the article shows just fine on my PC, using Firefox 9 Beta.

So after reading the article, is there a consensus to use premultiplied alpha 
now? If so, probably we should simply fix it to that and not allow "standard" 
alpha at all, to make the interface at least simple. I don't really like going 
against the normal RGBA format, but it seems to be the more correct thing to 
do, so we'll have to swallow that pill, I guess?

Original comment by mad...@gmail.com on 30 Nov 2011 at 10:17

GoogleCodeExporter commented 8 years ago

Any more comments on the premultiplied topic?

Any comments on the latest version of the interface?

Original comment by mad...@gmail.com on 4 Dec 2011 at 9:48

GoogleCodeExporter commented 8 years ago

Would there be any advantage of adding optional AYUV support to the interface?

Original comment by cyber.sp...@gmail.com on 4 Dec 2011 at 12:26

GoogleCodeExporter commented 8 years ago

While AYUV may have a (very minor) advantage in 1-2 obscure situations, i don't 
think its a good idea to further increase the complexity.

What might be interesting is if the video renderer can supply the subtitle 
renderer with the info which RGB matrix to use, in case it has to convert YUV 
subtitles to RGB.

Otherwise, the latest interface seems simple enough to use, and pretty much 
complete for our requirements.

Original comment by h.lepp...@gmail.com on 4 Dec 2011 at 4:50

GoogleCodeExporter commented 8 years ago

So what is the decision on alpha? Should we use premultiplied alpha, only? That 
would make the interface simpler. Or should we offer an option? If so, which 
should be the default?

Are there situations where the subtitle renderer needs to convert YUV subtitles 
to RGB? Should be no problem to add an information "field" for that. Hmmmmm... 
What happens if the source is encoded in e.g. YCgCo? I'm wondering whether it's 
"correct" to use YCgCo for subtitle YUV -> RGB conversion, then? I guess the 
subtitle writers probably will have used BT601 or BT709 instead? Or maybe not...

Original comment by mad...@gmail.com on 4 Dec 2011 at 5:00

GoogleCodeExporter commented 8 years ago

Unless nevcairiel has any objections, make it use premultiplied alpha only.

Original comment by cyber.sp...@gmail.com on 4 Dec 2011 at 6:33

GoogleCodeExporter commented 8 years ago

I'd also go a step further and specify that the video renderer is not allowed 
to 'unmultiply' the alpha before performing alphablending.

Original comment by cyber.sp...@gmail.com on 4 Dec 2011 at 6:36

GoogleCodeExporter commented 8 years ago

It's fine, i guess. Easy enough to convert back on the consumer side should the 
need ever arise.

Original comment by h.lepp...@gmail.com on 4 Dec 2011 at 6:36

GoogleCodeExporter commented 8 years ago

> I'd also go a step further and specify that the video renderer is not allowed 
to 
> 'unmultiply' the alpha before performing alphablending.

What the video renderer does or does not do is really its own business, imho.
The comment could specify that it's advised to apply it directly without 
un-multiplying, but really, any developer not smart enough to see that he would 
be doing the same calculation backwards again when blending would not listen to 
a comment anyway.

Original comment by h.lepp...@gmail.com on 4 Dec 2011 at 6:50

GoogleCodeExporter commented 8 years ago

[deleted comment]

GoogleCodeExporter commented 8 years ago

I still need feedback for the various questions in Comment #113.

Original comment by mad...@gmail.com on 4 Dec 2011 at 6:55

GoogleCodeExporter commented 8 years ago

Supposedly, if you unmultiply premultiplied alpha and then proceed to 
alphablend as normal alpha, it causes quality problems. Maybe it would be best 
to change it up a bit in an attempt to discourage this from occurring.
___
Remove the provider's field for premultiplied alpha and make it the default. In 
this way, all subtitle renderers supporting this interface will be required to 
support premultiplied alpha.

In its place add a consumer bool field for stdRGBAonly for use when the 
consumer doesn't support premultiplied alphablending. If set to TRUE, the 
connection would be refused by the provider unless standard RGBA output is 
supported. 
___

Original comment by cyber.sp...@gmail.com on 4 Dec 2011 at 7:37

GoogleCodeExporter commented 8 years ago

For premultiplied alpha:
I'd like to use premultiplied alpha only, because it simplifies the work on 
both sides and premultiplied alpha is better for alpha blending purposes. For 
consumers that need to do an extra conversion (like argb2ayuv or scaling, 
especially UV downsampling) on the bitmaps, premultiplied alpha is better, too. 
And anyway, it is possible for the consumer to convert premultiplied alpha to a 
normal alpha format, too.

> The subtitle renderer is now responsible for making the connection
Ok (though I'm not really sure how to do that right now). 

> Auto-loading
The consumer searches the graph for the subtitle renderer; if none is found, it 
loads one, and the subtitle renderer carries out the detection work after 
loading. 

> Combining small RGBA bitmaps 
I think the only way for the consumer to force the subtitle renderer to combine 
bitmaps is to give a small upper bound on the bitmap series. A limit on the 
bitmaps' size won't help, because the subtitle renderer may *resize* every 
bitmap to the limit instead of combining them (this makes sense when there 
is only one (or very few) small bitmap(s)).

Original comment by YuZhuoHu...@gmail.com on 5 Dec 2011 at 4:47

GoogleCodeExporter commented 8 years ago

Combining small RGBA bitmaps:

I wouldn't know which number of max subtitles to give to the subtitle renderer. 
That seems like a random choice to me. I think we need to find a better way to 
optimize performance. Ideally, I'd like to leave this up to the subtitle 
renderer. Can't you think of some clever algorithm which e.g. combines all 
subtitle elements which are smaller than 4x8 and which are nearer in position 
than 40x80 pixels into one subtitle element? The consumer simply doesn't have 
any information about these kinds of things. E.g. imagine a particle effect with 
50 pixel-sized bitmaps, and imagine another particle effect with 200 
pixel-sized bitmaps. If the consumer sets a limit of 100 bitmaps, the particle 
effect with 50 pixel-sized bitmaps would still get through uncombined. If I set the 
limit to 10 bitmaps and there are no particle effects, the subtitle renderer 
will have to do more combination work than would be good. I think the subtitle 
renderer should decide for itself, by using some clever algorithm.

Hmmmm... I think it might be best for the subtitle renderer to combine all 
subtitle bitmaps which are near in position, regardless of size. This way if 
there's a subtitle line at the top of the screen, and another subtitle line at 
the bottom of the screen, the subtitle renderer would provide the consumer with 
just 2 bitmaps.

Thoughts?

Original comment by mad...@gmail.com on 5 Dec 2011 at 8:50

GoogleCodeExporter commented 8 years ago

@113:

> (1) The subtitle renderer is now responsible for making the connection.
Looks good!

> (2) I've renamed the "option"s to "field"s. I'm not sure about the best name 
for this.
It's just names, i don't care. It's OK as is.

> (3) Please check the mandatory and optional "fields".
Make refreshRate a mandatory field, but keep the "0 if unknown" clause, that 
already makes it semi-optional, otherwise you have to check two error cases.
What's the intended return value if an optional field is not supported - just 
E_FAIL?

I would also do away with the preMultiply option and agree with YuZhuoHuang to 
only support one format.

> (4) Due to having the subtitle renderer initiate the connection now, I've 
been able to get along with one less interface.
Looks good.

> Auto-loading: We need to find a definite solution for this. Which of the 
suggested approaches do you 
> like most? Or maybe: Which do you dislike the least?

Since i generally never have external subtitles, i'm fine with the one that 
involves me doing no work. :)
However, i do have some plans to allow LAV Splitter to load post-processors to 
help "dumb" players to build a proper graph. Not sure how filters without any 
input/output pins will behave, though - some events generally travel through 
the pins, and without any pins, it might miss out. No way to know before we 
try, though.

> Combining small RGBA bitmaps

I would just leave it to the sub renderer, and not set any limit.
The comment could specify that it's recommended to combine very small bitmaps, 
and warn about the performance penalty.
libass for example provides a very long list of bitmaps in one color each (1bit 
alpha map + 32-bit color), it would be insane to send them as-is, so i would 
always combine them into bigger bitmaps (maybe one, maybe multiple if the gap 
between zones is big enough)
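
One possible reading of "combine unless the gap between zones is big enough", sketched in Python on bounding rectangles only. The (x0, y0, x1, y1) layout, the max_gap threshold and all function names are assumptions for this example, and it ignores the actual pixel blending.

```python
def gap(a0, a1, b0, b1):
    """Distance between two 1-D intervals, 0 if they touch or overlap."""
    return max(0, max(a0, b0) - min(a1, b1))


def near(a, b, max_gap):
    """True if rectangles a and b are within max_gap pixels in both directions."""
    return (gap(a[0], a[2], b[0], b[2]) <= max_gap and
            gap(a[1], a[3], b[1], b[3]) <= max_gap)


def union(a, b):
    """Bounding box of two rectangles."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_nearby(rects, max_gap):
    """Greedily merge rectangles until no two remaining ones are near each other."""
    merged = list(rects)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if near(merged[i], merged[j], max_gap):
                    merged[i] = union(merged[i], merged[j])
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged
```

Two pieces of a line at the top of the frame end up in one rectangle, while a line at the bottom stays separate, so the consumer would receive two bitmaps.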

Original comment by h.lepp...@gmail.com on 5 Dec 2011 at 9:36

GoogleCodeExporter commented 8 years ago

Good, I think we're nearly finished now.

@Hendrik, not sure about error values, I guess any that results in 
"FAILED"/"SUCCEEDED" to work as expected would be ok with me. If you have any 
preferences for specific error values I'd be fine with adding that to the 
header.

@YuZhuoHuang, could you live with the "let the subtitle renderer decide for 
itself how to best combine small bitmaps" approach? It seems both Hendrik and I 
would prefer it that way.

About auto-loading, we let the consumer load the subtitle renderer manually. 
Still, two options remain:
(1) Consumer finds the subtitle renderer by looking at a specific registry 
value.
(2) Consumer finds the subtitle renderer by remembering which subtitle renderer 
was used the last time. This would start working after the first time a file 
with an *internal* subtitle track was played.

I'd be fine with both approaches. Opinions?

Original comment by mad...@gmail.com on 5 Dec 2011 at 10:08

GoogleCodeExporter commented 8 years ago

> "let the subtitle renderer decide for itself how to best combine small 
bitmaps"

Ok. That's the only way, maybe.

Original comment by YuZhuoHu...@gmail.com on 5 Dec 2011 at 11:24

GoogleCodeExporter commented 8 years ago

Something else came to me right now, not sure if/how relevant this is.

How about multiple subtitle providers? Did we cover that? I briefly skimmed 
through the 145 posts, but couldn't find anything.

Rationale:
A subtitle renderer might only support one specific format, but is the 
preferred renderer for this format. Another subtitle format (an external file, 
maybe) triggers a second subtitle renderer to load. Maybe, for some arcane 
reason, the user wants to see both subtitles.

Is this a reasonable use-case? Is the interface supposed to handle multiple 
providers, and if it's not, would it make sense to adjust it to?

I'm not sure it's a useful requirement, i just wanted to bring it up because i 
don't think it was mentioned before (or i missed it).
I'm fine with dismissing it as useless or too complicated.

Original comment by h.lepp...@gmail.com on 5 Dec 2011 at 12:30

GoogleCodeExporter commented 8 years ago

> multiple subtitle providers

Maybe this can be supported with this interface too.
A subtitle manager that supports this interface both as a consumer and as a 
subtitle renderer could be implemented to deal with all subtitle formats. It can 
use several other subtitle renderers internally to deal with the different 
subtitle formats. It delivers all (user selected) subtitles to the corresponding 
internal subtitle renderers, and gets the subpic results back as a consumer. 
Then it *combines* all the results and returns them to its consumer.
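
A rough sketch of such a manager, with loud assumptions: `SubPic`, `RenderFn` and `SubtitleManager` are hypothetical names, the internal renderers are modelled as plain callbacks rather than through the real interface, and "combining" is simplified to concatenating the subpic lists.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical subpicture: a position plus a label instead of pixel data.
struct SubPic { int x, y; std::string source; };

// Hypothetical internal renderer: returns the subpics visible at a given time.
using RenderFn = std::function<std::vector<SubPic>(long long timeMs)>;

// The manager acts as a consumer towards its internal renderers and as a
// single subtitle renderer towards the real consumer.
class SubtitleManager {
public:
    void addRenderer(RenderFn fn) {
        assert(fn != nullptr);
        renderers_.push_back(std::move(fn));
    }

    // Collect the subpics of every internal renderer, in registration order.
    std::vector<SubPic> render(long long timeMs) const {
        std::vector<SubPic> all;
        for (const RenderFn& fn : renderers_) {
            std::vector<SubPic> part = fn(timeMs);
            all.insert(all.end(), part.begin(), part.end());
        }
        return all;
    }

private:
    std::vector<RenderFn> renderers_;
};
```

The real consumer only ever sees one provider, which is the point of the manager approach.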

Original comment by YuZhuoHu...@gmail.com on 5 Dec 2011 at 1:07

GoogleCodeExporter commented 8 years ago

I like the idea of a "manager", all that's really needed to support this in the 
interface is a new merit for the manager, higher than the video renderer. 
Everything else seems to be possible already.

A "normal" consumer would just fail on connection of a second subtitle 
renderer, refusing the connection, but if it implements the manager feature, it 
would allow multiple connections.
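
The connection policy could be sketched roughly like this. `Consumer`, `SubRenderer` and `Connect` are hypothetical stand-ins, not the names from the interface header, and a plain `bool` replaces the HRESULT a real implementation would return.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a subtitle renderer.
struct SubRenderer { int id; };

// A consumer that accepts up to maxProviders connections. A "normal"
// consumer would use 1; one with the manager feature uses a larger limit.
class Consumer {
public:
    explicit Consumer(std::size_t maxProviders) : max_(maxProviders) {
        assert(maxProviders >= 1);
    }

    // Returns true if the connection is accepted, false if it is refused.
    bool Connect(SubRenderer* renderer) {
        if (renderer == nullptr || providers_.size() >= max_) return false;
        providers_.push_back(renderer);
        return true;
    }

    std::size_t count() const { return providers_.size(); }

private:
    std::size_t max_;
    std::vector<SubRenderer*> providers_;
};
```

The second renderer to connect to a `Consumer(1)` is refused, while a manager-style consumer keeps accepting connections up to its limit.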

Original comment by h.lepp...@gmail.com on 5 Dec 2011 at 1:46

GoogleCodeExporter commented 8 years ago

Autoloading: Unless there are different votes, I've decided to use a registry 
value for the auto-loading.

Multiple subtitle providers: I see two options here:

(1) We could leave it up to the consumer to support multiple providers. In the 
latest interface we're letting the subtitle renderer initiate the connection. 
If multiple subtitle renderers try to connect to the same consumer, the 
consumer could accept them all and render all provided subpics from all 
providers. I see no major problem with that, and it would work with the 
interface as it is. Not sure if I would implement that in madVR, but it would 
be the decision of the consumer whether it wants to support multiple providers 
(at the same time) or not.

(2) Someone could write a "manager", as suggested by YuZhuoHuang. The manager 
should then be written to that registry value for auto-loading. The manager 
could have its own settings dialog where the user can choose a subtitle 
provider for external subtitle tracks. So the manager would be auto-loaded 
through the registry key, and would then auto-load further subtitle providers 
as needed.

The disadvantage of solution (1) would be that every consumer would have to add 
extra code to support multiple providers. Solution (2) would mean that all 
consumers only need to support one provider, which would make consumer 
programming easier. But somebody would have to invest the time to develop a 
manager.

Thoughts?

Original comment by mad...@gmail.com on 5 Dec 2011 at 1:47

GoogleCodeExporter commented 8 years ago

It's not really anything new, it's just a high-priority consumer that combines 
all sources it finds and forwards to the real consumer (video renderer).

I would just stick with the idea, maybe one day we find time to implement it in 
a separate filter, or maybe xy-vsfilter implements it, offering stream 
switching and the works. Its not a crucial feature, and nothing that really 
requires changes to the interface.

Original comment by h.lepp...@gmail.com on 5 Dec 2011 at 1:59
