brabebhin closed this 6 years ago
@mcosmin222 Have you tested the TimedMetadataStreamDescriptor which is in the preview SDK? What are its limitations?
I have and it doesn't look good. The API seems rather incomplete. Another thing I don't like about it is that it is not supported in earlier versions of Windows, so if we were to implement a push model, it would be backwards compatible as well.
The way I imagine it is to start a parallel run to extract the subtitle stream and pass it to TimedMetadatSource.
OK, we can get the subtitle data directly from the packets belonging to the subtitle streams. The buf member contains the text cue data.
I will also make an educated guess that pos is the presentation timestamp and dur is the length of time the cue will stay visible, so we can deduce the StartTime and EndTime from that. Technically, we can push cues inside a TimedMetadataTrack.
The MediaStreamDescriptors are not only used for playback but also for encoding and transcoding. They have other features that do not work for playback, such as Label and Language. If TimedMetadataStreamDescriptor was intended for subtitles, then I would expect constructors to create them for a subtitle format, like AudioStreamDescriptor and VideoStreamDescriptor have. But there are no such constructors. And there are also no MediaSubtypes defined for any of the subtitle formats. So to me it does not look like they made this for MediaStreamSource playback but rather for some encoding stuff.
And as I said, I have doubts whether they can work with a pull model. What duration should we set for the mock frame when we do not know when the next subtitle might come? We might set too long a duration and then miss a subtitle. A pull model does not really work for a stream that does not continuously provide samples.
Extracting all subtitles from a file would take a considerable amount of time. No one wants to wait 20 seconds before playback even starts.
I am currently busy with other things, so I probably won't be working on this anytime soon. But this is what I had in mind: We could add an event like SubtitleDecoded, which will fire each time a subtitle packet is read. The event would have both a String and an IBuffer property, since ffmpeg supports both text and image subtitle formats. Then we can write a C# companion lib which handles the subtitle events and adds the subtitles on-the-fly to a TimedMetadataTrack. I have already done some experiments with TimedMetadataTrack a few months back. I could add subtitles on the fly to display during playback, so this should definitely work. But we'd need to parse the subtitles and get the plain text out of them. I think that would be difficult in C++, so I would rather do that in a C# lib.
@lukasf, I agree with the event idea. I was thinking of something roughly similar. I am currently working on stripping the subtitles (text so far) from ffmpeg and triggering an event handler or something with them.
@mcosmin222 The subtitle packets contain timestamp and duration of the subtitles. I already inspected the packets of a few sample files when I did my testing back then. Each packet contains one subtitle entry and the packets are usually muxed a few seconds before video, so we should have enough time to add them as TimedTextCue.
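To make the timing step concrete, here is a minimal sketch of turning a packet's timestamp and duration into a cue start and duration. In FFmpeg's AVPacket the relevant fields are pts and duration, both expressed in units of the stream's time_base (the pos field is a byte offset in the file, not a timestamp). The structs and function names below are hypothetical stand-ins so the example compiles on its own; they are not FFmpegInterop types.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the FFmpeg types involved: the real ones are
// AVPacket (pts, duration) and AVRational (the stream's time_base).
struct Rational { int num; int den; };
struct SubtitlePacket { int64_t pts; int64_t duration; };

// Cue timing in 100-nanosecond ticks, the unit used by a Windows TimeSpan.
struct CueTiming { int64_t startTicks; int64_t durationTicks; };

const int64_t kTicksPerSecond = 10000000;

// Converts a value in time_base units to 100 ns ticks.
// Note: this simple form can overflow for extremely large timestamps.
int64_t ToTicks(int64_t value, Rational timeBase)
{
    return value * timeBase.num * kTicksPerSecond / timeBase.den;
}

// Derives the cue start and duration from one subtitle packet.
// Any stream start offset is ignored here for simplicity.
CueTiming GetCueTiming(const SubtitlePacket& packet, Rational timeBase)
{
    CueTiming timing;
    timing.startTicks = ToTicks(packet.pts, timeBase);
    timing.durationTicks = ToTicks(packet.duration, timeBase);
    return timing;
}
```

For a stream with a time base of 1/1000 (milliseconds), a packet with pts 5000 and duration 2500 maps to a cue that starts at 5 seconds and lasts 2.5 seconds.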
Maybe we should still wait a while for the final SDK to be out? It would be bad if we start work now and then find out that MS has added the missing APIs in the final version. But I think the chances are rather low.
Oh you already started. Yeah, maybe we should just roll our own solution. I don't have so much time right now, but maybe I can also help out a bit.
@lukasf If I am not mistaken, even if Microsoft were to update the SDK (highly unlikely), we would still need the demux part, and we would replace the event with feeding the sample instead. So it is not like we are really wasting time...
Yeah, we'd need the same infrastructure in the interop lib, the sample providers for subtitles.
@lukasf It seems we only need a strongly typed CompressedSampleProvider - call it SubtitlesProvider - and its QueuePackage method should just fire the event. There is no decoding happening here, at least for text-based subtitles. Not sure about the bitmap-based subtitles, but in that case we could create a different class for it.
Yes, agreed.
I just put a question up in the MSDN forums about TimedMetadataStreamDescriptor. I will post here if I get something from the MS side.
OK, looks like I might have some rudimentary thing going. I will merge with your latest changes and push it to the open PR.
Some more food for thought: maybe we do not need the C# library, nor play around with the event handlers: we could expose a TimedMetadataTrack from a SubtitleStreamInfo (we will have to do this anyway). We can push the TimedTextSamples directly onto it inside the library, and the user will get to use the API in a safer way, and we do not alienate potential C++ consumers.
I am still not sure about the parsing of image part. But maybe ffmpeg is capable of doing that too?
Cool, looks good! I would have done this as a separate PR but it does not matter much.
Yes we could also do this all in C++, it would sure be easier to use. I am just not sure how easy it is to parse the subtitle text in C++. I am more a C# developer, I do not know all the C++ APIs. But maybe it would be a good exercise.
I have just searched a bit on image based subtitles and I currently don't think that ffmpeg can do the image decoding. There is a subtitle filter, but it only seems to work with external files. The images are encoded in some kind of palettized bitmap (usually 4bit), but I could not find the exact specifications for that. Maybe we should just concentrate on text formats first, they are also the most common formats.
Maybe manually converting the bmp to a WriteableBitmap (which is required for ImageCues) is not even that hard. We have the 16 palette colors from codec extradata (4 bit), then we just have to read the bmp lines and translate the 4bit value to the corresponding rgb value.
https://matroska.org/technical/specs/subtitles/images.html
Still, it's probably the second step, after text subs work.
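The palette translation described above could look roughly like the sketch below. It rests on several assumptions that are not confirmed anywhere in this thread: the pixel data is already unpacked into plain 4-bit indices (two pixels per byte, high nibble first, rows padded to whole bytes), and the 16 palette entries are stored as 0xAARRGGBB. Real subtitle formats may store pixels differently (for example run-length encoded), so a decoding step may be needed first. The function name ExpandPalette4 is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expands a 4-bit palettized image into 32-bit BGRA bytes, the channel order
// used by a WriteableBitmap pixel buffer.
// Assumed layout: two pixels per byte, high nibble first, each row padded to
// a whole byte; palette entries are 0xAARRGGBB and there are 16 of them.
std::vector<uint8_t> ExpandPalette4(const std::vector<uint8_t>& packed,
                                    int width, int height,
                                    const std::vector<uint32_t>& palette)
{
    const size_t stride = static_cast<size_t>(width + 1) / 2; // bytes per row
    std::vector<uint8_t> bgra(static_cast<size_t>(width) * height * 4);

    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            const uint8_t byte = packed[y * stride + x / 2];
            const uint8_t index = (x % 2 == 0) ? (byte >> 4) : (byte & 0x0F);
            const uint32_t color = palette[index];

            const size_t o = (static_cast<size_t>(y) * width + x) * 4;
            bgra[o + 0] = static_cast<uint8_t>(color & 0xFF);         // blue
            bgra[o + 1] = static_cast<uint8_t>((color >> 8) & 0xFF);  // green
            bgra[o + 2] = static_cast<uint8_t>((color >> 16) & 0xFF); // red
            bgra[o + 3] = static_cast<uint8_t>((color >> 24) & 0xFF); // alpha
        }
    }
    return bgra;
}
```

This sketch copies the alpha value as-is. WriteableBitmap works with premultiplied alpha, so semi-transparent palette entries would need their color channels scaled before the buffer is displayed.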
One thing that was a bit difficult with TimedTextCue is the formatting of the text. The documentation of the vast number of properties is extremely poor. It is really hard to understand what values are needed for style and positioning to get the text to display correctly below the image (by default it appears in the top left corner). There is no simple alignment/margin as you would expect. What I did when I was testing this was to load an external ass file and inspect the style and stuff in the loaded TimedTextCues, then I applied the same values to my custom cues. This worked well for initial testing, but maybe the values depend on the image size/aspect.
I would have expected the system to handle the styles by itself since the user can change properties in settings. This might get tricky.
Ok, it seems we only need to set the alignment (to the bottom of the screen). Although I'd rather not hard code this but pass it through the Config file, so applications can customize this as they see fit. We can default it to the bottom middle of the screen. Looks like the system is handling the rest of the properties.
I am wondering if this is a bug, it looks weird that the system initializes all the styles properly but forgets the alignment
When I tried it, I had no luck with alignment, but if you got it working, then that's the best option. I think it would be ok to hard code to bottom, and then add a vertical offset property and maybe scale (font size). But you can add more properties if you think it makes sense.
Or how about this: Just add a SubtitleCueStyle and SubtitleCueRegion property to config class. If users assign values, use them, and if they are empty, use our default. That way, we don't have to duplicate all individual properties.
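The "use the user's value if assigned, otherwise our default" idea can be sketched in a few lines. The types below are simplified, hypothetical stand-ins (a real config would hold TimedTextStyle and TimedTextRegion objects); a null pointer plays the role of an unassigned property.

```cpp
#include <memory>
#include <string>

// Hypothetical, simplified stand-in for a cue style.
struct CueStyle
{
    std::string fontFamily;
    double fontSizePercent;
};

// Hypothetical config class: a null subtitleStyle means "not assigned".
struct InteropConfig
{
    std::shared_ptr<CueStyle> subtitleStyle;
};

// The library's built-in default style (values chosen for illustration).
CueStyle DefaultCueStyle()
{
    CueStyle style;
    style.fontFamily = "default";
    style.fontSizePercent = 100.0;
    return style;
}

// Returns the user's style when one was assigned, otherwise the default.
CueStyle ResolveCueStyle(const InteropConfig& config)
{
    if (config.subtitleStyle)
    {
        return *config.subtitleStyle;
    }
    return DefaultCueStyle();
}
```

With this shape, the config only carries two optional objects instead of one property per style value, and the default can change later without affecting applications that set their own style.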
I haven't managed to align it yet. I just got the text rendered.
I added a question to MSDN....
Usually these guys aren't really helpful for advanced topics - they also have no docs to work with so can't really blame them - but maybe we are missing something.
After fiddling for an hour, I now had success with the set below. Text is displayed on bottom (Padding) and centered horizontally (Style).
var cue = new TimedTextCue
{
    Id = "Test",
    StartTime = TimeSpan.Zero,
    Duration = TimeSpan.FromSeconds(10),
    CueRegion = new TimedTextRegion
    {
        Extent = new TimedTextSize { Unit = TimedTextUnit.Percentage, Width = 100, Height = 100 },
        Position = new TimedTextPoint { Unit = TimedTextUnit.Pixels, X = 0, Y = 0 },
        DisplayAlignment = TimedTextDisplayAlignment.Before,
        Background = Windows.UI.Colors.Transparent,
        ScrollMode = TimedTextScrollMode.Rollup,
        TextWrapping = TimedTextWrapping.Wrap,
        WritingMode = TimedTextWritingMode.LeftRightTopBottom,
        IsOverflowClipped = true,
        ZIndex = 0,
        LineHeight = new TimedTextDouble { Unit = TimedTextUnit.Percentage, Value = 100 },
        Padding = new TimedTextPadding { Unit = TimedTextUnit.Percentage, Start = 90 },
        Name = ""
    },
    CueStyle = new TimedTextStyle
    {
        FontFamily = "default",
        FontSize = new TimedTextDouble { Unit = TimedTextUnit.Percentage, Value = 100 },
        LineAlignment = TimedTextLineAlignment.Center,
        FontStyle = TimedTextFontStyle.Normal,
        FontWeight = TimedTextWeight.Normal,
        Foreground = Windows.UI.Colors.White,
        Background = Windows.UI.Colors.Transparent,
        //OutlineRadius = new TimedTextDouble { Unit = TimedTextUnit.Percentage, Value = 10 },
        OutlineThickness = new TimedTextDouble { Unit = TimedTextUnit.Percentage, Value = 5 },
        FlowDirection = TimedTextFlowDirection.LeftToRight,
        OutlineColor = Windows.UI.Colors.Black,
    },
};
When ScrollMode=Popon (which is default), I always got ArgumentException. Pretty weird that the default does not seem to work. That took me quite some painful time to find out.
OK. I shall see how to add that in the code. Thanks ^^
Yeah sorry I used C# for simplicity :)
np, it shouldn't take too long.
auto CueRegion = ref new TimedTextRegion();
TimedTextSize extent;
extent.Unit = TimedTextUnit::Percentage;
extent.Width = 100;
extent.Height = 100;
CueRegion->Extent = extent;
TimedTextPoint position;
position.Unit = TimedTextUnit::Pixels;
position.X = 0;
position.Y = 0;
CueRegion->Position = position;
CueRegion->DisplayAlignment = TimedTextDisplayAlignment::Before;
CueRegion->Background = Windows::UI::Colors::Transparent;
CueRegion->ScrollMode = TimedTextScrollMode::Rollup;
CueRegion->TextWrapping = TimedTextWrapping::Wrap;
CueRegion->WritingMode = TimedTextWritingMode::LeftRightTopBottom;
CueRegion->IsOverflowClipped = true;
CueRegion->ZIndex = 0;
TimedTextDouble LineHeight;
LineHeight.Unit = TimedTextUnit::Percentage;
LineHeight.Value = 100;
CueRegion->LineHeight = LineHeight;
TimedTextPadding padding;
padding.Unit = TimedTextUnit::Percentage;
padding.Start = 90;
CueRegion->Padding = padding;
CueRegion->Name = "";
auto CueStyle = ref new TimedTextStyle();
CueStyle->FontFamily = "default";
TimedTextDouble fontSize;
fontSize.Unit = TimedTextUnit::Percentage;
fontSize.Value = 100;
CueStyle->FontSize = fontSize;
CueStyle->LineAlignment = TimedTextLineAlignment::Center;
CueStyle->FontStyle = TimedTextFontStyle::Normal;
CueStyle->FontWeight = TimedTextWeight::Normal;
CueStyle->Foreground = Windows::UI::Colors::White;
CueStyle->Background = Windows::UI::Colors::Transparent;
//OutlineRadius = new TimedTextDouble { Unit = TimedTextUnit.Percentage, Value = 10 },
TimedTextDouble outlineThickness;
outlineThickness.Unit = TimedTextUnit::Percentage;
outlineThickness.Value = 5;
CueStyle->OutlineThickness = outlineThickness;
CueStyle->FlowDirection = TimedTextFlowDirection::LeftToRight;
CueStyle->OutlineColor = Windows::UI::Colors::Black;
Unfortunately it is not working correctly: if there is more than one line, the text gets cut off. I will upload the code anyway for you to test it. There is also a weird interaction between the custom media transport controls and the text. The text does not move up when the controls pop up, and the controls cut it off.
Regardless, we are closer to this than we have ever been so far. Perhaps we should ask Microsoft for help somehow?
Okay I could fix it. Removed Padding and used DisplayAlignment = After. I also added a very primitive SSA/ASS parser.
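For readers wondering what a very primitive SSA/ASS parser involves, the sketch below extracts plain text from one event line. It is an illustration under stated assumptions, not the parser that was committed: the Text field is the last comma-separated field, override blocks in curly braces are dropped, and line-break escapes become newlines. In an ASS script, a Dialogue line has nine fields before Text; subtitle packets coming out of a demuxer may use a different layout, so the number of leading fields is a parameter here. The function name ExtractAssText is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Extracts the display text from one SSA/ASS event line.
// leadingFields is the number of comma-separated fields before Text; it is
// an input because the layout differs between script lines and packets.
// Returns an empty string if the line has too few fields.
std::string ExtractAssText(const std::string& line, int leadingFields)
{
    // Skip the leading fields. The Text field itself may contain commas,
    // so only the first leadingFields commas are treated as separators.
    size_t pos = 0;
    for (int i = 0; i < leadingFields; ++i)
    {
        const size_t comma = line.find(',', pos);
        if (comma == std::string::npos)
        {
            return std::string();
        }
        pos = comma + 1;
    }

    std::string text;
    bool inOverride = false; // true while inside a {...} override block
    for (size_t i = pos; i < line.size(); ++i)
    {
        const char c = line[i];
        if (inOverride)
        {
            if (c == '}') inOverride = false;
            continue;
        }
        if (c == '{')
        {
            inOverride = true;
            continue;
        }
        if (c == '\\' && i + 1 < line.size())
        {
            const char next = line[i + 1];
            // \N is a hard line break; \n is a soft break and is treated
            // the same way here for simplicity. \h is a hard space.
            if (next == 'N' || next == 'n') { text += '\n'; ++i; continue; }
            if (next == 'h') { text += ' '; ++i; continue; }
        }
        text += c;
    }
    return text;
}
```

Called with 8 leading fields on the line `0,0,Default,,0,0,0,,{\i1}Are you hurt?{\i0}\NNo, I'm fine.`, this yields the two lines "Are you hurt?" and "No, I'm fine.". Styling information in the override blocks is simply discarded, which is the main limitation of this approach.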
New idea: What about replacing (or deprecating) GetMediaStreamSource with GetMediaPlaybackItem, with the item already initialized with subtitle tracks? We could then even remove TimedMetadataTrack from the subtitle info.
That is actually a pretty good idea.
But we would need to add a StartTime and EndTime for the playback item in the config.
Using MediaPlaybackItem seems more reasonable because it works great with MediaPlayerElement and also has a lot of other features that make everything simpler in an app.
At the same time, it cannot be used for transcoding, for example. We can have both.
Having both? How would that work?
Valid point @mcosmin222. It is not a problem to have both. About StartTime and EndTime, they can be set on the MediaPlaybackItem. I do not see a need to add them to our config class.
@lukasf StartTime is read only
Actually, we need a GetPlaybackItem(TimeSpan? start, TimeSpan? duration) method and that is it. No need for configs. But we still need to store the TimedMetadataTracks somewhere.
@touseefbsb, it would work like this: If you need the MediaStreamSource on its own, you call GetMediaStreamSource. If you need a playback item, you call GetPlaybackItem. Of course, you could construct the playback item yourself; it is just a helper method.
So we can play the file with the stream source, but to use the subtitle feature we need to use GetPlaybackItem as well?
You either use MediaStreamSource or MediaPlaybackItem. If you want subtitles, you have to use the latter.
I will push a changeset in a few minutes.
OK, the update is there. I called the method CreateMediaPlaybackItem (because it always creates a new instance) and added two overloads for StartTime/DurationLimit. I wanted to avoid adding these to the config class. The config class currently only contains generic decoding settings, so one instance can be used for all files. If we added file-specific stuff, it would force users to create a new instance for every file, which I'd like to avoid. Let me know what you think.
I also updated the samples to use them. Ah and I also automatically fixup audio stream names when MediaPlaybackItem is used. This also works now in the samples.
It would be nice if we could automatically translate the three letter language tags into full names, because often only language is set for a subtitle stream but not a name.
public static class LanguageCodes
{
private static List<LanguageCode> _codes;
static LanguageCodes()
{
_codes = new List<LanguageCode>()
{
//new LanguageCode("alpha2","English"),
new LanguageCode("aar","aa","Afar"),
new LanguageCode("abk","ab","Abkhazian"),
new LanguageCode("afr","af","Afrikaans"),
new LanguageCode("aka","ak","Akan"),
new LanguageCode("alb","sq","Albanian"),
new LanguageCode("amh","am","Amharic"),
new LanguageCode("ara","ar","Arabic"),
new LanguageCode("arg","an","Aragonese"),
new LanguageCode("arm","hy","Armenian"),
new LanguageCode("asm","as","Assamese"),
new LanguageCode("ava","av","Avaric"),
new LanguageCode("ave","ae","Avestan"),
new LanguageCode("aym","ay","Aymara"),
new LanguageCode("aze","az","Azerbaijani"),
new LanguageCode("bak","ba","Bashkir"),
new LanguageCode("bam","bm","Bambara"),
new LanguageCode("baq","eu","Basque"),
new LanguageCode("bel","be","Belarusian"),
new LanguageCode("ben","bn","Bengali"),
new LanguageCode("bih","bh","Bihari languages"),
new LanguageCode("bis","bi","Bislama"),
new LanguageCode("bos","bs","Bosnian"),
new LanguageCode("bre","br","Breton"),
new LanguageCode("bul","bg","Bulgarian"),
new LanguageCode("bur","my","Burmese"),
new LanguageCode("cat","ca","Catalan; Valencian"),
new LanguageCode("cha","ch","Chamorro"),
new LanguageCode("che","ce","Chechen"),
new LanguageCode("chi","zh","Chinese"),
new LanguageCode("chu","cu","Church Slavic; Old Slavonic; Church Slavonic; Old Bulgarian; Old Church Slavonic"),
new LanguageCode("chv","cv","Chuvash"),
new LanguageCode("cor","kw","Cornish"),
new LanguageCode("cos","co","Corsican"),
new LanguageCode("cre","cr","Cree"),
new LanguageCode("cze","cs","Czech"),
new LanguageCode("dan","da","Danish"),
new LanguageCode("div","dv","Divehi; Dhivehi; Maldivian"),
new LanguageCode("dut","nl","Dutch; Flemish"),
new LanguageCode("dzo","dz","Dzongkha"),
new LanguageCode("eng","en","English"),
new LanguageCode("epo","eo","Esperanto"),
new LanguageCode("est","et","Estonian"),
new LanguageCode("ewe","ee","Ewe"),
new LanguageCode("fao","fo","Faroese"),
new LanguageCode("fij","fj","Fijian"),
new LanguageCode("fin","fi","Finnish"),
new LanguageCode("fre","fr","French"),
new LanguageCode("fry","fy","Western Frisian"),
new LanguageCode("ful","ff","Fulah"),
new LanguageCode("geo","ka","Georgian"),
new LanguageCode("ger","de","German"),
new LanguageCode("gla","gd","Gaelic; Scottish Gaelic"),
new LanguageCode("gle","ga","Irish"),
new LanguageCode("glg","gl","Galician"),
new LanguageCode("glv","gv","Manx"),
new LanguageCode("gre","el","Greek Modern (1453-)"),
new LanguageCode("grn","gn","Guarani"),
new LanguageCode("guj","gu","Gujarati"),
new LanguageCode("hat","ht","Haitian; Haitian Creole"),
new LanguageCode("hau","ha","Hausa"),
new LanguageCode("heb","he","Hebrew"),
new LanguageCode("her","hz","Herero"),
new LanguageCode("hin","hi","Hindi"),
new LanguageCode("hmo","ho","Hiri Motu"),
new LanguageCode("hrv","hr","Croatian"),
new LanguageCode("hun","hu","Hungarian"),
new LanguageCode("ibo","ig","Igbo"),
new LanguageCode("ice","is","Icelandic"),
new LanguageCode("ido","io","Ido"),
new LanguageCode("iii","ii","Sichuan Yi; Nuosu"),
new LanguageCode("iku","iu","Inuktitut"),
new LanguageCode("ile","ie","Interlingue; Occidental"),
new LanguageCode("ina","ia","Interlingua (International Auxiliary Language Association)"),
new LanguageCode("ind","id","Indonesian"),
new LanguageCode("ipk","ik","Inupiaq"),
new LanguageCode("ita","it","Italian"),
new LanguageCode("jav","jv","Javanese"),
new LanguageCode("jpn","ja","Japanese"),
new LanguageCode("kal","kl","Kalaallisut; Greenlandic"),
new LanguageCode("kan","kn","Kannada"),
new LanguageCode("kas","ks","Kashmiri"),
new LanguageCode("kau","kr","Kanuri"),
new LanguageCode("kaz","kk","Kazakh"),
new LanguageCode("khm","km","Central Khmer"),
new LanguageCode("kik","ki","Kikuyu; Gikuyu"),
new LanguageCode("kin","rw","Kinyarwanda"),
new LanguageCode("kir","ky","Kirghiz; Kyrgyz"),
new LanguageCode("kom","kv","Komi"),
new LanguageCode("kon","kg","Kongo"),
new LanguageCode("kor","ko","Korean"),
new LanguageCode("kua","kj","Kuanyama; Kwanyama"),
new LanguageCode("kur","ku","Kurdish"),
new LanguageCode("lao","lo","Lao"),
new LanguageCode("lat","la","Latin"),
new LanguageCode("lav","lv","Latvian"),
new LanguageCode("lim","li","Limburgan; Limburger; Limburgish"),
new LanguageCode("lin","ln","Lingala"),
new LanguageCode("lit","lt","Lithuanian"),
new LanguageCode("ltz","lb","Luxembourgish; Letzeburgesch"),
new LanguageCode("lub","lu","Luba-Katanga"),
new LanguageCode("lug","lg","Ganda"),
new LanguageCode("mac","mk","Macedonian"),
new LanguageCode("mah","mh","Marshallese"),
new LanguageCode("mal","ml","Malayalam"),
new LanguageCode("mao","mi","Maori"),
new LanguageCode("mar","mr","Marathi"),
new LanguageCode("may","ms","Malay"),
new LanguageCode("mlg","mg","Malagasy"),
new LanguageCode("mlt","mt","Maltese"),
new LanguageCode("mon","mn","Mongolian"),
new LanguageCode("nau","na","Nauru"),
new LanguageCode("nav","nv","Navajo; Navaho"),
new LanguageCode("nbl","nr","Ndebele South; South Ndebele"),
new LanguageCode("nde", "nd", "Ndebele North; North Ndebele"),
new LanguageCode("ndo", "ng", "Ndonga"),
new LanguageCode("nep", "ne", "Nepali"),
new LanguageCode("nno", "nn", "Norwegian Nynorsk; Nynorsk Norwegian"),
new LanguageCode("nob", "nb", "Bokmål Norwegian; Norwegian Bokmål"),
new LanguageCode("nor", "no", "Norwegian"),
new LanguageCode("nya", "ny", "Chichewa; Chewa; Nyanja"),
new LanguageCode("oci", "oc", "Occitan (post 1500); Provençal"),
new LanguageCode("oji", "oj", "Ojibwa"),
new LanguageCode("ori", "or", "Oriya"),
new LanguageCode("orm", "om", "Oromo"),
new LanguageCode("oss", "os", "Ossetian; Ossetic"),
new LanguageCode("pan", "pa", "Panjabi; Punjabi"),
new LanguageCode("per", "fa", "Persian"),
new LanguageCode("pli", "pi", "Pali"),
new LanguageCode("pol", "pl", "Polish"),
new LanguageCode("por", "pt", "Portuguese"),
new LanguageCode("pus", "ps", "Pushto; Pashto"),
new LanguageCode("que", "qu", "Quechua"),
new LanguageCode("roh", "rm", "Romansh"),
new LanguageCode("rum", "ro", "Romanian; Moldavian; Moldovan"),
new LanguageCode("run", "rn", "Rundi"),
new LanguageCode("rus", "ru", "Russian"),
new LanguageCode("sag", "sg", "Sango"),
new LanguageCode("san", "sa", "Sanskrit"),
new LanguageCode("sin", "si", "Sinhala; Sinhalese"),
new LanguageCode("slo", "sk", "Slovak"),
new LanguageCode("slv", "sl", "Slovenian"),
new LanguageCode("sme", "se", "Northern Sami"),
new LanguageCode("smo", "sm", "Samoan"),
new LanguageCode("sna", "sn", "Shona"),
new LanguageCode("snd", "sd", "Sindhi"),
new LanguageCode("som", "so", "Somali"),
new LanguageCode("sot", "st", "Sotho Southern"),
new LanguageCode("spa", "es", "Spanish; Castilian"),
new LanguageCode("srd", "sc", "Sardinian"),
new LanguageCode("srp", "sr", "Serbian"),
new LanguageCode("ssw", "ss", "Swati"),
new LanguageCode("sun", "su", "Sundanese"),
new LanguageCode("swa", "sw", "Swahili"),
new LanguageCode("swe", "sv", "Swedish"),
new LanguageCode("tah", "ty", "Tahitian"),
new LanguageCode("tam", "ta", "Tamil"),
new LanguageCode("tat", "tt", "Tatar"),
new LanguageCode("tel", "te", "Telugu"),
new LanguageCode("tgk", "tg", "Tajik"),
new LanguageCode("tgl", "tl", "Tagalog"),
new LanguageCode("tha", "th", "Thai"),
new LanguageCode("tib", "bo", "Tibetan"),
new LanguageCode("tir", "ti", "Tigrinya"),
new LanguageCode("ton", "to", "Tonga (Tonga Islands)"),
new LanguageCode("tsn", "tn", "Tswana"),
new LanguageCode("tso", "ts", "Tsonga"),
new LanguageCode("tuk", "tk", "Turkmen"),
new LanguageCode("tur", "tr", "Turkish"),
new LanguageCode("twi", "tw", "Twi"),
new LanguageCode("uig", "ug", "Uighur; Uyghur"),
new LanguageCode("ukr", "uk", "Ukrainian"),
new LanguageCode("urd", "ur", "Urdu"),
new LanguageCode("uzb", "uz", "Uzbek"),
new LanguageCode("ven", "ve", "Venda"),
new LanguageCode("vie", "vi", "Vietnamese"),
new LanguageCode("vol", "vo", "Volapük"),
new LanguageCode("wel", "cy", "Welsh"),
new LanguageCode("wln", "wa", "Walloon"),
new LanguageCode("wol", "wo", "Wolof"),
new LanguageCode("xho", "xh", "Xhosa"),
new LanguageCode("yid", "yi", "Yiddish"),
new LanguageCode("yor", "yo", "Yoruba"),
new LanguageCode("zha", "za", "Zhuang; Chuang"),
new LanguageCode("zul", "zu", "Zulu"),
}.OrderBy(x => x.LanguageName).ToList();
}
public static IList<LanguageCode> Codes
{
get
{
return _codes.AsReadOnly();
}
}
public static int GetDefaultLanguageIndex()
{
var i = 43;
var currentCulture = CultureInfo.CurrentCulture.TwoLetterISOLanguageName;
var matched = Codes.FirstOrDefault(x => x.TwoLetterIsoCode == currentCulture);
if (matched != null)
{
return Codes.IndexOf(matched);
}
return i;
}
}
public class LanguageCode
{
public string TwoLetterIsoCode
{
get;
private set;
}
public string ThreeLetterIsoCode
{
get;
private set;
}
public string LanguageName
{
get;
private set;
}
public LanguageCode(string threeCode, string code, string name)
{
this.ThreeLetterIsoCode = threeCode;
this.TwoLetterIsoCode = code;
this.LanguageName = name;
}
public override string ToString()
{
return $"{LanguageName} - ({ThreeLetterIsoCode})";
}
}
....C# version. I guess we can translate it to C++/CX if need be.
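As a rough idea of what the C++ side could look like, here is a minimal standard C++ sketch of the three-letter lookup, using only a handful of entries from the table above. The function names are hypothetical, and falling back to the raw code for unknown languages is an assumption about the desired behavior.

```cpp
#include <map>
#include <string>

// Small excerpt of the table above, keyed by three-letter code.
// A real implementation would contain the full list.
const std::map<std::string, std::string>& LanguageNames()
{
    static const std::map<std::string, std::string> names = {
        { "eng", "English" },
        { "fre", "French" },
        { "ger", "German" },
        { "jpn", "Japanese" },
        { "spa", "Spanish; Castilian" },
    };
    return names;
}

// Returns the full language name for a three-letter code, or the code
// itself when it is not in the table. The lookup is case-sensitive.
std::string GetLanguageName(const std::string& threeLetterCode)
{
    const std::map<std::string, std::string>& names = LanguageNames();
    const std::map<std::string, std::string>::const_iterator it =
        names.find(threeLetterCode);
    return it != names.end() ? it->second : threeLetterCode;
}
```

One caveat: the table uses the bibliographic ISO 639-2 codes (for example "ger" and "fre"). Files may instead carry the terminology variants ("deu", "fra"), so a complete lookup would probably need both forms.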
@lukasf I get a
Microsoft C++ exception: Platform::InvalidArgumentException ^ at memory location 0x1E1BE630. HRESULT:0x80070057 The parameter is incorrect. WinRT information: The parameter is incorrect.
When trying to play this video, at around 3/4 of its length, it complains about something related to the text cue. Do you get the same?
Actually, I am not getting it every time, just as the first cue is added. Strange.
I had the same problem. When I switched back to the region+style of your original commit, the problem did not occur. I modified yours to fix the positioning. It is working for me now, not sure though what the real problem was. Maybe it needs the explicit definition of padding and whatever, even if it now has zero values. Man, this is a buggy piece of ***. I guess we are the first ones to ever use it. Does not have a single google result except docs.
We need to find a good default style. Subs without outlining do not work with light background, but the current black outline also does not look good. Maybe some semi-transparent gray outline? Need to play with this more...
It was because of the
TimedTextDouble outlineThickness;
outlineThickness.Unit = TimedTextUnit::Percentage;
outlineThickness.Value = 5;
CueStyle->OutlineThickness = outlineThickness;
wouldn't a white outline do it?
Are you sure? I removed it and it continued to work. I put it in again for readability.
We are also losing the first caption "Are you hurt?" in that video.
Ahh nvm. It was just the logic of my app messing it up.
For me it works reliably now, including the first caption.
@lukasf Let's discuss embedded CC support here.
1) At first glance, TimedMetadataStreamDescriptor does indeed look a little bleak. Maybe there is still an SDK update to come.
2) The pull model can still work; we probably need to provide a mock frame when there are no subtitles available.
3) As for a push model, maybe we can extract timed text from the file and feed it into an existing, working API, such as TimedMetadataTrack.