I just added automatic enable/disable of streams based on subtitle selection, so we only process those subs that are actually shown.
Added language tag conversion, moved subtitle stuff from StreamInfo back into SampleProvider. Added auto selection of forced subtitles.
One thing that I have noticed is a short flicker in subtitles. I assume it occurs when we add a subtitle cue while a different subtitle cue is currently shown. Do you see this as well? This one could be hard to fix.
We could try to wait for the CueExited event, to add cues only when nothing is shown. But this makes things difficult due to multi-threading. Plus there can be cases where subtitle cues overlap.
I did notice slightly sluggish performance, but I couldn't tell why. It is surely linked to the text cues though. I will have to try my app in release mode as well to see whether things get worse.
I think we should keep the 3 letter code as well, in case we ever need to use it.
I wonder if the language would ever come in its 2 letter form...
I managed to fix the flicker issue, it was not that hard in the end. It is really gone when we wait for the current cue to be hidden. I do not see any performance issues, no matter if subs are shown or not.
If you want, you can add a "LanguageTag" string to the stream info classes, which contains the original, unmodified language tag from the stream. I would not name it three letter code because we cannot be sure it is always three letters (although it probably will be in 99% of files). What might be more useful is perhaps the parsed two letter code, because that is what Windows/UWP uses for all language stuff. So if you want to auto select streams based on user's windows settings, two letter code is what you need.
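To illustrate, a minimal app-side sketch of such auto selection (the SubtitleStreams collection and Language property names are assumptions here, and it relies on subtitle tracks appearing in TimedMetadataTracks in the same order as the stream infos):

using System;
using Windows.Media.Playback;
using Windows.System.UserProfile;

// First user-preferred BCP-47 tag, e.g. "en-US" -> two letter code "en".
string twoLetterCode = GlobalizationPreferences.Languages[0].Substring(0, 2);

var streams = ffmpegMSS.SubtitleStreams; // assumed stream info collection
for (int i = 0; i < streams.Count; i++)
{
    if (streams[i].Language == twoLetterCode)
    {
        // Enable the first subtitle track matching the user's language.
        playbackItem.TimedMetadataTracks.SetPresentationMode(
            (uint)i, TimedMetadataTrackPresentationMode.PlatformPresented);
        break;
    }
}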
Auto selection based on the user's Windows settings seems like a very powerful tool, but we still must have an option to select a custom language, just in case?
@lukasf , nice work. :)
@touseefbsb , TimedMetadataTrack selection is done externally, either programmatically or via the transport controls. This is not the job of the library.
@mcosmin222 What I actually wanted to ask is: after adding the auto selection into the library (supposing we are still using the default MediaPlayerElement), will we still be able to change the subtitle language, i.e. the TimedMetadataTrack, externally with the transport controls or programmatically?
Basically, how this will all work together is that the library will select a default language according to device settings, and then the user can change the subtitle language just like in a normal MediaPlayerElement (using MediaPlaybackItem.TimedMetadataTracks). Is this how it will work?
The mandatory CC selection button will allow users to select the desired timed text track.
thanks, that answers my question :)
You can plug this library into the default MediaPlayerElement and you should be good to go.
@touseefbsb Just to clarify: I don't think that we will implement auto track selection based on user language in the lib. The only thing I added was to auto select "Forced" subtitles, if there are any. These are specifically flagged subtitles that are intended to be always shown, usually to translate foreign language parts in a movie. Any advanced auto selection of normal subtitles must be done in the app.
You can enable and disable individual tracks in code on the MediaPlaybackItem using this method:
mediaPlaybackItem.TimedMetadataTracks.SetPresentationMode(
    trackIndex, TimedMetadataTrackPresentationMode.PlatformPresented);
@lukasf Ah, so those forced subtitles are usually the ones we normally cannot control with the transport controls, and we can't add them with any srt stuff; they are just right there as if they were part of the video, right? And mediaPlaybackItem.TimedMetadataTracks.SetPresentationMode() deals with embedded CC subs? So these are 2 different kinds of subs, independent of each other?
And then there is a 3rd type of subs, known as external subs, like files with the .srt extension. If I am correct, these external subs are controlled with TimedTextSource. So are these external subtitles affected by the use of this library, or do they stay independent? The default behaviour of Movies & TV gives us a button to toggle between embedded CC (if any) and also the option to select an external file (within the same button), and if we select an external file, the embedded CC are overridden with the external subs (.srt). So how will these work together if we use this library? Will it work exactly the same way?
No, forced subtitles can be enabled/disabled just like any other subtitle, and they also appear in the transport controls. On the original DVD/BluRay, you usually cannot disable them through the menu; they are always shown. But for us, they are just a normal subtitle stream with a special flag. I use the flag to pre-select it, to kind of emulate what the BluRay does.
If you have an external subtitle file for a movie (e.g. a .srt), you can add that to TimedTextSource. Then it will appear in the transport controls like any other subtitle stream. It will also show up in TimedMetadataTracks (you need to listen to the changed event, they are added a bit delayed) and can be enabled/disabled like others. For the user it will look just like an embedded subtitle.
The default transport controls do not support the selection of external subtitle files, so you have to implement that button yourself.
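For reference, a rough sketch of that flow, assuming you still have the MediaSource at hand (the file URI and variable names are placeholders):

using System;
using Windows.Foundation.Collections;
using Windows.Media.Core;
using Windows.Media.Playback;

// Add an external .srt on top of the embedded streams.
var ttSource = TimedTextSource.CreateFromUri(new Uri("ms-appx:///Subs/movie.srt"));
mediaSource.ExternalTimedTextSources.Add(ttSource);

var playbackItem = new MediaPlaybackItem(mediaSource);

// External tracks are added a bit delayed, so listen to the changed event.
playbackItem.TimedMetadataTracksChanged += (item, args) =>
{
    if (args.CollectionChange == CollectionChange.ItemInserted)
    {
        item.TimedMetadataTracks.SetPresentationMode(
            args.Index, TimedMetadataTrackPresentationMode.PlatformPresented);
    }
};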
So in the end, there is no difference between them: whatever the scenario, only one of them will appear on the video at a given time (depending on the selection)? OK, so I can implement a button with TimedTextSource which takes an external subs file and adds it to the collection of TimedMetadataTracks, and then I can toggle between all the subs with the built-in subs toggle button?
The only confusion I have right now is: what is the difference between embedded subs and forced subs? I get that forced subs have a flag to show them. If I understand correctly, a video file can have multiple embedded subs, and some of them may be forced while others are normal. So if this library finds a forced sub, it pre-selects it and shows it on the video? And then when the user presses the subs toggle button, they see the list of embedded subs (with the forced sub already checked/selected) and can change the subs as they like? But if they don't find the subs they want in that list, they can select an external srt file (which will be implemented by the developer), and that external sub will be added to the list of possible subs and can be selected from the same toggle list? Let me know if I got it wrong anywhere.
This is correct. The only difference between forced subs and normal subs is that forced subs are now pre-selected by the lib. If there are no forced subs, then no subtitles are pre-selected. The user can always change subtitle selection, and they can also disable forced subs if they don't like them.
If you do not like pre-selection of forced subs, you can also switch this off in our config class. Then there is no difference at all between normal and forced subtitles. But I think it is a good idea to pre-select them.
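Something like this (just a sketch; the property name for the switch is an assumption, check the config class for the actual name):

var config = new FFmpegInteropConfig();
config.AutoSelectForcedSubtitles = false; // assumed property name
var ffmpegMSS = await FFmpegInteropMSS.CreateFromStreamAsync(stream, config);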
@lukasf Thanks, that clarifies it. And yeah, I agree, pre-selection of forced subs seems like a good idea to me as well.
@lukasf , I think it is easier to consume the API if the SubtitleStreamInfo class exposes a TimedMetadataTrack, especially if you plan on combining internal CC and external CC.
Ahh, I see, that is quite difficult... But do we really need to create a new playback item each time? The way I see it, the various media sources will all use the same base media stream source, so it is kind of redundant to do this. I think it should return the same playback item each time.
For example, if you want to separate the forced timed metadata tracks from the others, you have to carry around both the ffmpeg interop object and the media playback item.
@mcosmin222 I did not yet get your point. If you want to add external subtitles, you have to add them through MediaPlaybackItem. And you can access all subtitle streams on the MediaPlaybackItem as well (through TimedMetadataTracks property). Now why do we need the TimedMetadataTrack on the SubtitleStreamInfo? I mean, feel free to add it there if you think it helps, I just did not understand your argument.
Currently I always create new instances of MediaPlaybackItem, because there are overloads which add start/end time. So if we'd call the method twice with different parameters, it would be wrong to return the same instance. One idea to solve this: Add a MediaPlaybackItem property on the FFmpegInterop class. Calling one of the CreateMediaPlaybackItem methods would fill that property, so after that you can access it through the FFmpegInterop instance. Calling the Create method again will throw an exception.
We could also add start/end time as optional parameters to the CreateFromStream method, and use them when building the MediaPlaybackItem. But I do not like that so much, because the start/end time would only be effective when using MediaPlaybackItem, but not when using MediaStreamSource.
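In C# terms, the pattern would look roughly like this (a sketch only; the real implementation lives in the C++ library, and it assumes GetMediaStreamSource() returns the underlying MediaStreamSource):

using System;
using Windows.Media.Core;
using Windows.Media.Playback;

public MediaPlaybackItem PlaybackItem { get; private set; }

public MediaPlaybackItem CreateMediaPlaybackItem(TimeSpan startTime, TimeSpan durationLimit)
{
    // Only one playback item per instance: a second Create call with
    // different start/end parameters would otherwise silently return
    // an item built from the wrong values.
    if (PlaybackItem != null)
        throw new InvalidOperationException(
            "A MediaPlaybackItem was already created for this instance.");

    var source = MediaSource.CreateFromMediaStreamSource(GetMediaStreamSource());
    PlaybackItem = new MediaPlaybackItem(source, startTime, durationLimit);
    return PlaybackItem;
}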
In my use case, I have custom media transport controls, which allow me to implement a custom selection UI for the CC. Now, I want to give the user different options based on whether a TimedMetadataTrack is "forced" or not. In the current setup, through the MediaPlaybackItem, there is no way to tell which track is forced and which is not. In order to resolve the problem, you have to pair the MediaPlaybackItem to the FFmpegInteropMSS object, and you can figure out which TimedMetadataTrack is forced because it will have the same index as the corresponding SubtitleStreamInfo.
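In code, the pairing I have in mind looks like this (sketch; IsForced is an assumed property on SubtitleStreamInfo, and it relies on the indexes matching):

for (int i = 0; i < ffmpegMSS.SubtitleStreams.Count; i++)
{
    if (ffmpegMSS.SubtitleStreams[i].IsForced)
    {
        // Same index pairs the stream info with its TimedMetadataTrack.
        var forcedTrack = playbackItem.TimedMetadataTracks[i];
        // ... drive the custom CC selection UI from here
    }
}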
I see your point about having to move the MediaPlaybackItem start time and duration into the ffmpeg interop config file. However, the current approach is not good either. If I create 2 different MediaSources from the same MediaStreamSource, I am not even sure what will happen; it would probably lead to some weird conflicts.
There should be a 1 to 1 link between the ffmpegInteropMSS, its MediaStreamSource and MediaPlaybackItem.
One idea to solve this: Add a MediaPlaybackItem property on the FFmpegInterop class. Calling one of the CreateMediaPlaybackItem methods would fill that property, so after that you can access it through the FFmpegInterop instance. Calling the Create method again will throw an exception.
This is a good idea. Sounds good.
I agree with this idea as well, seems very generic and easy to use for all use cases.
Okay, looks like we have an agreement on the Create() handling.
@mcosmin222 That makes sense, feel free to add TimedMetadataTrack to SubtitleStreamInfo.
@lukasf OK, I will do the "Create()" handling while I am at it.
@lukasf done.
@mcosmin222 Thanks!
OK. This looks pretty stable now. Anyone found any bugs yet?
Maybe we can move onto picture subs ^^
@mcosmin222 picture subs?
Yes.
actually I meant to ask, what are picture subs? :P @mcosmin222
That's sub/idx, a bitmap based subtitle format, which is used on DVD/BluRay.
Oh, so basically instead of text it will be image based, but only the text on the image will be visible, with a transparent background, so it will appear like normal text?
Yes. They usually look pretty ugly, because they only use a 4 bit palette, and they do not scale well. Still, it is one of the three big subtitle formats, and I'd see it as mandatory for a full-featured player.
I wonder if we could convert them to text based subtitles, using windows OCR API.
Wouldn't the OCR API be expensive?
That's why I wonder ^^
If the OCR goes wrong, you can make things even worse. I have seen quite a lot of badly OCR'd subtitles, where some of the characters always get interpreted as a different character. I don't know how this would be performance wise. I mean, we could try and provide this as an option, if performance is good. But it should always be possible to render the original, non-OCR'd subs.
Thing is, Windows OCR is probably optimized for normal text (black or dark color on white). I really don't know how well it would work on yellow/white text on a transparent (or black) background. Subtitle creation programs use tweaked OCR and still they are not perfect.
I played around with OCR and it is unreliable. So I guess we should move ahead with simply decoding the pictures to RGB.
Since there is only a 4 bit palette, perhaps we can optimize this and have a table mapping each possible color to an RGB color to improve performance.
Sure, a lookup table is the best option.
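Roughly like this (a C# sketch of the idea; it assumes the decoder hands us 4-bit palette indices plus a 16 entry RGBA palette):

// Pack each of the 16 palette entries once into 0xAARRGGBB.
static uint[] BuildPaletteLut(byte[][] palette) // palette[i] = { R, G, B, A }
{
    var lut = new uint[16];
    for (int i = 0; i < 16; i++)
    {
        lut[i] = ((uint)palette[i][3] << 24) | ((uint)palette[i][0] << 16) |
                 ((uint)palette[i][1] << 8) | palette[i][2];
    }
    return lut;
}

// Per pixel, decoding is then a single table read.
static void DecodePixels(byte[] indices, uint[] lut, uint[] argbOut)
{
    for (int i = 0; i < indices.Length; i++)
        argbOut[i] = lut[indices[i] & 0x0F];
}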
I think we should merge the PR before working on bitmap subs. I fixed a NullReferenceException a few days back, but besides that I did not notice any other problems. Any other findings or are we good to merge?
We are good to merge. I haven't been able to do anything regarding bitmap subs due to various other non-related issues lately.
Merged the current PR. I've also been rather busy lately. Maybe I will have some time next week for bitmap subs, let's see...
I think there is a timing issue with the anti flickering solution. From what I can tell, we can miss some cues if the active cue overlaps their duration. For example, if 2 characters talk at the same time, and each of them is a different cue, we will miss the 2nd cue.
I wonder if it would be possible to decode only the cue streams in advance, maybe on a different thread, using a separate ffmpeg reader + IRandomAccessStream?
Have you actually observed this? Because the anti flicker code is supposed to directly add a cue if it overlaps with the active cue. In that case, a flicker can occur, but the cue is shown in time. Only if the cues do not overlap is the new cue added to the pending list and shown once the active cue is hidden.
If cues are really lost, we have to investigate why the anti flicker solution does not correctly detect the overlap.
Some cues do seem to get lost. But I have only observed that in one file so far. I was wondering if you noticed something similar. I haven't thought of a way to confirm that yet though.
Ah, I see now. There is a race between the mutex lock in addCue and the mutex lock in the CueExited event handler. There is a chance that addCue wins the race against the CueExited event handler, thus causing us to miss cues. I think the event handler should get priority in acquiring the lock, so it can add the pending cues first if there are any.
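One way to defuse the race would be to flush the pending list from both sides of it, so it no longer matters who wins. A rough C# sketch of that shape (the actual code is C++, and Overlaps is a placeholder helper):

using System.Collections.Generic;
using Windows.Media.Core;

private readonly object cueLock = new object();
private readonly Queue<IMediaCue> pendingCues = new Queue<IMediaCue>();

private void AddCue(TimedMetadataTrack track, IMediaCue cue)
{
    lock (cueLock)
    {
        // Flush anything the CueExited handler may not have gotten to yet.
        FlushPending(track);
        if (track.ActiveCues.Count > 0 && !Overlaps(cue, track.ActiveCues))
            pendingCues.Enqueue(cue); // defer to avoid the flicker
        else
            track.AddCue(cue);
    }
}

private void OnCueExited(TimedMetadataTrack track, MediaCueEventArgs args)
{
    lock (cueLock) { FlushPending(track); }
}

private void FlushPending(TimedMetadataTrack track)
{
    while (pendingCues.Count > 0)
        track.AddCue(pendingCues.Dequeue());
}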
@lukasf Let's discuss about embedded CC support here.
1) At first glance it does look like TimedMetadataStreamDescriptor is a little bleak. Maybe there is still an SDK update to come.
2) The pull model can still work; we would probably need to provide a mock frame when there are no subtitles available.
3) As for a push model, maybe we can extract the timed text from the file and feed it into an existing, working API, such as TimedMetadataTrack.
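For option 3, the push model could look roughly like this (sketch; mediaSource is a placeholder and the cue values are made up):

using System;
using Windows.Media.Core;

// Create a caption track up front and push decoded cues into it.
var track = new TimedMetadataTrack("cc1", "en", TimedMetadataKind.Caption);

var cue = new TimedTextCue
{
    StartTime = TimeSpan.FromSeconds(5),
    Duration = TimeSpan.FromSeconds(3)
};
cue.Lines.Add(new TimedTextLine { Text = "Example caption line" });
track.AddCue(cue);

// Register the track so it shows up in the MediaPlaybackItem's
// TimedMetadataTracks like any other subtitle stream.
mediaSource.ExternalTimedMetadataTracks.Add(track);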