Closed (dagwieers closed this issue 4 years ago)
Looking into Kodi and inputstream.adaptive, I think the problem is with the conversion to TTML (https://github.com/peak3d/inputstream.adaptive/blob/master/src/parser/TTML.cpp#L87) and SRT (https://github.com/peak3d/inputstream.adaptive/blob/master/src/parser/TTML.cpp#L262).
Without actual debugging (I can't find an easy way to dump the files), I think it is adding both \r\n and additionally an \n.
I noticed that this problem only occurs with the on-demand streams; the livestreams don't show the extra empty line. So I thought there might be something wrong with the TTML subtitles in the on-demand streams from VRT.
To investigate, I extracted the TTML-stream from the last episode of "De Ideale Wereld" with ffmpeg:
ffmpeg -i "https://remix-vrt.akamaized.net/remix/f1d863ef-4a58-464a-ab80-c6ab2c044ac1/remix.ism/.mpd" -map 0:7 -c:d copy -copy_unknown -f data deidealewereld.ttml
I did the same for the livestream of channel "Eén" while "FC De Kampioenen" was on air.
ffmpeg -i "https://live-vrt.akamaized.net/groupc/live/8edf3bdf-7db3-41c3-a318-72cb7f82de66/live.isml/.mpd" -map 0:8 -c:d copy -copy_unknown -f data live.ttml
Then I compared the TTML-subtitles from "De Ideale Wereld" and the "Eén"-livestream.
De Ideale Wereld:
<p begin="00:06:55.268" end="00:06:58.148" region="speaker" xml:id="s108"><span style="textStyle">Daar heb je niks aan,<br></br>
aan valse Hollanders.</span></p>
Eén Live:
<p begin="429307:51:09.720" end="429307:51:11.120" region="speaker">Niet overdrijven. Er zijn <br></br>geen cowboys en indianen meer.</p>
I determined that in "De Ideale Wereld"-stream there is an (invisible) newline character after every <br></br> line break. In the livestream, there is no newline character.
The inputstream.adaptive TTML-parser treats the newline character inside a <span></span> tag as a newline in the subtitle text and that causes the extra empty line between the subtitles.
Since the <span></span> tag is an explicit inline-element in all HTML-standards, I think the inputstream.adaptive TTML-parser should definitely ignore these newline characters inside <span></span> tags.
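To illustrate the behaviour argued for above, here is a minimal sketch in Python rather than the add-on's C++. It uses an event-driven parser and starts a new subtitle line only on a <br> element, while literal CR/LF characters in the text are folded into ordinary whitespace. The function name subtitle_lines is hypothetical, and this is not the code from the linked commit.

```python
import xml.parsers.expat


def subtitle_lines(p_xml: str) -> list[str]:
    """Return the visible lines of one TTML <p>; only <br> starts a new line."""
    lines = [""]

    def on_start(name, attrs):
        # An explicit <br> element is the only thing that breaks a line.
        if name == "br":
            lines.append("")

    def on_text(data):
        # Literal CR/LF in the markup is formatting, not subtitle content.
        lines[-1] += data.replace("\r", " ").replace("\n", " ")

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.CharacterDataHandler = on_text
    parser.Parse(p_xml, True)

    # Collapse runs of whitespace and drop lines that ended up empty.
    return [" ".join(line.split()) for line in lines if line.strip()]


# On-demand sample from this thread, including the newline after <br></br>.
ON_DEMAND = (
    '<p begin="00:06:55.268" end="00:06:58.148" region="speaker" xml:id="s108">'
    '<span style="textStyle">Daar heb je niks aan,<br></br>\n'
    "aan valse Hollanders.</span></p>"
)

print(subtitle_lines(ON_DEMAND))
# ['Daar heb je niks aan,', 'aan valse Hollanders.']
```

With this approach the on-demand and livestream samples both yield two lines, with no empty line in between.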
I fixed this bug already: https://github.com/mediaminister/inputstream.adaptive/commit/be10ef17b9c44c99452d3052098e55c63a0eccf5
And I fixed two other bugs in the inputstream.adaptive TTML parser: 1) no subtitles on the Ketnet livestream, because fractional seconds with 3 digits were not supported correctly; 2) a typo that prevented font colors from being supported.
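As an illustration of the fractional-seconds point, here is a small Python sketch (not the add-on's C++ code) that converts the hh:mm:ss.fraction timestamps seen in this thread to milliseconds. The name ttml_time_to_ms is hypothetical, and the sketch deliberately ignores the other TTML time forms (frames, offset times).

```python
import re

# hh:mm:ss with an optional fractional part; hours may have more than 2 digits,
# as in the livestream sample ("429307:51:09.720").
_CLOCK = re.compile(r"^(\d{2,}):(\d{2}):(\d{2})(?:\.(\d+))?$")


def ttml_time_to_ms(value: str) -> int:
    """Convert a TTML clock time such as "00:06:55.268" to milliseconds."""
    match = _CLOCK.match(value)
    if match is None:
        raise ValueError(f"unsupported time expression: {value!r}")
    hours, minutes, seconds, fraction = match.groups()
    ms = (int(hours) * 3600 + int(minutes) * 60 + int(seconds)) * 1000
    if fraction:
        # The fraction is a decimal fraction of a second: pad or truncate to
        # exactly three digits so ".5" is 500 ms and ".268" is 268 ms.
        ms += int(fraction[:3].ljust(3, "0"))
    return ms


print(ttml_time_to_ms("00:06:55.268"))      # 415268
print(ttml_time_to_ms("429307:51:09.720"))  # 1545508269720
```

Padding the fraction before converting it is the important step: treating ".72" as 72 ms instead of 720 ms would shift every cue.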
fixed in inputstream.adaptive by @mediaminister
@pietje666 I understand the wish to close issues (especially if they are not in this project), but closing them hides them from existing users. And the issue is not fixed upstream, nor is there an easy fix available from anyone so people will experience this issue in the field.
So maybe it's better to keep it open until a fix is available to users from default repositories, and maybe label these issues as fix_upstream or waiting_on_upstream.
The problem I see in this approach is that there will be a lot of open issues, and we might lose track of the real issues. So personally I do not like it :), and people can still view the closed issues and comments.
So it appears that the VRT is following the official standard very closely: https://www.w3.org/TR/ttml1/#ttml-example-body
Strange they insist on doing <br></br> instead of simply using <br/>.
Let us hope this is closed upstream real soon now.
Fixed, wh00p wh00p ! I think we can close this one now ;-)
The fix is still not released :-(
Not for ARM. I noticed the newer 18.1 builds also shipped with a newer inputstream.adaptive, so I guess we need to wait for newer LibreELEC snapshot releases...
Yesterday, 2019-05-02, inputstream.adaptive-2.3.17.1 was released for Kodi Leia (on 18.1), which finally fixes this issue for me. Thanks everyone, especially @mediaminister :-)
I think VRT NU changed something in its MPEG-DASH TTML subtitling last week, and for some programs the extra empty line is back again. Can someone confirm this?
I noticed this as well yesterday. I assumed recent inputstream.adaptive update had a regression, but first wanted to investigate the cause before reporting.
Okay, I found the cause already. VRT NU is now using multiple span tags and separate subtitle colours for different characters in a soap like Thuis.
OLD VRT NU TTML (Thuis, June 20)
<p begin="00:04:37.674" end="00:04:40.954" region="speaker" xml:id="s79"><span style="textStyle">Amai. Mooie zwembroek.<br></br>
- Ah, ja. Ja, merci.</span></p><p begin="00:04:41.034" end="00:04:43.994" region="speaker" xml:id="s80"><span style="textStyle">Mannekes, gaan wij zien?<br></br>
Bob is aan het wachten.</span></p>
NEW VRT NU TTML (Thuis, June 21)
<p begin="00:07:33.333" end="00:07:35.773" region="region-11" tts:textAlign="center">
<span style="singleHeightStyle" xml:space="preserve" tts:color="cyan" tts:backgroundColor="black">Eddy. Kom eens.</span>
</p><p begin="00:07:40.733" end="00:07:42.213" region="region-11" tts:textAlign="center">
<span style="singleHeightStyle" xml:space="preserve" tts:backgroundColor="black">Wat heb ik nu weer misdaan?</span>
</p>
InputStream Adaptive doesn't process these <span> tags right. Multiple span tags inside a <p> tag should be treated as a single line subtitle; only a <br> tag can break a subtitle line. I think InputStream Adaptive breaks these <span> tags into multiple lines.
I will fix this.
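A rough Python sketch of the intended behaviour, for illustration only (the real parser is C++ and this is not the eventual fix): all <span> children of a <p> are concatenated into one line, whitespace outside xml:space="preserve" is collapsed, and only <br> produces a line break. The function name flatten_paragraph and the two-span sample are hypothetical. The xmlns:tts declaration is added to the sample so that the fragment parses on its own.

```python
import re
import xml.etree.ElementTree as ET

XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def flatten_paragraph(p_xml: str) -> str:
    """Join all spans of a TTML <p> into lines; only <br> breaks a line."""
    out = []

    def add(text, preserve):
        if text:
            # Outside xml:space="preserve", runs of whitespace (including the
            # newlines between tags) collapse to a single space.
            out.append(text if preserve else re.sub(r"\s+", " ", text))

    def walk(elem, inherited):
        preserve = elem.get(XML_SPACE, inherited) == "preserve"
        mode = "preserve" if preserve else "default"
        add(elem.text, preserve)
        for child in elem:
            if child.tag.rsplit("}", 1)[-1] == "br":
                out.append("\n")
            else:
                walk(child, mode)
            add(child.tail, preserve)

    walk(ET.fromstring(p_xml), "default")
    lines = "".join(out).split("\n")
    return "\n".join(line.strip() for line in lines if line.strip())


# Hypothetical two-span paragraph in the style of the new VRT NU TTML.
SAMPLE = """<p xmlns:tts="http://www.w3.org/ns/ttml#styling" region="region-11" tts:textAlign="center">
<span style="singleHeightStyle" xml:space="preserve" tts:color="cyan">Eddy.</span>
<span style="singleHeightStyle" xml:space="preserve">Kom eens.</span>
</p>"""

print(flatten_paragraph(SAMPLE))  # Eddy. Kom eens.
```

The newlines between the tags end up as a single space between the two spans rather than as extra subtitle lines.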
I guess we are blessed with VRT being on the forefront of subtitle technology ;-)
The bar can always be set higher! No subtitles today for Thuis on VRT NU: https://www.vrt.be/vrtnu/a-z/thuis/24/thuis-s24a4663/
I have a fix. Can you test this?
Zips (install as a Kodi add-on from zip file):
Linux x64: inputstream.adaptive-2.3.22_linux_x64.zip
Windows x64: inputstream.adaptive-2.3.22_winx64.zip
Source: https://github.com/mediaminister/inputstream.adaptive/tree/multispan
Compiling instructions for Linux:
git clone https://github.com/xbmc/xbmc.git
git clone -b multispan https://github.com/mediaminister/inputstream.adaptive.git
cd ~/inputstream.adaptive && mkdir build && cd build
cmake -DADDONS_TO_BUILD=inputstream.adaptive -DADDON_SRC_PREFIX=../.. -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=../../xbmc/addons -DPACKAGE_ZIP=1 ../../xbmc/cmake/addons
make
cd ~/xbmc/addons/
zip -r inputstream.adaptive-2.3.22.zip inputstream.adaptive
I'm afraid I won't be able to test this easily. I would need to cross-compile to ARM to make that work, and that's not something I would like to spend my time on. And I can't easily run things on my dated laptop either. That's why it would have been nice if I could run LibreELEC on VirtualBox or KVM. I don't understand why they don't invest in making that possible, rather than supporting only VMware.
(I may be able to test this evening on Windows though)
No problem, I tested this already on Windows and Linux and it seems to work completely as expected. I just have to test if it doesn't break other TTML streams like Netflix or HBO. Then I will create a pull request on the official inputstream.adaptive repo.
Added a pull request to the inputstream.adaptive repo: https://github.com/peak3d/inputstream.adaptive/pull/284
This is now fixed in the just released inputstream.adaptive v2.4.0. Thanks to @mediaminister and @peak3d !
It appears inputstream.adaptive v2.4.0 is not available for Raspberry Pi (or ARM in general).
With LibreELEC v9.1.501 (v9.2 Beta 1) on Raspberry Pi using Kodi v18.4 and inputstream.adaptive v2.4.2.1 this issue is still present :-(
When watching Terzake live on Canvas, I also noticed that the subtitles appeared later, and there was a large gap between the next subtitle. I then compared to the online player, and there this doesn't happen. Subtitles quickly follow up, with less pause, and they appear longer on screen. Much more in sync with the spoken words.
And I also noticed that the interview of Bart De Wever appeared in blue, whereas the interviewer appears as white. I was surprised that this wasn't supported, as I have seen Kodi do this for other subtitles as well, but never saw this on VRT NU.
> With LibreELEC v9.1.501 (v9.2 Beta 1) on Raspberry Pi using Kodi v18.4 and inputstream.adaptive v2.4.2.1 this issue is still present :-(
That's right, unfortunately, my multispan commit is not merged in the Leia branch, it's only merged in the Matrix branch: https://github.com/peak3d/inputstream.adaptive/commit/f90cab170e32476f8b70c284e2134b18b89bcdbb
> When watching Terzake live on Canvas, I also noticed that the subtitles appeared later, and there was a large gap between the next subtitle. I then compared to the online player, and there this doesn't happen.
I'll investigate this, but for current affair programs, only subtitles that are prepared in advance of the live broadcast will be in sync. Subtitles for live interviews or live reports are created live and will appear a couple of seconds later.
Differences between the VRT NU web player and Kodi should not occur, provided that you are watching in Kodi and the VRT NU web player at approximately the same time. If you watch a current affairs program at a later time, then it's possible you get a different on-demand stream. I'll check the subtitle timings for the live streams.
> And I also noticed that the interview of Bart De Wever appeared in blue, whereas the interviewer appears as white. I was surprised that this wasn't supported, as I have seen Kodi do this for other subtitles as well, but never saw this on VRT NU.
Subtitle colors are not yet implemented in InputStream Adaptive's TTML parser. I can make a pull request for this.
@mediaminister colors are implemented in TTML, but maybe not in the format used in @dagwieers sample. https://github.com/peak3d/inputstream.adaptive/blob/master/src/parser/TTML.cpp#L47
@peak3d You're right, it's implemented in the <style> tag, but not yet in the <span> tag: https://www.w3.org/TR/2018/REC-ttml1-20181108/#style-attribute-color
I'm currently testing an experimental fix for this: https://github.com/mediaminister/inputstream.adaptive/tree/ttmlcolor
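To make the <span>-level case concrete, here is a minimal Python sketch (an illustration only, not the code in the ttmlcolor branch) that reads the tts:color attribute directly from each <span> and falls back to a default when it is absent. The function name colored_runs, the "white" default and the two-span sample are assumptions. Nested spans and colors inherited from <style> are not handled.

```python
import xml.etree.ElementTree as ET

# TTML styling namespace, which the tts: prefix is bound to.
TTS_COLOR = "{http://www.w3.org/ns/ttml#styling}color"


def colored_runs(p_xml: str, default_color: str = "white") -> list[tuple[str, str]]:
    """Return (color, text) pairs for the <span> elements of one TTML <p>."""
    root = ET.fromstring(p_xml)
    runs = []
    for span in root.iter("span"):
        text = "".join(span.itertext()).strip()
        if text:
            # A span without tts:color gets the (assumed) default color.
            runs.append((span.get(TTS_COLOR, default_color), text))
    return runs


# Hypothetical paragraph combining the two span styles quoted in this thread;
# the xmlns:tts declaration is added so the fragment parses on its own.
SAMPLE = (
    '<p xmlns:tts="http://www.w3.org/ns/ttml#styling" region="region-11">'
    '<span xml:space="preserve" tts:color="cyan" tts:backgroundColor="black">Eddy. Kom eens.</span>'
    '<span xml:space="preserve" tts:backgroundColor="black">Wat heb ik nu weer misdaan?</span>'
    "</p>"
)

print(colored_runs(SAMPLE))
# [('cyan', 'Eddy. Kom eens.'), ('white', 'Wat heb ik nu weer misdaan?')]
```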
> I'll investigate this, but for current affair programs, only subtitles that are prepared in advance of the live broadcast will be in sync. Subtitles for live interviews or live reports are created live and will appear a couple of seconds later.
Well, in this case it appeared longer and with shorter delays between subtitles when watching online on vrtnu.be. So there must be something wrong with the timings as subtitles seem to have a delayed start in Kodi.
> Differences between the VRT NU web player and Kodi should not occur, provided that you are watching in Kodi and the VRT NU web player at approximately the same time. If you watch a current affairs program at a later time, then it's possible you get a different on-demand stream. I'll check the subtitle timings for the live streams.
I was watching simultaneously. And to ensure I was seeing this correctly, paused one of the streams so that both streams would be synchronous.
> Well, in this case it appeared longer and with shorter delays between subtitles when watching online on vrtnu.be. So there must be something wrong with the timings as subtitles seem to have a delayed start in Kodi.
I did a test with the livestreams of Eén and Canvas at 8 p.m. I suspect there is something wrong with the timings of the livestreams from VRT NU, but I didn't see a difference between the VRT NU live web player (https://www.vrt.be/vrtnu/livestream/) and Kodi VRT NU Live TV on a Linux PC. So I think it's a problem with the source.
I didn't see this problem with on demand streams like Thuis, subtitling is perfectly in sync.
You can easily distinguish live and on demand subtitle streams in VRT NU webplayer: On demand subtitles have colors and a black background, livestream subtitles are white with a thin black border.
@mediaminister I will check next time I see this happening.
Also, in case you need more examples using colours, I noticed that De Ideale Wereld (met Stef Kamil Carlens) on-demand also has coloured subtitles online.
Colors are implemented in https://github.com/mediaminister/inputstream.adaptive/commit/fca16d9f1d9c63d6df05b0de202f74029e2f7cfe and work flawlessly now, but I still need to improve the C++ code and test that this doesn't break other TTML subtitle streams.
The Raspberry Pi build of inputstream.adaptive from @peno64 (at https://github.com/michaelarnauts/plugin.video.vtm.go/issues/1#issuecomment-536220725) seems to have fixed the subtitles-newlines.
Obviously @peno64 compiled the source from master branch. Still not fixed in the Leia branch.
@mediaminister I took a branch as follows:
git clone https://github.com/peak3d/inputstream.adaptive
I did this a day or two ago.
@dagwieers I think this can be closed because InputStream Adaptive 2.4.3 is released.
Indeed, thanks everyone, especially @mediaminister !
Describe the bug: The subtitles show an extra empty line between subtitles. ~Most likely this is an issue with DOS (\r\n) vs Unix text-files (\n).~ (And this issue is likely in Kodi's subtitle handling, not in the addon)