Open lighterowl opened 1 year ago
It was clearly a quick implementation, without support for everything. Currently not a priority for us but could be prioritized on request. Would you mind sharing some sample files demonstrating this issue with MediaInfo?
Sure, here you go (gzipped so github will accept) : tvp_rozrywka.ts.gz
When running this file with MediaInfo, the service information is incorrect w.r.t. some characters :
Menu
ID : 501 (0x1F5)
Menu ID : 62 (0x3E)
Format : HEVC / E-AC-3 / DVB Subtitle / E-AC-3 /
Duration : 15 s 344 ms
List : 502 (0x1F6) (HEVC) / 503 (0x1F7) (E-AC-3, Polish) / 506 (0x1FA) (DVB Subtitle) / 508 (0x1FC) (E-AC-3, aux) / 8005 (0x1F45) ()
Language : / Polish / / aux
Service name : TVP Rozrywka
Service provider : Emitel
Service type : reserved for future use
UTC 2023-02-22 21:10:00 : pl:Wojciech Cejrowski- boso przez úwiat - (68) Wenezuela - Boso ale w ostrogach / pl: / foreign countries/expeditions / / 00:35:00 /
UTC 2023-02-22 21:45:00 : pl:Rolnik szuka ýony seria 9 - /9/ / pl: / social/spiritual sciences / / 01:00:00 /
UTC 2023-02-22 22:45:00 : pl:Szansa na sukces. Opole 2023 - odc. (8) Piotr Cugowski / pl: / music/ballet/dance / / 01:10:00 /
UTC 2023-02-22 23:55:00 : pl:Koùo fortuny - odc. 1441 ed. 12 / pl: / game show/quiz/contest / / 00:40:00 /
UTC 2023-02-26 03:05:00 : pl:Ýycie to Kabaret - Kabaretomaniacy - (1) / pl: / variety show / / 00:50:00 /
UTC 2023-02-26 03:55:00 : pl:Zakoñczenie dnia / pl: / undefined / / 01:40:00 /
UTC 2023-02-26 05:35:00 : pl:Okrasa ùamie przepisy - Lekko i dietetycznie z królikiem / pl: / cooking / / 00:35:00 /
For example, the last event, Okrasa ùamie przepisy, should be Okrasa łamie przepisy. The descriptor for this particular event starts at offset 0x6E1069 into the file :
$ xxd -s 0x6E1069 -l 10 tvp_rozrywka.ts
006e1069: 4d3e 706f 6c39 094f 6b72 M>pol9.Okr
The bytes are, in order :
4d is the identifier of a short_event_descriptor,
3e is the descriptor's length, 62 bytes,
706f6c is the ISO 639 language code : pol,
39 is the length of the following event_name,
09 tells us that the following bytes are encoded as ISO 8859-13.
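To make the byte layout above concrete, here is a minimal Python sketch that splits the ten dumped bytes into the same fields and then shows why the name comes out garbled. It is not MediaInfo code: parse_short_event_header and CODEC_BY_SELECTOR are hypothetical names, the lookup table lists only the one selector byte seen in this sample, and the name bytes are reconstructed from the expected text rather than read from the file.

```python
def parse_short_event_header(buf: bytes) -> dict:
    """Split the leading fields of a short_event_descriptor (hypothetical helper)."""
    if len(buf) < 7:
        raise ValueError("need at least 7 bytes")
    return {
        "tag": buf[0],                         # 0x4d: short_event_descriptor
        "length": buf[1],                      # descriptor length in bytes
        "language": buf[2:5].decode("ascii"),  # ISO 639 language code
        "event_name_length": buf[5],           # length of event_name
        "selector": buf[6],                    # first byte of event_name: encoding selector
        "name_start": buf[7:],                 # whatever text bytes follow in this slice
    }


# The ten bytes printed by xxd at offset 0x6E1069.
DUMP = bytes.fromhex("4d3e706f6c39094f6b72")
header = parse_short_event_header(DUMP)

# Assumption: a deliberately partial table, holding only the selector from this sample.
CODEC_BY_SELECTOR = {0x09: "iso8859_13"}
codec = CODEC_BY_SELECTOR[header["selector"]]

# Reconstructed from the expected text, not read from the sample file.
name_bytes = "Okrasa łamie przepisy".encode(codec)

correct = name_bytes.decode(codec)      # honours the selector byte
garbled = name_bytes.decode("latin-1")  # what an ISO-8859-1 fallback produces

print(header)
print(correct)
print(garbled)
```

Decoding the same bytes as ISO-8859-1 turns the byte for ł into ù, which matches the garbled event name in the MediaInfo output above.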
The implementation of File_Mpeg_Descriptors::Get_DVB_Text, which is the central point for converting a "DVB string" representation to the internal Ztring, only supports ISO-8859-2 and calls Get_Local (which ends up using CP_ACP on Windows and ISO-8859-1 on other systems) for every other value of the bytes that select the encoding.
Furthermore, the function processes the buffer with Get_Local if the first byte is greater than or equal to 0x20, which indicates that the "default encoding" should be used. Get_Local, as already noted, uses either CP_ACP or ISO-8859-1. This is also incorrect, as the "default encoding" for DVB strings is IEC 6937 with the euro sign (0x20AC) instead of $ at position 0xA4.
The current mapping is described in DVB BlueBook A038r15 :