globocom / m3u8

Python m3u8 Parser for HTTP Live Streaming (HLS) Transmissions

IPTV example for custom tags has problems with properties that might contain a comma #353

Open takeda opened 6 months ago

takeda commented 6 months ago

Issue https://github.com/globocom/m3u8/issues/206 discussed parsing IPTV properties, and a solution based on a custom parser was presented there.

However, because of how the properties are parsed, a property whose value contains a comma (most commonly tvg-name) is parsed incorrectly.
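To illustrate the failure mode, here is a minimal standalone sketch. It assumes a parser that splits the `#EXTINF` line at the first comma before reading the attributes; `naive_parse` and the sample line are hypothetical and are not taken from the original example.

```python
import re

# key="value" attribute pattern; the value may contain commas.
PROP = re.compile(r'([\w\-]+)="([^"]*)"')


def naive_parse(line: str) -> tuple[dict[str, str], str]:
    """Split at the FIRST comma, then read attributes from the left part."""
    body = line.removeprefix("#EXTINF:")
    head, _, title = body.partition(",")
    return dict(PROP.findall(head)), title


# Hypothetical IPTV entry whose tvg-name value contains a comma.
line = '#EXTINF:-1 tvg-id="ch1" tvg-name="News, Sports",Channel One'
props, title = naive_parse(line)
print(props)
print(title)
```

With this input the split lands inside the quoted tvg-name value, so tvg-name is dropped from the attributes and the remainder of its value leaks into the title.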

I believe the code below parses the information correctly:

import re
from typing import Final, Any

import m3u8
from m3u8 import protocol
from m3u8.parser import save_segment_custom_value

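# One attribute of the form key="value"; the value may contain commas but not double quotes.
EXTINF_PROP: Final = re.compile(r"([\w\-]+)=\"([^\"]*)\"")
# Whole #EXTINF line: duration, then space-separated attributes, then an optional ",title".
EXTINF_MATCH: Final = re.compile(
    protocol.extinf
    + r':(?P<duration>-?\d+(\.\d+)?)'
    + r'(?P<props>( ' + EXTINF_PROP.pattern + r')*)'
    + r'(,(?P<title>.+))?'
)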

def parse_iptv_attributes(line: str, lineno: int, data: dict[str, Any], state: dict[str, Any]) -> bool:
    if not (match := EXTINF_MATCH.match(line)):
        return False

    # Create the segment dict if an earlier tag has not already done so.
    if 'segment' not in state:
        state['segment'] = {}

    state['segment']['duration'] = float(match.group('duration'))
    # The title group is None when the line has no ",title" part; store an empty string instead.
    state['segment']['title'] = match.group('title') or ''

    additional_props: dict[str, str] = dict(EXTINF_PROP.findall(match.group('props')))
    save_segment_custom_value(state, 'extinf_props', additional_props)
    state['expect_segment'] = True

    return True

[ the rest is the same as in the original example ]

For the purpose of copyright I declare the above code public domain. You're free to use it however you like.
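As a quick check of the regex approach without installing m3u8, the sketch below rebuilds the same two patterns with the standard library only. It assumes that `protocol.extinf` is the literal string `#EXTINF`; `parse_extinf` and the sample line are hypothetical helpers for illustration, not part of the library.

```python
import re

# Same attribute pattern as above: key="value", commas allowed inside the value.
EXTINF_PROP = re.compile(r'([\w\-]+)="([^"]*)"')
# Assumption: protocol.extinf == "#EXTINF", hardcoded here to stay standalone.
EXTINF_MATCH = re.compile(
    r'#EXTINF:(?P<duration>-?\d+(\.\d+)?)'
    r'(?P<props>( ' + EXTINF_PROP.pattern + r')*)'
    r'(,(?P<title>.+))?'
)


def parse_extinf(line: str):
    """Return duration, attributes and title, or None for non-EXTINF lines."""
    match = EXTINF_MATCH.match(line)
    if match is None:
        return None
    return {
        "duration": float(match.group("duration")),
        "props": dict(EXTINF_PROP.findall(match.group("props"))),
        "title": match.group("title"),
    }


line = '#EXTINF:-1 tvg-id="ch1" tvg-name="News, Sports",Channel One'
result = parse_extinf(line)
print(result)
```

Because the quoted values are consumed before the title separator is looked for, the comma inside tvg-name stays part of the attribute value and only the comma after the last attribute starts the title.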