pbs / pycaption

Python module to read/write popular video caption formats
Apache License 2.0

timestamp output does not include two or more ASCII digits for hour value in some cases #165

Closed kdHub closed 2 years ago

kdHub commented 6 years ago

Expected Behavior

When the timestamp crosses from 00 hours to 01 hours, the output should use the 01:00:00.000 format required by the spec; the hours field may optionally be included when the hour is zero. In some cases, SCC to WebVTT conversion outputs 1:00:00.000 instead.

Current Behavior

1:00:00.000 is output instead of 01:00:00.000, which causes a parse error.

Possible Solution

import datetime

def _timestamp(self, ts):
    # ts is a duration in microseconds
    td = datetime.timedelta(microseconds=ts)
    mm, ss = divmod(td.seconds, 60)
    hh, mm = divmod(mm, 60)
    # Always emit a zero-padded, two-digit hours field, even when hh is 0
    return "%02d:%02d:%02d.%03d" % (hh, mm, ss, td.microseconds // 1000)

Steps to Reproduce

Source File: https://gist.github.com/kdHub/d6f4ce9f34968d2807ae69101338026a

  1. Using Python 3 and pycaption 1.0.1, convert the SCC file above to WebVTT.

Context (Environment)

Issue causes parsing failures when packaging.

Detailed Description

From https://www.w3.org/TR/webvtt1/#webvtt-timestamp

A WebVTT timestamp consists of the following components, in the given order: Optionally (required if hours is non-zero): Two or more ASCII digits, representing the hours as a base ten integer.
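
The rule quoted above can be checked mechanically. The following is a minimal sketch, not part of pycaption: the helper name `is_valid_webvtt_timestamp` is hypothetical, and the regular expression is my reading of the timestamp grammar (optional hours of two or more digits, two-digit minutes and seconds in the range 00 to 59, three-digit milliseconds). It only checks the shape of the string, so it cannot enforce that hours are present when they are non-zero.

```python
import re

# Hypothetical helper for illustration; not a pycaption API.
# Hours are optional, but when present they must have two or more ASCII digits.
_WEBVTT_TIMESTAMP = re.compile(
    r"(?:(\d{2,}):)?"   # optional hours, at least two digits, then a colon
    r"([0-5]\d):"       # minutes, 00-59
    r"([0-5]\d)\."      # seconds, 00-59, then a full stop
    r"(\d{3})",         # milliseconds, exactly three digits
    re.ASCII,           # restrict \d to ASCII digits
)


def is_valid_webvtt_timestamp(text):
    """Return True if text has the shape of a WebVTT timestamp."""
    return _WEBVTT_TIMESTAMP.fullmatch(text) is not None
```

With this check, `01:00:00.000` and `00:59.999` are accepted, while the single-digit-hour form `1:00:00.000` reported in this issue is rejected.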

My recommendation is to always output the full 00:00:00.000 format, including the hours field; doing so fixes the parse error locally.
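
As a rough illustration of that recommendation, here is a standalone sketch that always emits the hours field. The function name `format_webvtt_timestamp` is hypothetical and this is not the pycaption implementation; it assumes the input is a non-negative integer number of microseconds. Unlike an approach based on `timedelta.seconds`, it works from the total duration, so hours past 23 are kept rather than wrapped.

```python
def format_webvtt_timestamp(microseconds):
    """Format a non-negative integer duration in microseconds as HH:MM:SS.mmm.

    Hypothetical helper for illustration; not a pycaption API.
    """
    total_ms = microseconds // 1000              # drop sub-millisecond precision
    total_seconds, ms = divmod(total_ms, 1000)
    total_minutes, ss = divmod(total_seconds, 60)
    hh, mm = divmod(total_minutes, 60)
    # %02d pads hours to at least two digits and lets them grow beyond 99
    return "%02d:%02d:%02d.%03d" % (hh, mm, ss, ms)
```

For example, `format_webvtt_timestamp(3_600_000_000)` gives `01:00:00.000` rather than `1:00:00.000`.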

Iulia-Mada commented 3 years ago

@kdHub Hi, I am looking into this issue, but unfortunately I cannot access the source file provided (https://gist.github.com/kdHub/d6f4ce9f34968d2807ae69101338026a), and it is not clear to me when this error occurs. Does the error occur in pycaption itself or in the context in which you use it? Thanks!

ana-nichifor commented 2 years ago

Thanks for reporting this. We added zero padding for 1-digit hours in pycaption 2.0.1.