Open KarimReefat opened 5 years ago
I am sorry, I still don't know enough about Git to add this code to the repository, so here is my code to fix the problem with the caption order numbers:
import re

num = 0

def order_number(matchobj):
    """Replace '.' with ',' in the timecode and prepend the sequence number."""
    global num
    num += 1
    return '{0}\n'.format(num) + matchobj.group(0).replace('.', ',')

def convert_content(file_contents):
    """Convert the content of a vtt file to srt format.

    Keyword arguments:
    file_contents -- the text of the vtt file
    """
    # Number the timecode lines; the dot before the milliseconds is escaped so it only matches '.'.
    replacement = re.sub(r"(\d\d:\d\d:\d\d)\.(\d\d\d) --> (\d\d:\d\d:\d\d)\.(\d\d\d)(?:[ \-\w]+:[\w\%\d:]+)*\n", order_number, file_contents)
    replacement = re.sub(r"(\d\d:\d\d)\.(\d\d\d) --> (\d\d:\d\d)\.(\d\d\d)(?:[ \-\w]+:[\w\%\d:]+)*\n", order_number, replacement)
    replacement = re.sub(r"(\d\d)\.(\d\d\d) --> (\d\d)\.(\d\d\d)(?:[ \-\w]+:[\w\%\d:]+)*\n", order_number, replacement)
    # Strip the WebVTT header lines, inline tags, and style blocks.
    replacement = re.sub(r"WEBVTT\n", "", replacement)
    replacement = re.sub(r"Kind:[ \-\w]+\n", "", replacement)
    replacement = re.sub(r"Language:[ \-\w]+\n", "", replacement)
    replacement = re.sub(r"<c[.\w\d]*>", "", replacement)
    replacement = re.sub(r"</c>", "", replacement)
    replacement = re.sub(r"<\d\d:\d\d:\d\d\.\d\d\d>", "", replacement)
    replacement = re.sub(r"::[\-\w]+\([\-.\w\d]+\)[ ]*{[.,:;\(\) \-\w\d]+\n }\n", "", replacement)
    replacement = re.sub(r"Style:\n##\n", "", replacement)
    return replacement
Correct me as much as you want.
Hi Karim. That's nice! The only thing I can point out to improve your code is that the variable `num` should not be global.
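One possible way to avoid the global, sketched here under some assumptions: keep the counter local to the conversion call and let a nested callback advance it. The names `number_cues` and `TIMECODE` are hypothetical, and the pattern is a simplified version of the one posted above that only covers `hh:mm:ss.mmm` timecodes.

```python
import itertools
import re

# Simplified, hypothetical pattern: only hh:mm:ss.mmm timecodes, no cue settings.
TIMECODE = re.compile(r"(\d\d:\d\d:\d\d)\.(\d\d\d) --> (\d\d:\d\d:\d\d)\.(\d\d\d)\n")


def number_cues(file_contents):
    """Prefix each timecode line with a sequence number and switch '.' to ','."""
    counter = itertools.count(1)  # local to this call, so every file starts at 1

    def order_number(matchobj):
        # next(counter) replaces the global `num += 1`.
        return "{0}\n".format(next(counter)) + matchobj.group(0).replace(".", ",")

    return TIMECODE.sub(order_number, file_contents)
```

Because the counter lives inside the call, converting a second file in the same process starts numbering from 1 again, which a module-level `num` would not do.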
1- According to other web pages, srt files should have a caption sequence number before the timecode, like this:
5
00:00:16,920 --> 00:00:22,470
You can think of it as the opposite to call the So-Cal

6
00:00:22,470 --> 00:00:27,750
other devices middes light a will is designed so that you hack into it.

7
00:00:27,750 --> 00:00:31,060
So it's designed for people who want to learn.
But this does not happen when I use your vtt-to-srt library.
It can be avoided by using this library instead: https://github.com/lbrayner/vtt-to-srt
2- Is there any problem with using the code from that library to convert the vtt files? It already uses the html2text, pysrt, and webvtt-py libraries to do this.
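For reference, the cue layout described in point 1 (sequence number, then timecode, then text, with a blank line between cues) can be sketched as a small formatter. This is only an illustration of the expected output shape; `format_srt_cue` is a hypothetical helper and not part of either library.

```python
def format_srt_cue(index, start, end, text):
    """Return one SRT cue: sequence number, timecode line, text, blank line."""
    return "{0}\n{1} --> {2}\n{3}\n\n".format(index, start, end, text)


# Hypothetical cue data using timecodes from the example above.
cues = [
    ("00:00:22,470", "00:00:27,750", "first caption"),
    ("00:00:27,750", "00:00:31,060", "second caption"),
]

# enumerate(..., start=1) supplies the sequence numbers that were missing.
srt_text = "".join(
    format_srt_cue(i, start, end, text)
    for i, (start, end, text) in enumerate(cues, start=1)
)
```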