Puyodead1 / udemy-downloader

A Udemy downloader that can download courses, with DRM support.
MIT License
1.26k stars 291 forks

[Bug] By default, the program converts downloaded subtitles from .vtt to .srt with a high probability of failure. #197

Closed: hughware closed this issue 7 months ago

hughware commented 8 months ago

What happened?

The subtitle file is empty when opened.

Expected Result

The subtitle content should be in the file after downloading, not nothing when you open it.

Branch

master/main

What operating systems are you seeing the problem on?

Windows

Relevant log output

No response

Other information

Here is my evidence. The problem appears to be caused by the program converting the original .vtt file to .srt. I don't know whether the .vtt is deleted too early or whether something else goes wrong. The subtitles obtained from the download are in .vtt format, which PotPlayer displays without problems, but the author presumably preferred .srt for convenience. During download, the program first converts the .vtt to .srt and then deletes the original .vtt, unless you explicitly keep it with the --keep-vtt option.

But when I ran it, the resulting .srt file was empty after the conversion.
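To make the suspected failure mode concrete, here is a minimal sketch of a convert-then-delete step that refuses to delete the .vtt when the conversion yields nothing. It is not the project's actual code: the function names `vtt_to_srt_text` and `convert_and_cleanup` and the `keep_vtt` parameter are hypothetical, and the converter is deliberately simplified (it drops cue settings and styling).

```python
import re
from pathlib import Path

# Matches a WebVTT timing line such as "00:01.000 --> 00:04.000" or
# "00:00:01.000 --> 00:00:04.000 align:start"; the hour field is optional.
TIMING = re.compile(
    r"^(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def vtt_to_srt_text(vtt_text: str) -> str:
    """Convert WebVTT text to SRT text (simplified, hypothetical helper)."""
    normalized = vtt_text.replace("\r\n", "\n").strip()
    cues = []
    # Cues are separated by blank lines; blocks without a timing line
    # (the WEBVTT header, NOTE blocks) are skipped.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            m = TIMING.match(line.strip())
            if not m:
                continue
            h1, m1, s1, ms1, h2, m2, s2, ms2 = m.groups()
            # SRT always writes hours and uses a comma before milliseconds.
            start = f"{int(h1 or 0):02d}:{m1}:{s1},{ms1}"
            end = f"{int(h2 or 0):02d}:{m2}:{s2},{ms2}"
            text = "\n".join(lines[i + 1:]).strip()
            if text:
                cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
            break
    return "\n".join(cues)


def convert_and_cleanup(vtt_path: Path, keep_vtt: bool = False) -> Path:
    """Write the .srt next to the .vtt; delete the .vtt only on success."""
    srt_text = vtt_to_srt_text(vtt_path.read_text(encoding="utf-8-sig"))
    if not srt_text.strip():
        # Nothing was converted: keep the source file instead of losing it.
        raise ValueError(f"Conversion produced no cues; keeping {vtt_path}")
    srt_path = vtt_path.with_suffix(".srt")
    srt_path.write_text(srt_text, encoding="utf-8")
    if not keep_vtt:
        vtt_path.unlink()
    return srt_path
```

With a guard like this, a failed conversion would leave the original .vtt on disk rather than an empty .srt and no source file.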

First, when I run: python main.py -c https://www.udemy.com/course/the-python-mega-course --keep-vtt, I don't get any .vtt subtitles, and there are no subtitles at all.

Then, when I run: python main.py -c <Course URL> --skip-lectures --download-captions, it may generate empty files. I will demonstrate this using a Python program later.

When I run: python main.py -c <Course URL> --skip-lectures --download-captions --keep-vtt, it works fine, and there is no issue.

I compared the English and Chinese subtitles by running the following commands:

python main.py -c <Course URL> --skip-lectures --download-captions
python main.py -c <Course URL> --skip-lectures --download-captions -l zh --keep-vtt

I then scanned the output directory for empty .srt files with this Python script:

import os


def check_empty_file(file_path):
    # Report the file if it contains nothing but whitespace.
    with open(file_path, 'r', encoding='utf-8', errors='replace') as file:
        content = file.read()
    if not content.strip():
        print(f"Empty file: {file_path}")


# Output directory used for the download.
directory = r"E:\out_dir\the-python-mega-course"

# Walk the whole tree and check every .srt file.
for root, dirs, files in os.walk(directory):
    for file in files:
        ext = os.path.splitext(file)[1]
        if ext.lower() == ".srt":
            check_empty_file(os.path.join(root, file))

[Image] As you can see, every empty file ends in _en.srt, and there are a lot of them.
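The same observation can be checked without a screenshot by tallying empty .srt files per language. This is only a sketch: the function name `count_empty_srt_by_language` is hypothetical, and it assumes caption files are named with a `_<lang>.srt` suffix, as the `_en.srt` files here suggest.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: caption files end in "_<lang>.srt", e.g. "..._en.srt".
LANG_SUFFIX = re.compile(r"_([A-Za-z-]+)\.srt$", re.IGNORECASE)


def count_empty_srt_by_language(directory):
    """Count .srt files that are empty or whitespace-only, keyed by language."""
    counts = Counter()
    for path in Path(directory).rglob("*.srt"):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        if text.strip():
            continue  # file has content, not part of the problem
        match = LANG_SUFFIX.search(path.name)
        counts[match.group(1).lower() if match else "unknown"] += 1
    return counts


if __name__ == "__main__":
    # Same output directory as in the report above.
    for lang, n in count_empty_srt_by_language(
        r"E:\out_dir\the-python-mega-course"
    ).items():
        print(f"{lang}: {n} empty .srt file(s)")
```

If only `en` shows up in the tally, the empty files come from the run without --keep-vtt and not from the zh run that kept the .vtt files.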