Closed shriakhilc closed 7 years ago
Also, could someone expand on the purpose of the program? "Generate a summary of any video through its subtitles." is quite short and vague.
From my tests, I saw that when the `-i` and `-s` parameters are used on a video of around 30 minutes, the program compresses it to a video of around 1 minute by omitting a lot of scenes. Documenting how the scenes in the final video are selected would let developers confirm whether the output is as expected, without having to crawl through and understand the actual code that does the compression. (Though the code is quite small in this case, so reading it manually is possible.)
On the other hand, the `-u` flag takes a YouTube video URL and downloads that video, plus subtitles in either `srt` or `vtt` format. But it doesn't actually generate a summary video. And since the subtitles must be in `srt` format for the `-s` flag, I don't see why it downloads the `vtt` file at all.
Thanks in advance for explaining. It will help me make PRs and changes more efficiently, and in line with your idea of how the program should be used rather than my vague interpretation of it.
Fixed in PR #27
I tried running the program as described in the README with a sample `.avi` file and a `.srt` file (both English). The first error I faced was a `UnicodeDecodeError`, the error log of which is as follows:
After checking out the pysrt README, I realized it could be because of an encoding mismatch. Opening the file with `pysrt.open(srt_filename, encoding='iso-8859-1')` fixed this error. One easy way to fix this in general is to first use the `chardet` module to detect the encoding, and then pass that to pysrt.
Next, I also faced a `LookupError` due to the NLTK tokenizer not being able to find `punkt`. I recommend adding this as a requirement, along with the `youtube_dl` library, which was needed when the `-u` flag was being used. Are you developing primarily on Linux? That might explain why you didn't notice the need, since most of these might be pre-installed there.
I can make the necessary changes and send a pull request in a couple of hours. Will that be fine?
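A rough sketch of the two fixes suggested above, not the project's actual code. The helper names `detect_encoding`, `open_subtitles` and `ensure_punkt` are hypothetical, and the fallback list of encodings is an assumption. `chardet` is treated as optional here, so the sketch still picks a usable encoding if it is not installed.

```python
from pathlib import Path


def detect_encoding(raw: bytes, fallbacks=("utf-8", "iso-8859-1")) -> str:
    """Return the name of an encoding that can decode `raw` without errors."""
    candidates = []
    try:
        import chardet  # third-party: pip install chardet

        guess = chardet.detect(raw).get("encoding")
        if guess:
            candidates.append(guess)
    except ImportError:
        pass  # no chardet available: rely on the fixed candidates below

    candidates.extend(fallbacks)
    for name in candidates:
        try:
            raw.decode(name)
            return name
        except (UnicodeDecodeError, LookupError):
            continue  # wrong guess or unknown codec name, try the next one
    return "iso-8859-1"  # maps every byte value, so decoding cannot fail


def open_subtitles(srt_filename):
    """Open a subtitle file with pysrt using the detected encoding."""
    import pysrt  # third-party: pip install pysrt

    raw = Path(srt_filename).read_bytes()
    return pysrt.open(srt_filename, encoding=detect_encoding(raw))


def ensure_punkt():
    """Download the NLTK punkt tokenizer data only if it is missing."""
    import nltk  # third-party: pip install nltk

    try:
        nltk.data.find("tokenizers/punkt")
    except LookupError:
        nltk.download("punkt")
```

Calling `ensure_punkt()` once at startup and replacing the direct `pysrt.open(...)` call with `open_subtitles(srt_filename)` should be enough to avoid both errors, assuming the rest of the program only needs the parsed subtitles.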