nomadkaraoke / python-lyrics-transcriber

Automatically create synchronised lyrics files in ASS and MidiCo LRC formats with word-level timestamps, using Whisper and lyrics from Genius and Spotify, using LLMs / GPT-4 to correct transcribed lyrics
MIT License

using video_background_image, the output video size is 1920x1072 #11

Closed Hwangkop closed 3 months ago

Hwangkop commented 3 months ago

With video_resolution="1080p" and a video_background_image supplied, the output video size is 1920x1072.

Environment: Windows 11, Python 3.10.14, python-lyrics-transcriber: latest

beveradb commented 3 months ago

Hmm, I'm not really sure how that can be possible 😅 See the code here: https://github.com/nomadkaraoke/python-lyrics-transcriber/blob/main/lyrics_transcriber/transcriber.py#L117

Can you provide an example of your inputs and the debug logs from running it? Also, the resolution of the image you use as the background image is important!

Hwangkop commented 3 months ago

The output is fine when video_background_image is not used. Here are the cache and logs:

logs.txt

lyrics-transcriber-cache.zip

beveradb commented 3 months ago

The issue is your background image isn't the right resolution - it's 1920 × 1281 rather than 1920 × 1080 😄

Whatever image you use for the background has to be exactly the right resolution for the output video for good results.
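For anyone who wants to check their image before rendering, here is a minimal sketch of the arithmetic for fitting an arbitrary background onto the output frame by scaling to cover and then centre-cropping. The function name `cover_scale_and_crop` is hypothetical and is not part of python-lyrics-transcriber; it only computes the geometry and does not touch any image file.

```python
def cover_scale_and_crop(src_w, src_h, dst_w, dst_h):
    """Geometry for a 'scale to cover, then centre-crop' fit.

    Hypothetical helper, not part of python-lyrics-transcriber.
    Returns (scaled_w, scaled_h, crop_x, crop_y): the size to scale the
    source image to, and the top-left offset of a dst_w x dst_h crop.
    """
    if min(src_w, src_h, dst_w, dst_h) <= 0:
        raise ValueError("all dimensions must be positive")

    # Pick the larger ratio so the scaled image covers the target on both axes.
    scale = max(dst_w / src_w, dst_h / src_h)

    # Guard against rounding leaving the image one pixel short of the target.
    scaled_w = max(dst_w, round(src_w * scale))
    scaled_h = max(dst_h, round(src_h * scale))

    # Centre the crop window inside the scaled image.
    crop_x = (scaled_w - dst_w) // 2
    crop_y = (scaled_h - dst_h) // 2
    return scaled_w, scaled_h, crop_x, crop_y


def matches_target(src_w, src_h, dst_w, dst_h):
    """True when the image already has exactly the target resolution."""
    return (src_w, src_h) == (dst_w, dst_h)


if __name__ == "__main__":
    # The 1920 x 1281 image from this issue, fitted to a 1080p frame.
    print(matches_target(1920, 1281, 1920, 1080))
    print(cover_scale_and_crop(1920, 1281, 1920, 1080))
```

For the 1920 × 1281 image above, no scaling is needed and the crop removes roughly 100 rows from the top and bottom to reach 1920 × 1080.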

Hwangkop commented 3 months ago

So even when the resize_background_image function is used, the background image's resolution still affects the size of the output video?

beveradb commented 3 months ago

Honestly, I'm not sure 😅 For my use case I've been using a 4K background image with the right resolution, and it works well for me, so I haven't tested thoroughly with other input images (this repo is a hobby project).

Please give it a try with different resolution background images and see if you can figure out how to make it more robust at handling different inputs - PRs very much welcome 😄

Hwangkop commented 3 months ago

This is a nice project, and your other projects are also very good. I will try background images with different resolutions later. Thanks for your reply!

Hwangkop commented 3 months ago

On Windows the default pixel format is yuv444p, and the file properties show 1920x1072. When the pixel format is set to yuv420p, the properties show 1920x1080. Adding "-pix_fmt", "yuv420p" to the command solved my problem, but I don't have other devices to test on, so I won't submit a PR.
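In case it helps anyone applying the same workaround, here is a rough sketch of adding the flag to an ffmpeg argument list before passing it to subprocess. The helper `with_pixel_format` and the example command are assumptions for illustration and are not the actual command built in transcriber.py; the sketch also assumes the output path is the last argument.

```python
def with_pixel_format(ffmpeg_args, pix_fmt="yuv420p"):
    """Return a copy of ffmpeg_args with '-pix_fmt <pix_fmt>' as an output option.

    Hypothetical helper; assumes the last element is the output file path.
    """
    args = list(ffmpeg_args)
    if len(args) < 2:
        raise ValueError("expected at least the ffmpeg binary and an output path")

    if "-pix_fmt" in args[:-1]:
        # The flag is already present: overwrite its value instead of duplicating it.
        idx = args.index("-pix_fmt")
        args[idx + 1] = pix_fmt
        return args

    # Output options must come immediately before the output path.
    return args[:-1] + ["-pix_fmt", pix_fmt] + args[-1:]


if __name__ == "__main__":
    # Illustrative command only: file names and codec choice are made up.
    cmd = [
        "ffmpeg", "-y",
        "-loop", "1", "-i", "background.png",
        "-i", "song.flac",
        "-c:v", "libx264",
        "-shortest",
        "output.mp4",
    ]
    print(with_pixel_format(cmd))
    # To actually run it: subprocess.run(with_pixel_format(cmd), check=True)
```

This has only been reasoned through for the single-output case described in this issue; a command with several outputs would need the flag placed before each output path.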