bhushanap / captomate

A repository that automates tiktok style captioned videos
GNU Lesser General Public License v3.0
5 stars 2 forks

Help & gradio version #2

Open otomay opened 1 month ago

otomay commented 1 month ago

Hey bro! Awesome project, I was looking for something like that for a long time!

First: I don't know why, but Gradio was not working with the version this repo uses. It threw pydantic.errors.PydanticSchemaGenerationError: Unable to generate pydantic-core schema for <class starlette.requests.Request> every time I clicked a button. So I installed a newer version with pip install -U gradio==4.43.0 and it worked.

And the request for help: could you add something like an "export settings" button to the UI, so we can use the config created there in a script? The idea is to use the UI as a builder while still having a way to run everything from a script.

It could be done with argument parsing, like python launch.py --video=video.mp4 --config=config.yml, or as a function to import in other modules:

from captomate import make_video

make_video("video.mp4", "config.yml")

I tried to understand the code and the pyonx lib, but it's pretty advanced for me (っ˘̩╭╮˘̩)っ otherwise I could help.

bhushanap commented 1 month ago

If I understand your request correctly, you can already export the config file by pressing either the Preview Subtitles or the Generate button. The file user/cfg/config.yml, which reflects the Gradio UI state, is saved as long as Gradio runs properly (regardless of any errors in the rest of the generation).

The make_video function you propose could have multiple input arguments (it can take audio, image, subtitle, etc). So instead of making it complex with variable arg parsing and then adding conditional checks, I decided to include all of the paths for these in the config.yml file.

If you want, you could make a custom function as follows:

import yaml
# load_yml and get_generate come from the captomate codebase (their imports are omitted here)

def make_video(video_path, config_path):
    # add the video path to the video input in the config file
    config_dict = load_yml.load_config(config_path)
    config_dict['video']['input'] = video_path
    with open(config_path, 'w') as yaml_file:
        yaml.dump(config_dict, yaml_file, default_flow_style=False)

    # add the video path to the gpath variable used by get_generate
    gpath = {'video': video_path}
    # gpath is used to reference gradio file-paths of the uploaded files in gradio
    # if you are not using audio and subtitle files, I don't think you would need any gpath['audio'], gpath['subtitle']
    # ex. say you are keeping the audio from the video and generating subtitles from the same audio
    get_generate('user', gpath)

make_video("video.mp4", "config.yml")

I haven't tried it, but something like this should work. Let me know if you meant something else, or have any issues with this.
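
The config-update step in the function above can be tried in isolation, without captomate or a YAML file. This is only a minimal sketch: the ['video']['input'] key comes from the snippet above, while the matching 'audio' and 'subtitle' sections, the 'font' entry and the helper name with_input_paths are assumptions made for illustration, not captomate's actual config layout.

```python
import copy


def with_input_paths(config, video=None, audio=None, subtitle=None):
    """Return a copy of config with the supplied input paths filled in.

    Only config['video']['input'] is taken from the thread; the 'audio'
    and 'subtitle' sections are assumed to follow the same shape.
    """
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    for section, path in (("video", video), ("audio", audio), ("subtitle", subtitle)):
        if path is not None:
            # Create the section if it is missing, then set its input path.
            updated.setdefault(section, {})["input"] = path
    return updated


# Hypothetical starting config, standing in for a loaded config.yml.
base = {"video": {"input": ""}, "font": "Arial"}
new_config = with_input_paths(base, video="video.mp4")
print(new_config["video"]["input"])
```

Working on a copy keeps the loaded config intact, so the same base config can be reused for several videos before anything is written back to disk.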

otomay commented 1 month ago

Hey, thanks for answering! I just tested it and can't get it to work. I left it like this:

config_dict = loadyml.load_config()
config_dict['video']['input'] = "video.mp4"
with open(os.path.join('config.yml'), 'w') as yaml_file:
  yaml.dump(params, yaml_file, default_flow_style=False)

#add video.mp4 to gpath variable used by get_generate
gpath = {'video':"video.mp4"}
# gpath is used to reference gradio file-paths of the uploaded files in gradio
# if you are not using audio and subtitle files, I don't think you would need any gpath['audio'], gpath['subtitle']
# ex. say you are keeping the audio from the video and generating subtitles from the same audio
get_generate('user',gpath)

and got TypeError: get_generate() missing 33 required positional arguments: 'radioS', 'vd', 'yt', 'res', 'vid', 'tts', 'spk', 'aud', 'sub', 'augment', 'font', 'tcolor', 'fontSize', 'outline', 'ocolor', 'osize', 'shadow', 'scolor', 'ssize', 'position', 'effect', 'words', 'chars', 'time', 'lines', 'vSpacing', 'effect_color', 'effect_outline_color', 'effect_scale', 'bright', 'con', 'sat', and 'out'

hehehe

I think I should pass the loaded config to get_generate somehow?
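
A TypeError like the one above can be narrowed down by asking Python which parameters a function requires and comparing them with what is being supplied. The sketch below uses the standard inspect module on a stand-in function; fake_generate borrows a few parameter names from the error message but is not the real get_generate, and the names 'user' and 'gpath' for its first two parameters are assumptions.

```python
import inspect


def missing_required(func, supplied):
    """List required parameters of func whose names are not in supplied."""
    required = [
        name
        for name, p in inspect.signature(func).parameters.items()
        if p.default is inspect.Parameter.empty
        and p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD, p.KEYWORD_ONLY)
    ]
    return [name for name in required if name not in supplied]


# Stand-in with a handful of names from the error message;
# it is NOT captomate's real get_generate.
def fake_generate(user, gpath, radioS, vd, font, out="out.mp4"):
    return None


missing = missing_required(fake_generate, {"user", "gpath"})
print(missing)
```

Running the same helper against the real function would show which values still have to come from the loaded config.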

bhushanap commented 1 month ago

Hi! I changed some files and added a script that does what was intended above.

Add captomate to your python path

export PYTHONPATH=$PYTHONPATH:~/captomate

Run the script

python3 -m captomate_script --video_path 'video.mp4' --config_path 'config.yml' --audio_path 'audio.mp3' --subtitle_path 'subtitle.srt'

Audio and subtitle are optional.
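
For anyone wanting to wrap a similar command line around their own code, the flag handling can be sketched with argparse. The four flag names are taken from the command above; everything else (treating --video_path and --config_path as required, the helper names build_parser and collect_paths) is an assumption and may differ from what captomate_script actually does.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags shown in the command above."""
    parser = argparse.ArgumentParser(description="Caption a video from a saved config (sketch).")
    # Assumed to be required; the thread only says audio and subtitle are optional.
    parser.add_argument("--video_path", required=True, help="input video file")
    parser.add_argument("--config_path", required=True, help="config.yml saved by the UI")
    # Optional inputs default to None when the flag is not given.
    parser.add_argument("--audio_path", default=None, help="optional audio file")
    parser.add_argument("--subtitle_path", default=None, help="optional subtitle file")
    return parser


def collect_paths(args):
    """Return only the media paths that were actually supplied."""
    candidates = {
        "video": args.video_path,
        "audio": args.audio_path,
        "subtitle": args.subtitle_path,
    }
    return {kind: path for kind, path in candidates.items() if path is not None}


# Parse an explicit argument list here instead of reading sys.argv.
args = build_parser().parse_args(["--video_path", "video.mp4", "--config_path", "config.yml"])
paths = collect_paths(args)
print(paths)
```

Dropping the unset entries means the resulting dict only names files that exist for this run, which matches the idea of a gpath that omits audio and subtitle when they are not used.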

This was tested by running with some files and seemed to work on my device. Let me know if you encounter any issues.