gradio-app / gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
http://www.gradio.app
Apache License 2.0
30.37k stars 2.26k forks source link

Adding subtitles to `gr.Video` #3673

Closed tomchang25 closed 1 year ago

tomchang25 commented 1 year ago

Description

Add subtitle feature to video, remain in draft until enough discussion.

Closes: #2945

Checklist:

A note about the CHANGELOG

Hello 👋 and thank you for contributing to Gradio!

All pull requests must update the change log located in CHANGELOG.md, unless the pull request is labeled with the "no-changelog-update" label.

Please add a brief summary of the change to the Upcoming Release > Full Changelog section of the CHANGELOG.md file and include a link to the PR (formatted in markdown) and a link to your github profile (if you like). For example, "* Added a cool new feature by [@myusername](link-to-your-github-profile) in [PR 11111](https://github.com/gradio-app/gradio/pull/11111)".

If you would like to elaborate on your change further, feel free to include a longer explanation in the other sections. If you would like an image/gif/video showcasing your feature, it may be best to edit the CHANGELOG file using the GitHub web UI since that lets you upload files directly via drag-and-drop.

gradio-pr-bot commented 1 year ago

🎉 The demo notebooks match the run.py files! 🎉

gradio-pr-bot commented 1 year ago

All the demos for this PR have been deployed at https://huggingface.co/spaces/gradio-pr-deploys/pr-3673-all-demos

abidlabs commented 1 year ago

Thanks @tomchang25 for this PR! I think adding support for subtitles for the Video component is a good idea. It's been requested by the community frequently and it's good for accessibility.

I've reviewed the backend, generally looks good, left a few minor comments. My main concern is adding pysub2 as a dependency. We'd like to limit additional dependencies in the gradio package. Is there a way to support .srt files without adding another dependency? If not, I think we should stick to whichever formats the browser supports.

cc @aliabid94 @pngwn for your thoughts, and could you guys take a look at the Svelte code?

tomchang25 commented 1 year ago

Fix

My main concern is adding pysub2 as a dependency. We'd like to limit additional dependencies in the gradio package.

I used pysub2 so I can easily support other format in the future. I think I can rewrite it in a simple function because VTT and SRT converstion is pretty easy. As for other format like xml may need much time to deal with, but since these two format cover about 90% application. It's may not a big problem.

In addition, I find a bug that subtitles will not reset if changing the video and subtitles but has not changed the type of screen mode.

2023-03-30_231621

I will check this issue in the next few days.

abidlabs commented 1 year ago

Thanks @tomchang25 for addressing the issues above! Will take another look

aliabid94 commented 1 year ago

Thanks for the PR @tomchang25, I think the community will really appreciate this one. My concern is with the API, in the the subtitle track is considered part of the component rather than part of the value itself. This is problematic because if a method returns a new value for the video without setting the subtitle track, it will actually use the previous subtitle track. The subtitle track should be couples to an individual video, not the video component. I believe the best solution would to allow the Video component to accept a Tuple, in which case the second value would be the subtitle track. (@abidlabs any thoughts on this?)

Happy to take over and make these changes to get this PR across the finish line.

abidlabs commented 1 year ago

Good catch @aliabid94. Agree that the proposed format is better, and it's kind of consistent to (images, alt_text) in the gr.Chatbot component, for example.

tomchang25 commented 1 year ago

Hello, @abidlabs

The Tuple version is now done. I believe most of the issues have been addressed. However, I am unable to fix the bug where the subtitles are not reset if the video is changed before resetting the subtitles. The subtitles will use the previous subtitle track in this case. For some unknown reason, if the video source is changed without setting the subtitle to None, it will cause an issue with the subtitle cannot be edited by the subtitle track component.

I'm not sure if this bug is caused by the browser or Svelte. Nonetheless, since this bug is not impacting the functionality significantly, it can be easily fixed by rescaling the video.

Perhaps we can create an issue post and merge it into the main branch first?

Thank you.

abidlabs commented 1 year ago

Let me take a look @tomchang25!

abidlabs commented 1 year ago

Hi @tomchang25 I tested the demo but it did not work for me. There were some, I believe accidental errors like passing the block argument to launch(). Even after fixing those, when I run the demo, I get a blank white screen:

image

Does the demo work for you locally? Maybe there were some local changes that you haven't pushed yet?

tomchang25 commented 1 year ago

Hi, @abidlabs

Thank you for your feedback. I have reviewed all the files and confirmed that there are no missing files. Additionally, I have cloned the pure branch, reinstalled it, and rebuilt the front-end, and it appears to be working fine on my end. However, I understand that the issue may be specific to my local environment, which is running Windows 10.

In any case, I have modified the demo to use gr.Interface. I hope this resolves the issue.

abidlabs commented 1 year ago

Hmm interesting I'm testing on Windows as well. Can you try running this example and let me know if it works for you?

import gradio as gr

gr.Interface(lambda x:x, "video", "video").launch()

When I submit a video, I get the following error:


Traceback (most recent call last):
  File "c:\users\islam\dev\gradio-repos\gradio\gradio\routes.py", line 395, in run_predict
    output = await app.get_blocks().process_api(
  File "c:\users\islam\dev\gradio-repos\gradio\gradio\blocks.py", line 1191, in process_api
    inputs = self.preprocess_data(fn_index, inputs, state)
  File "c:\users\islam\dev\gradio-repos\gradio\gradio\blocks.py", line 1039, in preprocess_data
    processed_input.append(block.preprocess(inputs[i]))
  File "c:\users\islam\dev\gradio-repos\gradio\gradio\components.py", line 2038, in preprocess
    video = x[0]
KeyError: 0
tomchang25 commented 1 year ago

I have checked another PC that has never had Gradio installed before, and the components, subtitles, and Gr.Interface demos are all working fine.

Have you remembered to update the front-end? I made a significant change to the front-end value format from FileData | null to Array(FileData) | null. It seems like the front-end is still returning FileData.

abidlabs commented 1 year ago

Hi @tomchang25 you're completely right, my bad! I was rebuilding the frontend, but the job was silently failing and didn't realize it! The original video demos as well as the demo that you added work well.

There is still an issue around serialization with the new format (you can try this by flagging a video with subtitles). I'll fix that and then I think we should be good to merge.

tomchang25 commented 1 year ago

Great, @abidlabs . I really appreciate you taking the time to troubleshoot and fix the issue. By the way, once you've finished merging everything, I'll create an issue post about the subtitle not resetting when changing the video source. Thanks again for your help!

abidlabs commented 1 year ago

All right, I think everything is good to go now with this PR! I did some additional cleanup as well. I'll leave it open for a few hours in case @aliabid94 or @pngwn would like to give it another look, or if you'd like to review it as well @tomchang25

tomchang25 commented 1 year ago

Thank you for keeping the work going for a few more hours. I've reviewed the component section and it looks much better now. Nice job! I'm glad to hear that the PR has been completed.

aliabid94 commented 1 year ago

I'm guessing the browser does some caching, lemme take a look

aliabid94 commented 1 year ago

Also is the multiline supposed to render the subtitles in different lines? I see visible \n in the captions

Screen Shot 2023-04-11 at 1 58 41 PM
abidlabs commented 1 year ago

Also is the multiline supposed to render the subtitles in different lines? I see visible \n in the captions

Oh yeah I noticed that as well. Forgot to mention. Not sure if this is really needed? I feel like new lines could clutter the UI

aliabid94 commented 1 year ago

? I feel like new lines could clutter the UI

If a user intends to make something multiline, it should be. Looking into this.

aliabid94 commented 1 year ago

Actually its as simple as just adding an actual new line in the subtitle file, instead of using \n

Screen Shot 2023-04-11 at 2 24 38 PM
aliabid94 commented 1 year ago

Believe I fixed the subtitles issue, I think it was an issue with chrome if you changed video srcs for an element, the subtitles of the previous video would freeze onto the new video. Came up with a (hacky?) solution to clear the element between rerenders, @pngwn if you can give feedback when you get back on this commit, that'd be great.

Other than a few comments I left above, this PR lgtm!

aliabid94 commented 1 year ago

I'll go ahead and make the changes I requested.

aliabid94 commented 1 year ago

Merged! Thanks @tomchang25

abidlabs commented 1 year ago

Thanks @tomchang25 for initiating this and thanks @aliabid94 for fixing the remaining issues. Solid PR!