Closed tomchang25 closed 1 year ago
🎉 The demo notebooks match the run.py files! 🎉
All the demos for this PR have been deployed at https://huggingface.co/spaces/gradio-pr-deploys/pr-3673-all-demos
Thanks @tomchang25 for this PR! I think adding support for subtitles for the Video
component is a good idea. It's been requested by the community frequently and it's good for accessibility.
I've reviewed the backend, generally looks good, left a few minor comments. My main concern is adding pysub2
as a dependency. We'd like to limit additional dependencies in the gradio package. Is there a way to support .srt
files without adding another dependency? If not, I think we should stick to whichever formats the browser supports.
cc @aliabid94 @pngwn for your thoughts, and could you guys take a look at the Svelte code?
Fix
My main concern is adding pysub2 as a dependency. We'd like to limit additional dependencies in the gradio package.
I used pysub2 so I can easily support other format in the future. I think I can rewrite it in a simple function because VTT and SRT converstion is pretty easy. As for other format like xml may need much time to deal with, but since these two format cover about 90% application. It's may not a big problem.
In addition, I find a bug that subtitles will not reset if changing the video and subtitles but has not changed the type of screen mode.
I will check this issue in the next few days.
Thanks @tomchang25 for addressing the issues above! Will take another look
Thanks for the PR @tomchang25, I think the community will really appreciate this one. My concern is with the API, in the the subtitle track is considered part of the component rather than part of the value itself. This is problematic because if a method returns a new value for the video without setting the subtitle track, it will actually use the previous subtitle track. The subtitle track should be couples to an individual video, not the video component. I believe the best solution would to allow the Video component to accept a Tuple, in which case the second value would be the subtitle track. (@abidlabs any thoughts on this?)
Happy to take over and make these changes to get this PR across the finish line.
Good catch @aliabid94. Agree that the proposed format is better, and it's kind of consistent to (images, alt_text) in the gr.Chatbot
component, for example.
Hello, @abidlabs
The Tuple version is now done. I believe most of the issues have been addressed. However, I am unable to fix the bug where the subtitles are not reset if the video is changed before resetting the subtitles. The subtitles will use the previous subtitle track in this case. For some unknown reason, if the video source is changed without setting the subtitle to None, it will cause an issue with the subtitle cannot be edited by the subtitle track component.
I'm not sure if this bug is caused by the browser or Svelte. Nonetheless, since this bug is not impacting the functionality significantly, it can be easily fixed by rescaling the video.
Perhaps we can create an issue post and merge it into the main branch first?
Thank you.
Let me take a look @tomchang25!
Hi @tomchang25 I tested the demo but it did not work for me. There were some, I believe accidental errors like passing the block
argument to launch()
. Even after fixing those, when I run the demo, I get a blank white screen:
Does the demo work for you locally? Maybe there were some local changes that you haven't pushed yet?
Hi, @abidlabs
Thank you for your feedback. I have reviewed all the files and confirmed that there are no missing files. Additionally, I have cloned the pure branch, reinstalled it, and rebuilt the front-end, and it appears to be working fine on my end. However, I understand that the issue may be specific to my local environment, which is running Windows 10.
In any case, I have modified the demo to use gr.Interface. I hope this resolves the issue.
Hmm interesting I'm testing on Windows as well. Can you try running this example and let me know if it works for you?
import gradio as gr
gr.Interface(lambda x:x, "video", "video").launch()
When I submit a video, I get the following error:
Traceback (most recent call last):
File "c:\users\islam\dev\gradio-repos\gradio\gradio\routes.py", line 395, in run_predict
output = await app.get_blocks().process_api(
File "c:\users\islam\dev\gradio-repos\gradio\gradio\blocks.py", line 1191, in process_api
inputs = self.preprocess_data(fn_index, inputs, state)
File "c:\users\islam\dev\gradio-repos\gradio\gradio\blocks.py", line 1039, in preprocess_data
processed_input.append(block.preprocess(inputs[i]))
File "c:\users\islam\dev\gradio-repos\gradio\gradio\components.py", line 2038, in preprocess
video = x[0]
KeyError: 0
I have checked another PC that has never had Gradio installed before, and the components, subtitles, and Gr.Interface demos are all working fine.
Have you remembered to update the front-end? I made a significant change to the front-end value format from FileData | null to Array(FileData) | null. It seems like the front-end is still returning FileData.
Hi @tomchang25 you're completely right, my bad! I was rebuilding the frontend, but the job was silently failing and didn't realize it! The original video demos as well as the demo that you added work well.
There is still an issue around serialization with the new format (you can try this by flagging a video with subtitles). I'll fix that and then I think we should be good to merge.
Great, @abidlabs . I really appreciate you taking the time to troubleshoot and fix the issue. By the way, once you've finished merging everything, I'll create an issue post about the subtitle not resetting when changing the video source. Thanks again for your help!
All right, I think everything is good to go now with this PR! I did some additional cleanup as well. I'll leave it open for a few hours in case @aliabid94 or @pngwn would like to give it another look, or if you'd like to review it as well @tomchang25
Thank you for keeping the work going for a few more hours. I've reviewed the component section and it looks much better now. Nice job! I'm glad to hear that the PR has been completed.
I'm guessing the browser does some caching, lemme take a look
Also is the multiline supposed to render the subtitles in different lines? I see visible \n
in the captions
Also is the multiline supposed to render the subtitles in different lines? I see visible \n in the captions
Oh yeah I noticed that as well. Forgot to mention. Not sure if this is really needed? I feel like new lines could clutter the UI
? I feel like new lines could clutter the UI
If a user intends to make something multiline, it should be. Looking into this.
Actually its as simple as just adding an actual new line in the subtitle file, instead of using \n
Believe I fixed the subtitles issue, I think it was an issue with chrome if you changed video srcs for an element, the subtitles of the previous video would freeze onto the new video. Came up with a (hacky?) solution to clear the element between rerenders, @pngwn if you can give feedback when you get back on this commit, that'd be great.
Other than a few comments I left above, this PR lgtm!
I'll go ahead and make the changes I requested.
Merged! Thanks @tomchang25
Thanks @tomchang25 for initiating this and thanks @aliabid94 for fixing the remaining issues. Solid PR!
Description
Add subtitle feature to video, remain in draft until enough discussion.
Closes: #2945
Checklist:
A note about the CHANGELOG
Hello 👋 and thank you for contributing to Gradio!
All pull requests must update the change log located in CHANGELOG.md, unless the pull request is labeled with the "no-changelog-update" label.
Please add a brief summary of the change to the Upcoming Release > Full Changelog section of the CHANGELOG.md file and include a link to the PR (formatted in markdown) and a link to your github profile (if you like). For example, "* Added a cool new feature by
[@myusername](link-to-your-github-profile)
in[PR 11111](https://github.com/gradio-app/gradio/pull/11111)
".If you would like to elaborate on your change further, feel free to include a longer explanation in the other sections. If you would like an image/gif/video showcasing your feature, it may be best to edit the CHANGELOG file using the GitHub web UI since that lets you upload files directly via drag-and-drop.