Closed abidlabs closed 2 years ago
Is this supposed to be like a fake gif? Rendering a series of images one after the other with a small delay between them?
Yeah, somewhat like that. The basic idea is to render a sequence of images (or, more generally, any output) as they are returned by a function, instead of returning a GIF all at once at the end. You could imagine a loop that returns 100 images, but takes a while to generate each one. In Python, you have generators, which can yield multiple items, as opposed to a regular function, which returns a single item. We could potentially use that in the backend. In the frontend, the interface would have to make repeated calls to the backend and update the output component each time.
The other common use case we've heard about is generating images with GANs, which is a slow and sequential process, and one can return "intermediate" images as a cool visualization.
Starting to feel that we should be setting up a websocket connection with the backend rather than polling. It would make stuff like this much easier for realtime communication between client + server.
Potentially related: https://discuss.huggingface.co/t/can-we-print-run-time-outputs/15810/2
Would love to use this for the min-dalle space: https://huggingface.co/spaces/kuprel/min-dalle
Here's how that model works with progressive outputs: https://replicate.com/kuprel/min-dalle
We should also be mindful to support all components (e.g. Text) and not just images. Comment from @NotNANtoN :
Comment from https://github.com/gradio-app/gradio/issues/1973#issuecomment-1208338917
Given how much gradio is being used now with the explosion of stable diffusion, I would love for this to be included in future releases. Seeing a loading progress bar is one thing, but seeing your image getting updated is a whole other experience (Midjourney bot has something similar).
Yess absolutely, we'll prioritize this next week!
Ok so here's what I'm thinking for iterative outputs.
If a user returns a generator (a function with a yield statement instead of return), then Gradio automatically produces an iterative output component.
For example, the following code:
images = ["im1.jpg", "img2.jpg", "img3.jpg"]

def gan():
    # Yield one image at a time so each can be displayed as soon as it is ready.
    for image in images:
        yield image
This has several advantages over coming up with a Gradio-specific iteration method:
next()
Iterative outputs will require the queue to be enabled. Queueing will become the default anyway in 3.3 as a result of #2081.
The basic idea is that when the frontend gets the {"msg": "send_data"} message, it will send the user's data as usual to the backend. When a queue processes a regular function (inspect.isgeneratorfunction is False), then it simply returns the standard message:
{"msg": "process_completed", "output"...}
However, when the queue processes a generator, it continues iterating until the generator terminates. At each step of iteration (before termination), the queue sends the following message back:
{"msg": "process_iterating", "output"...}
And then calls the prediction function again (using next() repeatedly on the generator) until termination, when it sends the standard message back to the frontend:
{"msg": "process_completed", "output"...}
When the frontend receives a "process_iterating" message, it keeps the websocket connection alive and simply updates the output components with the intermediate data.
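To make the message flow above concrete, here is a rough sketch in plain Python. It is not Gradio's queue code: `queue_messages` is a hypothetical helper, and sending the last yielded value with the final "process_completed" message is an assumption, since the proposal does not say what the final output should be.

```python
import inspect


def queue_messages(fn, *args):
    """Yield the messages a queue worker might send for one event.

    Hypothetical helper for illustration only, not part of Gradio.
    """
    if not inspect.isgeneratorfunction(fn):
        # Regular function: a single standard message.
        yield {"msg": "process_completed", "output": fn(*args)}
        return

    last = None
    for value in fn(*args):  # the for loop calls next() until StopIteration
        last = value
        yield {"msg": "process_iterating", "output": value}

    # Assumption: the final message repeats the last intermediate output.
    yield {"msg": "process_completed", "output": last}


def gan():
    for image in ["im1.jpg", "img2.jpg", "img3.jpg"]:
        yield image


messages = list(queue_messages(gan))
```

With the three-image generator, this produces three "process_iterating" messages followed by one "process_completed" message; a regular function produces only the latter.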
Regarding "when the queue processes a generator, then it continues iterating until the generator terminates" in the iterative output implementation proposal: how do we do that given that the queue processes events by sending a POST request to the predict endpoint?
Yes, good point. The iteration will have to happen inside the Blocks.run_predict() function, I think specifically inside the Blocks.call_function() method, which will be slightly modified if the function is actually a generator. If it is a generator, then it will return a flag which tells the queueing endpoint to keep calling the /api/predict endpoint. The current "state" of the generator will have to be stored for every user, and that can be done with the "session_hash" key. I think that should do it?
I would also strongly support these features. Is there a workaround for the time being?
No real workaround besides assembling e.g. all of your output images and displaying them as a GIF at the end.
But we're working on it and should have a prototype this week!
We've heard this request in the past, and just heard this from a user at Hugging Face, who wanted to update the images in an image output component iteratively.
From the user:
Space they linked: https://huggingface.co/spaces/edbeeching/atari_live_model
Can we support this in Blocks?