Closed pngwn closed 21 hours ago
Ohhh `HuggingFaceDatasetSaver` is cool!!!
I kinda regret implementing `HuggingFaceDatasetSaver`, as it is a clunky and shallow abstraction. It would have been better to let users write an arbitrary callback function for when the Flag button is clicked. Anyways, in this case, it's pretty straightforward to save the prompts passed into your `ChatInterface` function, something like:
```python
from datasets import Dataset

import gradio as gr

prompts = {'text': []}
dataset = Dataset.from_dict(prompts)

def get_response(prompt, history):
    global dataset
    # append the incoming prompt and push the updated dataset to the Hub
    dataset = dataset.add_item({'text': prompt})
    dataset.push_to_hub('your_username/your_text_dataset')
    return ...

demo = gr.ChatInterface(
    get_response,
)
demo.launch()
```
I would suggest that we make an example demo and share it, rather than including a relatively shallow abstraction that we will have to maintain.
@abidlabs let me try this.
Will there be any issue with asynchronicity? I'm not sure how multiple queries to `get_response` get handled in the backend.
To prevent issues with concurrency, by default, only 1 worker will be running `get_response()` at any given time (this can be changed by setting the `concurrency_limit` parameter of `gr.ChatInterface()`: https://www.gradio.app/docs/gradio/chatinterface). I.e. if one user is getting a submission back, all other users will be waiting in queue.
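Conceptually, that queueing behaves like a semaphore around the handler. A stdlib-only sketch of the idea (illustrative only, not Gradio's actual implementation; the function and names are made up):

```python
import threading

CONCURRENCY_LIMIT = 1  # mirrors ChatInterface's default
_slots = threading.Semaphore(CONCURRENCY_LIMIT)

def handle_request(prompt):
    # each request must grab a slot; with a limit of 1, every other
    # caller blocks here until the running request finishes
    with _slots:
        return f"response to {prompt!r}"  # stand-in for get_response
```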
This makes sense. There is some flexibility here, maybe by using streaming (and a vLLM backend enables async-ness). I was also wondering if vLLM implemented a saving method; both make sense on my side.
I think eventually we'll want to enable more than one worker; hopefully multiple people use our demos. I'll look.
Yes, `concurrency_limit=1` is just a default because often a machine will only have the resources to support a single user for ML demos, but in many demos (including the LMSYS chat arena, for example), this is increased to support more users at a time. In that case, you'll want to add a lock around the dataset to ensure no race conditions.
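A minimal sketch of that lock, using a plain Python list to stand in for the Dataset object (the names here are illustrative; in the real demo the `add_item`/`push_to_hub` calls would sit inside the same `with` block):

```python
import threading

prompts = []                     # stand-in for the HF Dataset
prompts_lock = threading.Lock()

def save_prompt(prompt):
    # serialize appends so concurrent get_response workers
    # can't interleave their dataset updates
    with prompts_lock:
        prompts.append({'text': prompt})
        # dataset = dataset.add_item({'text': prompt})
        # dataset.push_to_hub('your_username/your_text_dataset')

def get_response(prompt, history):
    save_prompt(prompt)
    return ...  # placeholder for the actual model call
```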
Another solution idea I have, given that I'm modifying the `ChatInterface` source, is to store all the conversations in an internal variable of the chat interface, and then save the prompts every N seconds with another process. Saving data outside of the `predict()` function seems best if I can pull it off, given we have GPUs to enable concurrency.
Or, if I really want this, I should just save the prompts locally in the `predict()` function, then periodically upload the results.
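A sketch of that pattern: buffer prompts in memory, append them to a local JSONL file on a timer, and do the Hub upload in the same flush step. The file name and helper names are made up, and the actual upload call (e.g. `HfApi().upload_file(...)` from `huggingface_hub`) is left as a comment so the sketch runs without credentials:

```python
import json
import threading

PENDING = []                    # prompts collected since the last flush
PENDING_LOCK = threading.Lock()
LOCAL_PATH = "prompts.jsonl"    # local buffer file (illustrative name)

def record_prompt(prompt):
    # called from inside predict(); cheap, no network I/O
    with PENDING_LOCK:
        PENDING.append({"text": prompt})

def flush_to_disk():
    # move the in-memory buffer into the local JSONL file
    with PENDING_LOCK:
        batch, PENDING[:] = list(PENDING), []
    with open(LOCAL_PATH, "a") as f:
        for row in batch:
            f.write(json.dumps(row) + "\n")
    # upload LOCAL_PATH here, e.g. with huggingface_hub's
    # HfApi().upload_file(...), so the network cost is off
    # the predict() hot path

def start_periodic_flush(interval_s=60):
    # daemon thread so it never blocks interpreter shutdown
    stop = threading.Event()
    def loop():
        while not stop.wait(interval_s):
            flush_to_disk()
    threading.Thread(target=loop, daemon=True).start()
    return stop  # call stop.set() to halt flushing
```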
Yup, agreed. I'm going to close this issue since I think this should be handled user-side. @pngwn feel free to reopen if you disagree.
I agree that it should be handled in userland conceptually, but I think we can make it easier somehow. I'll reopen if I can come up with a decent proposal.
**Is your feature request related to a problem? Please describe.**
@natolambert
**Describe the solution you'd like**
Some way to easily save the prompts passed into ChatInterface to a huggingface dataset. Maybe something like:
We already have an API like this for flagging, so if we could reuse the `HuggingFaceDatasetSaver`, that would be ideal.