LAION-AI / Open-Assistant

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
https://open-assistant.io
Apache License 2.0

Update WizardLM dataset #3121

Open olliestanley opened 1 year ago

olliestanley commented 1 year ago

SFT-8 training is using a slightly less cleaned version of the WizardLM dataset.

Beyond SFT-8 we should replace it with the newer, more thoroughly cleaned version.

I think this is just a matter of changing the HF dataset ID

CloseChoice commented 1 year ago

Is this still a current issue? I tried to update the dataset, but the whole structure of the dataset has changed. I am also not sure whether the new version includes Vicuna data, which might result in duplicates when training with WizardLM + Vicuna.
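
One way to check the duplicate concern before switching datasets is to compare normalized prompts across the two sets. The following is only a rough sketch: the field names (`instruction`, `prompt`) and the toy rows are hypothetical placeholders, not the actual schema of either dataset, and this is not code from the Open-Assistant repository.

```python
import hashlib


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences do not hide duplicates."""
    return " ".join(text.lower().split())


def prompt_key(prompt: str) -> str:
    """Stable hash of the normalized prompt, used as a comparison key."""
    return hashlib.sha256(normalize(prompt).encode("utf-8")).hexdigest()


def find_overlap(dataset_a, dataset_b, field_a="instruction", field_b="instruction"):
    """Return the rows of dataset_b whose prompt also appears in dataset_a.

    field_a and field_b name the prompt column in each dataset; they are
    parameters because the two datasets may not share a schema.
    """
    keys_a = {prompt_key(row[field_a]) for row in dataset_a}
    return [row for row in dataset_b if prompt_key(row[field_b]) in keys_a]


# Toy rows with assumed (hypothetical) field names, standing in for real data.
wizardlm_rows = [
    {"instruction": "Explain recursion.", "output": "..."},
    {"instruction": "What is a hash map?", "output": "..."},
]
vicuna_rows = [
    {"prompt": "explain   recursion.", "answer": "..."},
    {"prompt": "Write a haiku about rain.", "answer": "..."},
]

overlap = find_overlap(wizardlm_rows, vicuna_rows, field_b="prompt")
print(f"{len(overlap)} overlapping prompt(s) found")
```

This only catches exact matches after normalization. Near-duplicates (paraphrased or truncated prompts) would need a fuzzier comparison.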