stanford-crfm / mistral

Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging Face 🤗 Transformers.
Apache License 2.0
562 stars, 49 forks

Make local dataset configurable #112

Closed: teetone closed this 2 years ago

teetone commented 2 years ago

Make the local dataset configurable, and fail fast if the files for the train or validation split are missing.
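A minimal sketch of what such a fail-fast check could look like, not the code in this pull request. The function name `resolve_local_dataset_files` and the idea of a split-to-path mapping are assumptions made for illustration; the actual configuration interface was still under discussion.

```python
from pathlib import Path
from typing import Dict, Mapping

# Hypothetical helper: the name and the split-to-path mapping are assumptions
# for illustration, not part of the Mistral codebase.
REQUIRED_SPLITS = ("train", "validation")


def resolve_local_dataset_files(data_files: Mapping[str, str]) -> Dict[str, Path]:
    """Validate the configured split files before any training work starts."""
    resolved: Dict[str, Path] = {}
    problems = []
    for split in REQUIRED_SPLITS:
        raw = data_files.get(split)
        if not raw:
            problems.append(f"{split}: no path configured")
            continue
        path = Path(raw).expanduser()
        if not path.is_file():
            problems.append(f"{split}: file not found at {path}")
            continue
        resolved[split] = path
    # Report every missing split at once instead of stopping at the first one.
    if problems:
        raise FileNotFoundError("Local dataset is incomplete: " + "; ".join(problems))
    return resolved
```

Calling this at configuration-load time means a missing validation file surfaces immediately, rather than at the first evaluation step of a long run.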

J38 commented 2 years ago

We should finalize the design and interface choices first; I can then put together a new pull request based on them.

Let's move the discussion about the best interface to #111, please.