Y-IAB / axolotl

Go ahead and axolotl questions
Apache License 2.0

Add Support for Usage of Puree Dataset on Training #3

Closed rifqiyan closed 8 months ago

rifqiyan commented 8 months ago

⚠️ Please check that this feature request hasn't been suggested before.

🔖 Feature description

The current GCS dataset schema (gs:// and gcs://) doesn't work for Puree, because Axolotl expects the path to point to a saved Hugging Face dataset (one written with the save_to_disk method from the datasets library).
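To make the mismatch concrete, here is a rough sketch of how one could check whether a local directory looks like save_to_disk output. It relies on the marker files that the datasets library writes (state.json for a single Dataset, dataset_dict.json for a DatasetDict). The function name is made up for illustration and is not part of Axolotl.

```python
from pathlib import Path

# Marker files written by save_to_disk:
# state.json for a Dataset, dataset_dict.json for a DatasetDict.
MARKERS = ("state.json", "dataset_dict.json")


def looks_like_saved_hf_dataset(directory) -> bool:
    """Hypothetical helper: True if the directory contains a save_to_disk marker file."""
    root = Path(directory)
    return any((root / name).is_file() for name in MARKERS)
```

A directory of raw files, which is presumably what Puree provides, would fail this check, and that is why the existing gs:// handling cannot load it.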

✔️ Solution

Add a new schema for Puree's datasets (puree://) with its own handling logic.
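A minimal sketch of what this could look like, assuming the loader picks a handler based on the URL prefix. All function names here are hypothetical placeholders, not Axolotl's actual code, and the Puree-specific loading logic is left as a stub because it isn't described in this issue.

```python
from urllib.parse import urlparse


def load_saved_hf_dataset(url: str) -> str:
    # Placeholder for the existing gs:// and gcs:// path,
    # which expects a dataset written with save_to_disk.
    return f"save_to_disk dataset at {url}"


def load_puree_dataset(url: str) -> str:
    # Placeholder: the Puree-specific handling would go here.
    return f"puree dataset at {url}"


# Map each URL prefix to the handler responsible for it.
HANDLERS = {
    "gs": load_saved_hf_dataset,
    "gcs": load_saved_hf_dataset,
    "puree": load_puree_dataset,
}


def load_remote_dataset(url: str) -> str:
    """Dispatch to a handler based on the part of the URL before '://'."""
    scheme = urlparse(url).scheme
    try:
        handler = HANDLERS[scheme]
    except KeyError:
        raise ValueError(f"unsupported dataset schema: {scheme!r}") from None
    return handler(url)
```

Keeping puree:// as a separate entry means the existing gs:// and gcs:// behaviour stays untouched, and an unknown prefix fails with a clear error instead of falling through to the save_to_disk path.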

❓ Alternatives

No response

📝 Additional Context

No response

Acknowledgements