⚠️ Please check that this feature request hasn't been suggested before.
[X] I searched previous Ideas in Discussions and didn't find any similar feature requests.
[X] I searched previous Issues and didn't find any similar feature requests.
🔖 Feature description
The current GCS dataset schema (`gs://` and `gcs://`) doesn't work for Puree, because Axolotl expects the path to point to a Hugging Face dataset saved with the `save_to_disk` method from the `datasets` library.
✔️ Solution
Add a new schema for Puree's datasets (`puree://`) with its own handling logic.
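
As a rough illustration of the idea, the prefix-based routing could look something like the sketch below. This is not Axolotl's actual loading code: the function names `classify_dataset_path` and `split_puree_path`, the returned labels, and the way a `puree://` path is split into a location and a remainder are all assumptions made for the example. Only the `gs://`, `gcs://`, and `puree://` prefixes come from this request.

```python
from urllib.parse import urlsplit

# Prefixes named in this request. Everything else below is hypothetical.
HF_SAVED_PREFIXES = {"gs", "gcs"}  # currently treated as save_to_disk output
PUREE_PREFIX = "puree"             # proposed new prefix


def classify_dataset_path(path: str) -> str:
    """Return a label for the loader a dataset path would be routed to."""
    prefix = urlsplit(path).scheme.lower()
    if prefix == PUREE_PREFIX:
        return "puree"
    if prefix in HF_SAVED_PREFIXES:
        return "hf_saved_dataset"
    return "default"


def split_puree_path(path: str) -> tuple[str, str]:
    """Split a puree:// path into (location, remainder).

    What these two parts mean for Puree is an assumption of this sketch.
    """
    parts = urlsplit(path)
    if parts.scheme.lower() != PUREE_PREFIX:
        raise ValueError(f"not a puree:// path: {path!r}")
    return parts.netloc, parts.path.lstrip("/")
```

With this in place, `gs://` and `gcs://` paths keep their current behavior, and only `puree://` paths reach the new handling logic.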
❓ Alternatives
No response
📝 Additional Context
No response
Acknowledgements
[X] My issue title is concise, descriptive, and in title casing.
[X] I have searched the existing issues to make sure this feature has not been requested yet.
[X] I have provided enough information for the maintainers to understand and evaluate this request.