kedro-org / kedro-plugins

First-party plugins maintained by the Kedro team.
Apache License 2.0

Add `safetensors` dataset #221

Open astrojuanlu opened 1 year ago

astrojuanlu commented 1 year ago

https://huggingface.co/blog/safetensors-security-audit

🐶Safetensors is a library for saving and loading tensors in the most common frameworks (including PyTorch, TensorFlow, JAX, PaddlePaddle, and NumPy).

import torch
from safetensors.torch import load_file, save_file

weights = {"embeddings": torch.zeros((10, 100))}
save_file(weights, "model.safetensors")
weights2 = load_file("model.safetensors")

Comparison with other formats: https://github.com/huggingface/safetensors#yet-another-format-

Hugging Face, EleutherAI, and Stability AI plan to shift to this format by default.

MinuraPunchihewa commented 1 day ago

Hey @astrojuanlu, can I give this a shot? I had previously commented on this issue, but I am having trouble with my Hive setup and I would like to tackle it later.

astrojuanlu commented 16 hours ago

Go ahead, @MinuraPunchihewa! Please add it as an experimental dataset.
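
For anyone picking this up, here is a rough sketch of what such a dataset could look like. It is only an illustration under a few assumptions: the class name `SafetensorsDataset` and the `backend` parameter are hypothetical, and the class deliberately does not inherit from anything so the snippet stays self-contained. A real plugin would presumably subclass Kedro's `AbstractDataset` and follow the conventions of the other experimental datasets. The only `safetensors` calls it relies on are the `save_file` and `load_file` functions that each framework submodule (`safetensors.torch`, `safetensors.numpy`, and so on) provides.

```python
import importlib
from pathlib import Path
from typing import Any, Dict

# Framework submodules of safetensors that expose save_file/load_file.
SUPPORTED_BACKENDS = ("torch", "tensorflow", "flax", "paddle", "numpy")


class SafetensorsDataset:
    """Hypothetical sketch of a safetensors-backed dataset.

    Assumption: a real implementation would subclass Kedro's AbstractDataset;
    this version only mirrors the load/save/describe hooks for illustration.
    """

    def __init__(self, filepath: str, backend: str = "torch") -> None:
        if backend not in SUPPORTED_BACKENDS:
            raise ValueError(
                f"Unsupported backend {backend!r}; expected one of {SUPPORTED_BACKENDS}"
            )
        self._filepath = Path(filepath)
        self._backend = backend

    def _module(self):
        # Import lazily so only the chosen framework needs to be installed.
        return importlib.import_module(f"safetensors.{self._backend}")

    def _load(self) -> Dict[str, Any]:
        # Returns a dict mapping tensor names to tensors of the chosen framework.
        return self._module().load_file(str(self._filepath))

    def _save(self, data: Dict[str, Any]) -> None:
        # Create the parent directory if needed, then write all tensors to one file.
        self._filepath.parent.mkdir(parents=True, exist_ok=True)
        self._module().save_file(data, str(self._filepath))

    def _exists(self) -> bool:
        return self._filepath.is_file()

    def _describe(self) -> Dict[str, Any]:
        return {"filepath": str(self._filepath), "backend": self._backend}
```

With something like this, `SafetensorsDataset("model.safetensors", backend="torch")._save(weights)` would write the `weights` dict from the example at the top of the issue. Selecting the framework by name keeps the dataset from depending on any single one of them, though remote filesystems and versioning are left out of the sketch entirely.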