We currently use Hugging Face (HF) Datasets to load datasets from the HF Hub. Their recommended loading method requires the entire dataset to fit in memory. When a dataset does not fit, our dataset controller will likely run out of memory and crash.
We have not observed this yet, but we consider it inevitable.
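One possible mitigation, sketched here as an assumption rather than a decided design, is to open the dataset in streaming mode and process it in bounded batches. `load_dataset(..., streaming=True)` is part of HF Datasets and returns an iterable that fetches rows on demand; the helper names `iter_batches` and `stream_hub_dataset` are hypothetical and not part of our controller.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of at most batch_size rows, holding one batch at a time."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(rows)
    while True:
        # islice pulls only the next batch_size rows from the source.
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def stream_hub_dataset(name: str, split: str = "train"):
    """Open a Hub dataset lazily instead of loading it all at once.

    Hypothetical helper. The import is local so this module still loads
    where the `datasets` package is not installed.
    """
    from datasets import load_dataset  # Hugging Face Datasets

    # streaming=True returns an iterable dataset that reads rows on demand.
    return load_dataset(name, split=split, streaming=True)
```

The controller could then consume `iter_batches(stream_hub_dataset(name), batch_size)` so that memory use is bounded by the batch size rather than the dataset size. Streaming gives up random access, so whether it suits the controller depends on how it reads the data.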