Open AugustDev opened 2 days ago
Hi @AugustDev, sorry that it failed at ~80%.
Btw, were you using `use_checkpoint = True`? It can help you in case of any failure.
As for `The config isn't consistent between chunks`: it should have printed the `config` and `data[config]` that mismatched. If the logs are still available, can you check what caused the mismatch?
Yes, LitData encodes each leaf of the pytree as a single object, so it doesn't know that this is a single sample.
You can convert it to a numpy array or torch tensor directly to tell LitData that this is a single item and not a list of items.
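To illustrate the point about leaves, here is a minimal sketch. It uses a simplified leaf counter written for this example, not LitData's actual flattening code, and `count_leaves` and `as_single_items` are hypothetical helper names. It shows why a plain Python list looks like several items while a numpy array looks like one:

```python
import numpy as np


def count_leaves(tree):
    """Simplified pytree model: dicts, lists and tuples are containers,
    everything else (including a numpy array) is a single leaf."""
    if isinstance(tree, dict):
        return sum(count_leaves(v) for v in tree.values())
    if isinstance(tree, (list, tuple)):
        return sum(count_leaves(v) for v in tree)
    return 1


def as_single_items(sample):
    """Hypothetical helper: convert every list/tuple column of a sample
    into one numpy array so it is treated as a single item."""
    return {
        key: np.asarray(value) if isinstance(value, (list, tuple)) else value
        for key, value in sample.items()
    }


raw = {"signal": [0.1, 0.2, 0.3], "label": 1}
fixed = as_single_items(raw)

print(count_leaves(raw))    # 4: three list elements plus the label
print(count_leaves(fixed))  # 2: one array plus the label
```

Returning something like `fixed` from the function that builds each sample should keep each column as one item, assuming the columns are homogeneous enough to be converted to an array.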
I was processing large files and received the following error. It failed at ~80% of the data after about 1h 20min. The full error is really long, but this is the beginning of it. I'm essentially storing 5 columns, each of which is a numpy array. The arrays are of variable length.
🐛 Bug
To Reproduce
Unfortunately, I'm not sure how to reproduce this without sharing a ~100 GB dataset.
Additional context
Environment detail
- PyTorch Version: 2.4.1
- OS (e.g., Linux): Debian 11
- Lit data version: 0.2.26
- Python version: 3.10