cgebbe closed this issue 4 months ago.
It was an out-of-memory error, but I don't have the logs anymore.
I found it a bit strange that it only happened after several hours; I didn't have other tasks running.
Hey @cgebbe.
We can support this. The writer keeps track of the chunk info here: https://github.com/Lightning-AI/litdata/blob/main/src/litdata/streaming/writer.py#L253, and we already have some logic to merge the index JSON files: https://github.com/Lightning-AI/litdata/blob/26bf6b2553a0ec72ab77418988e33e4d639f6f85/src/litdata/streaming/writer.py#L395.
In fact, you could even process your dataset chunk by chunk and just combine the results at the end.
Would you be interested in trying to contribute this feature?
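For illustration, a minimal sketch of the "process by chunk, combine at the end" idea could look like the following. The `optimize` call matches litdata's public API, but the merge step assumes `index.json` is a dict with a `chunks` list and a `config` entry, and the shard sizes, paths, and `fn` are all hypothetical:

```python
import json
import os
import shutil

from litdata import optimize


def fn(index):
    # Placeholder transform; replace with your real preprocessing.
    return index, index ** 2


if __name__ == "__main__":
    inputs = list(range(1000))
    shard_size = 250  # hypothetical shard size
    shards = [inputs[i:i + shard_size] for i in range(0, len(inputs), shard_size)]

    # 1. Optimize each shard into its own output directory.
    for idx, shard in enumerate(shards):
        optimize(fn=fn, inputs=shard, output_dir=f"out/shard_{idx}", chunk_bytes="64MB")

    # 2. Combine the shards: copy the chunk files and concatenate the
    #    "chunks" lists of the per-shard index.json files (assumed layout).
    os.makedirs("out/merged", exist_ok=True)
    merged = {"chunks": [], "config": None}
    for idx in range(len(shards)):
        shard_dir = f"out/shard_{idx}"
        with open(os.path.join(shard_dir, "index.json")) as f:
            index = json.load(f)
        merged["config"] = index["config"]  # assumed identical across shards
        for chunk in index["chunks"]:
            new_name = f"shard_{idx}-{chunk['filename']}"  # avoid name collisions
            shutil.copy(os.path.join(shard_dir, chunk["filename"]),
                        os.path.join("out/merged", new_name))
            chunk["filename"] = new_name
            merged["chunks"].append(chunk)

    with open("out/merged/index.json", "w") as f:
        json.dump(merged, f)
```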
When running `optimize`, my process somehow crashed after 4h (it was estimated to take 10h). Now I have to restart it from scratch. Could you add a checkpointing feature so that it automatically continues from the last chunk?
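A coarse sketch of this checkpointing idea, building on the shard-by-shard approach suggested above: split the inputs into fixed shards and, on restart, skip any shard whose output directory already contains an `index.json`. Treating `index.json` as a completion marker, the shard size, and the paths are all assumptions, not litdata API:

```python
import os

from litdata import optimize


def fn(index):
    # Placeholder transform; replace with your real preprocessing.
    return index, index ** 2


def optimize_with_resume(inputs, out_root, shard_size=100):
    """Process `inputs` shard by shard, skipping shards that already
    finished in a previous run (coarse-grained checkpointing)."""
    shards = [inputs[i:i + shard_size] for i in range(0, len(inputs), shard_size)]
    for idx, shard in enumerate(shards):
        shard_dir = os.path.join(out_root, f"shard_{idx}")
        # Assumption: index.json only appears once a shard has completed,
        # so its presence doubles as a checkpoint marker.
        if os.path.isfile(os.path.join(shard_dir, "index.json")):
            continue
        optimize(fn=fn, inputs=shard, output_dir=shard_dir, chunk_bytes="64MB")


if __name__ == "__main__":
    optimize_with_resume(list(range(1000)), "out")
```

After a crash, rerunning the script repeats at most one shard's worth of work, since finished shards are detected and skipped.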