huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Arrow files too large when training SegFormer #32010

Open MRNOBODY-ZST opened 1 month ago

MRNOBODY-ZST commented 1 month ago

System Info

Who can help?

@amyeroberts

Information

Tasks

Reproduction

Make a custom dataset to train SegFormer. I'm using a dataset of around 33,000 images: 26,000 for training and 7,000 for validation. Once training starts, it continuously writes cache files (at around 50 MB/s) to C:/Users//.cache/huggingface/metrics/mean_io_u/default, named default_experiment-1-0.arrow. By the time it reaches 50 epochs, the file is about 707 GB. I don't know why.
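For context, the standard SegFormer fine-tuning examples wire up the mean IoU metric roughly as in the sketch below. This is only an assumption about the setup, not the reporter's actual script, and `num_labels`/`ignore_index` are placeholder values. Each `metric.compute()` call stages predictions and references through the metric's on-disk Arrow writer, which matches the `default_experiment-1-0.arrow` naming mentioned above.

```python
# Minimal sketch of a typical SegFormer compute_metrics setup
# (assumed, not copied from the reporter's script).
import evaluate
import torch
from torch import nn

# Loading the metric is what creates the Arrow cache under
# ~/.cache/huggingface/metrics/mean_io_u/default
metric = evaluate.load("mean_iou")

num_labels = 19       # assumption: depends on the dataset
ignore_index = 255    # assumption: common "void" label convention


def compute_metrics(eval_pred):
    logits, labels = eval_pred
    # Upsample logits to the label resolution before taking the argmax.
    logits_tensor = torch.from_numpy(logits)
    logits_tensor = nn.functional.interpolate(
        logits_tensor, size=labels.shape[-2:], mode="bilinear", align_corners=False
    )
    pred_labels = logits_tensor.argmax(dim=1).numpy()

    # compute() buffers predictions/references through the metric's
    # Arrow writer before reducing them to scalar scores.
    metrics = metric.compute(
        predictions=pred_labels,
        references=labels,
        num_labels=num_labels,
        ignore_index=ignore_index,
        reduce_labels=False,
    )
    # Keep only the scalar entries so the Trainer can log them.
    return {k: v for k, v in metrics.items() if isinstance(v, (int, float))}
```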

Expected behavior

Disk space is drained and the training can't be continued.

qubvel commented 1 month ago

Hi @MRNOBODY-ZST, thanks for reporting! Let me check this!

MRNOBODY-ZST commented 1 month ago

@qubvel Really appreciate this! By the way, I'm using the training script in this repo.
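For anyone hitting the same disk growth while this is investigated, one thing to experiment with (untested here and not a confirmed fix) is keeping the metric buffer in memory instead of on disk:

```python
# Possible mitigation to try (an assumption, not a verified fix for this issue):
# keep the metric's buffer in memory so nothing is written under
# ~/.cache/huggingface/metrics/.
import evaluate

metric = evaluate.load("mean_iou", keep_in_memory=True)
```

Note this trades disk for RAM, so with ~7,000 validation images the buffered predictions may still be large.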