zihuig closed this issue 2 years ago.
Hi @zihuig, I am also facing this issue. I am exposing the GPUs to Docker via --gpus all,
and it does seem to be using them, but evaluation still takes very long to run (a rough sketch of how I launch the container is below). How did you work around this?
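For reference, the container is launched roughly like this; the image name and mount paths are placeholders rather than my exact command:

```bash
# Expose all host GPUs to the container (requires the NVIDIA Container Toolkit).
# <picard-eval-image> is a placeholder for the actual image name.
docker run --rm --gpus all \
    -v "$(pwd)/database:/database" \
    <picard-eval-image> \
    nvidia-smi   # sanity check that the GPUs are visible inside the container
```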
Hi @testzer0, I cleaned the docker cache and increased the batch size to 16, which reduced the evaluation time on the Spider dev set to one hour.
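If it helps, the cleanup was along these lines (standard Docker commands; the batch-size field name depends on which eval config you use, so treat it as an example rather than the exact setting):

```bash
# Remove unused images, stopped containers, and dangling build cache.
docker system prune -a
docker builder prune

# Then raise the eval batch size in the eval config before re-running,
# e.g. a HuggingFace-style field such as:
#   "per_device_eval_batch_size": 16
```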
Thanks a lot! That worked.
@zihuig I have run into the same issue. Eval takes many hours to finish. I tried increasing the eval batch size to 4 on my TITAN RTX (24 GB), but it hit a CUDA out-of-memory error at about 30%. It seems the maximum batch size I can set is 2. How did you clean up the Docker cache, and how were you able to set the batch size to 16 without running into CUDA memory issues?
Hi @tscholak, sorry to bother you. I am trying to use PICARD (on one Tesla V100 16GB) with a T5-large model, but it seems unreasonably slow: validating a single example takes about 90s. Here's my screenshot.
and the config I use is as follows:
Is there any way to make it faster? (The PICARD paper mentions that the decoding speed is 3.1 seconds per sample.) Thank you very much.