dhlab-epfl / dhSegment

Generic framework for historical document processing
https://dhlab-epfl.github.com/dhSegment
GNU General Public License v3.0

Taking too much time in training #42

Open CS-savvy opened 5 years ago

CS-savvy commented 5 years ago

Training takes 2 hours per epoch on 2,300 images with a batch size of 1, but you mention that the page detection model, trained on 1,600 images for 30 epochs, took only 4 hours in total.

Can someone tell me the reason?
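
To put the gap in numbers, here is a rough back-of-the-envelope comparison using only the figures quoted above. The helper name `seconds_per_image` is just illustrative, and the calculation assumes every image is processed once per epoch and ignores differences in image size or hardware.

```python
def seconds_per_image(hours: float, epochs: int, images: int) -> float:
    """Average wall-clock seconds spent per training image."""
    return hours * 3600 / (epochs * images)


# My run: 2 hours for a single epoch over 2300 images.
mine = seconds_per_image(hours=2, epochs=1, images=2300)

# Reported run: 4 hours for 30 epochs over 1600 images.
reported = seconds_per_image(hours=4, epochs=30, images=1600)

slowdown = mine / reported
print(f"mine: {mine:.2f} s/image, reported: {reported:.2f} s/image, "
      f"ratio: {slowdown:.1f}x")
```

So my run spends roughly 3.1 s per image against about 0.3 s for the reported one, which is around a 10x difference.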

solivr commented 5 years ago

Can you give your GPU specs?

CS-savvy commented 5 years ago

Thanks for replying, @solivr. I am training the dhSegment model on an Azure VM with a Tesla K80 GPU (11 GB of GPU memory, NVIDIA driver 390.116), 6 vCPUs, and 56 GB of RAM.

https://www.techpowerup.com/gpu-specs/tesla-k80.c2616

GPU - gpu_config

CPU - cpu_config

jalvathi commented 3 years ago

@solivr @CS-savvy are there any updates here? My instances are crashing because of memory, even though I use 4 GPUs with 16 GB each. Can anyone suggest an improvement?