walkwithfastai / walkwithfastai.github.io

Host for https://walkwithfastai.com

Running models takes hours #19

Closed craine closed 3 years ago

craine commented 3 years ago

Since the issue yesterday and today, my models are taking 4 hours an epoch just running resnet50 on an image set. It used to take about 4 minutes. I'm running with a GPU and high RAM on Colab, and I've verified the GPU is running. Is something up?

muellerzr commented 3 years ago

I can't do much without a reproducer, please provide one.

craine commented 3 years ago

https://colab.research.google.com/drive/1OAftYeIC8iInzSCwYfuZEx9z83hZXomq?usp=sharing

muellerzr commented 3 years ago

Try moving your data out of Google Drive. For the last few months, Google Drive has throttled access to data stored on it, which can definitely slow down your training.
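One way to check whether reading from Drive is the bottleneck is to time raw file reads from the data folder and compare against a local copy. The following is only a minimal sketch using the standard library; `read_throughput` is a hypothetical helper, and the path in the usage comment is a placeholder, not the path from the notebook above.

```python
import time
from pathlib import Path


def read_throughput(folder, max_files=200):
    """Read up to max_files files under folder.

    Returns (files_read, bytes_read, seconds_elapsed).
    Hypothetical helper for illustration only.
    """
    # Collect regular files first so the directory listing is not timed.
    files = [p for p in Path(folder).rglob("*") if p.is_file()][:max_files]
    total_bytes = 0
    start = time.perf_counter()
    for path in files:
        total_bytes += len(path.read_bytes())
    elapsed = time.perf_counter() - start
    return len(files), total_bytes, elapsed


# Usage (placeholder path, adjust to your own dataset):
# n, nbytes, secs = read_throughput("/content/drive/MyDrive/my_dataset")
# print(f"{n} files, {nbytes / 1e6:.1f} MB in {secs:.1f} s")
```

If the same folder reads far faster from local disk than from the Drive mount, the slowdown is likely in data access rather than in the model or the GPU.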

craine commented 3 years ago

I'll test out vanilla fastai to see if it's still the same. Would that be helpful? I've used this setup for almost a year and this is the first time I've ever seen this.

muellerzr commented 3 years ago

That would indeed be helpful.

craine commented 3 years ago

Appears to be the same with vanilla.

muellerzr commented 3 years ago

Yeah, that's not a fastai (or wwf) issue, just Google and their security. Not sure why, for some folks, it takes quite a while for this throttling to kick in.

craine commented 3 years ago

Just curious.. what do you use for your setup? Kaggle throttles my GPU too.. so it's a huge pain all the way around.

muellerzr commented 3 years ago

Local when I can, or Colab. If I'm using something that has a large amount of data in Drive, I bring it to /content if possible.
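A rough sketch of bringing data over to /content before training, assuming Drive is already mounted in the Colab session. `copy_to_local` is a hypothetical helper built on the standard library's `shutil.copytree`, and both paths in the usage comment are placeholders.

```python
import shutil
from pathlib import Path


def copy_to_local(src, dst):
    """Copy the directory tree at src to dst and return dst as a Path.

    Hypothetical helper for illustration; requires Python 3.8+ for
    dirs_exist_ok, which lets the copy be re-run on an existing dst.
    """
    src, dst = Path(src), Path(dst)
    if not src.is_dir():
        raise FileNotFoundError(f"source folder not found: {src}")
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst


# Usage (placeholder paths, adjust to your own dataset):
# local_data = copy_to_local("/content/drive/MyDrive/my_dataset",
#                            "/content/my_dataset")
# Then point the DataLoaders at local_data instead of the Drive path.
```

The copy itself still reads from Drive once, so it can take a while, but every epoch after that reads from the VM's local disk.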

muellerzr commented 3 years ago

In your case though, since most Kaggle competitions have hundreds of gigs of data, I train on the Kaggle GPUs when I'm not training locally.