torrvision / crfasrnn

This repository contains the source code for the semantic image segmentation method described in the ICCV 2015 paper: Conditional Random Fields as Recurrent Neural Networks. http://crfasrnn.torr.vision/

GPU train #66

Open ys404314023 opened 8 years ago

ys404314023 commented 8 years ago

Hello! Thank you very much for sharing the source code for us to learn from. I can train on my own data with your source code on the CPU, but it is too slow, so I want to train on the GPU. I referred to https://github.com/hyenal/crfasrnn, but there are a few problems that I can't solve.

So I would like to ask whether there is a demo showing how to train on the GPU. Thank you very much!

ys404314023 commented 8 years ago

Also, I would like to confirm: does the version you currently provide not support GPU training?

I trained on my data with a GTX 1080 (8 GB memory) and got an out-of-memory error. Thank you!

thuanvh commented 8 years ago

In my experience, you need to reduce the size of some layers, for example from 512x512 to 256x256.
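
As a rough illustration of why this helps: the memory needed to store a layer's output grows with height times width, so halving each side cuts it to about a quarter. The sketch below is only back-of-the-envelope arithmetic for a single float32 blob. The channel count of 64 is a made-up value for illustration and is not taken from the CRF-RNN network definition, and real training uses more memory than this (gradients, other layers, the CRF iterations).

```python
def blob_bytes(num, channels, height, width, bytes_per_value=4):
    """Bytes needed to store one blob of shape (num, channels, height, width).

    bytes_per_value=4 assumes float32 storage.
    """
    return num * channels * height * width * bytes_per_value


# Hypothetical layer: batch size 1, 64 channels (illustrative value only).
full = blob_bytes(1, 64, 512, 512)
half = blob_bytes(1, 64, 256, 256)

print(f"512x512: {full / 2**20:.0f} MiB")
print(f"256x256: {half / 2**20:.0f} MiB")
print(f"ratio:   {full / half:.0f}x")
```

The same factor of four applies to every spatially sized blob in the network, which is why shrinking the input is often the first thing to try after an out-of-memory error.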

ys404314023 commented 8 years ago

Hi, I want to train on my own data with your code, but I have run into some difficulties that I can't solve.

I resized my images to 300*300 pixels, then converted them to LMDB with https://github.com/martinkersner/train-CRF-RNN. Then I trained with caffe/example/segmentationcrfasrnn/TVG_CRFRNN_new_traintest. Training runs on the GPU, but the results are still wrong after about 180000 iterations.

Why are the results wrong? Am I using the wrong net file, or is the number of iterations not enough?

Thank you!
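
One thing that may be worth checking in the 300*300 resizing step described above (this is an assumption about a possible cause, not a confirmed diagnosis): if the label images are resized with a smoothing interpolation, the class IDs along object boundaries get blended into values that do not correspond to any class. Nearest-neighbour resizing avoids this because it only copies existing values. The helper `resize_nearest` below is a hypothetical pure-Python sketch for illustration and is not part of crfasrnn or train-CRF-RNN.

```python
def resize_nearest(label_map, new_h, new_w):
    """Resize a 2D list of class IDs with nearest-neighbour sampling.

    Every output pixel is copied from one input pixel, so no new
    (invalid) class IDs can appear in the result.
    """
    old_h = len(label_map)
    old_w = len(label_map[0])
    return [
        [label_map[(r * old_h) // new_h][(c * old_w) // new_w]
         for c in range(new_w)]
        for r in range(new_h)
    ]


# Toy 4x4 label map with class IDs 0, 7, 15 and 255 (arbitrary example values).
labels = [
    [0, 0, 15, 15],
    [0, 0, 15, 15],
    [7, 7, 255, 255],
    [7, 7, 255, 255],
]

small = resize_nearest(labels, 2, 2)
print(small)
```

Comparing the set of values in the resized labels with the set in the originals is a quick way to confirm that no new class IDs were introduced.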

KleinYuan commented 7 years ago

@bittnt Hey, what kind of machine did you use for training in the paper? I cannot find that information. We are trying to train a model on a GPU and are wondering what kind of machine we need, especially in terms of memory size. Running the example easily takes more than 15 GB of memory on the CPU, and our training run got killed midway on a 980 Ti.

I would really appreciate any suggestions.

ys404314023 commented 7 years ago

Maybe you can reduce the batch_size.

KleinYuan commented 7 years ago

@ys404314023 But for segmentation, the batch size is already 1.

KleinYuan commented 7 years ago

@ys404314023 Thanks for your help. I eventually figured it out and have organized some documentation here. It is essentially a demo and set of notes for training on the GPU. It works for me, and I hope it helps you.