How to enhance the speed of pre-train step

albertbup / deep-belief-network

A Python implementation of Deep Belief Networks built upon NumPy and TensorFlow with scikit-learn compatibility

MIT License


How to enhance the speed of pre-train step #23

Closed: mingliking closed this issue 6 years ago

mingliking commented 7 years ago

Hi, I tried your DBN example and it is quite cool, since it is the only successful example of a DBN model I have found so far. However, because the pre-training is done using NumPy, the pre-train step takes a long time when the input is large. Any ideas on how to speed it up?

albertbup commented 7 years ago

Hi, thanks for your comments.

In principle, the pre-train step should be done using TensorFlow if you use the classes from the dbn.tensorflow.models module. Why do you suspect that is not the case?

Albert

mingliking commented 7 years ago

@albertbup Hi, thanks for your reply. I did use dbn.tensorflow.models and monitored gpustat the whole time. However, the GPU is not used during the pre-train step; only the fine-tuning part shows GPU usage. I also saw the last comment in https://github.com/albertbup/deep-belief-network/issues/7, which I think describes the same issue I have here.

albertbup commented 7 years ago

I see. We can try to force TensorFlow to compute on the GPU (see the "Manual Device Placement" section), but I'll need your help to test it since I don't have a GPU. What do you think?

mingliking commented 7 years ago

@albertbup Sure. I tried it just now and I am a bit confused about where I should add the manual device placement code. You have written so many classes that I am not sure where to put code such as with tf.device('/gpu:0'). Also, I noticed that you have written a class called "BaseBinaryRBM" with NumPy, and that this class is used by other classes such as BinaryRBM and UnsupervisedDBN. Is that part of the reason why the GPU is not used during pre-training?

albertbup commented 7 years ago

Hey, I think you need to add with tf.device("/device:GPU:0"): at the very beginning of the method _build_model(self, weights=None) of the class BinaryRBM, in the file for the dbn.tensorflow.models module.
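As a rough illustration of what manual device placement looks like in general (this is not the project's _build_model code), the sketch below pins a small matrix product to a device. The function name matmul_on is hypothetical. It assumes the tensorflow.compat.v1 API, which is newer than the TensorFlow 1.3 mentioned in this thread; on such old versions a plain import tensorflow as tf would be the likely equivalent. If TensorFlow is not installed, the function simply returns None.

```python
try:
    # Assumption: TF 1.14+ or 2.x, where the 1.x graph API lives in compat.v1.
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()
except ImportError:
    tf = None  # TensorFlow not available; the demo below becomes a no-op.


def matmul_on(device="/device:GPU:0"):
    """Run a tiny matmul with its ops pinned to `device`.

    Returns the product as a NumPy array, or None if TensorFlow is missing.
    """
    if tf is None:
        return None

    graph = tf.Graph()
    with graph.as_default():
        # Every op created inside this block is requested on `device`.
        with tf.device(device):
            a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
            identity = tf.constant([[1.0, 0.0], [0.0, 1.0]])
            product = tf.matmul(a, identity)

    config = tf.ConfigProto(
        allow_soft_placement=True,   # fall back to CPU if the device is absent
        log_device_placement=True,   # log which device each op actually ran on
    )
    with tf.Session(graph=graph, config=config) as sess:
        return sess.run(product)


if __name__ == "__main__":
    print(matmul_on())
```

With log_device_placement enabled, the session log shows where each op ended up, which is one way to confirm whether the GPU is really being used rather than relying on gpustat alone.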

Regarding the second question, I don't think what you mention is the source of the problem: UnsupervisedDBN has a field called self.rbm_class that stores which RBM class will be used to build the stack of RBMs. Notice that the TensorFlow UnsupervisedDBN does use the BinaryRBM implemented in TensorFlow; in contrast, the NumPy UnsupervisedDBN stores the NumPy BinaryRBM class.
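The rbm_class idea can be sketched in plain Python. All class names below are hypothetical stand-ins, not the project's actual classes; the sketch only shows how storing a class in a field lets the same stack-building code produce NumPy-backed or TensorFlow-backed layers.

```python
class NumpyRBMSketch:
    """Stand-in for an RBM implemented with NumPy (hypothetical)."""
    backend = "numpy"

    def __init__(self, n_hidden_units):
        self.n_hidden_units = n_hidden_units


class TensorFlowRBMSketch:
    """Stand-in for an RBM implemented with TensorFlow (hypothetical)."""
    backend = "tensorflow"

    def __init__(self, n_hidden_units):
        self.n_hidden_units = n_hidden_units


class NumpyDBNSketch:
    """Builds a stack of RBMs from whatever class rbm_class points to."""

    def __init__(self, hidden_layers_structure=(100, 100)):
        self.hidden_layers_structure = hidden_layers_structure
        self.rbm_class = NumpyRBMSketch  # the field that selects the backend

    def build_stack(self):
        # One RBM per hidden layer, all created from the stored class.
        return [self.rbm_class(n) for n in self.hidden_layers_structure]


class TensorFlowDBNSketch(NumpyDBNSketch):
    """Same stack-building logic, but with the TensorFlow-backed RBM."""

    def __init__(self, hidden_layers_structure=(100, 100)):
        super().__init__(hidden_layers_structure)
        self.rbm_class = TensorFlowRBMSketch


if __name__ == "__main__":
    for dbn in (NumpyDBNSketch((256, 128)), TensorFlowDBNSketch((256, 128))):
        print(type(dbn).__name__, [rbm.backend for rbm in dbn.build_stack()])
```

Under this pattern, sharing a NumPy base class does not by itself force the pre-training onto NumPy; what matters is which class the field holds when the stack is built.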

mingliking commented 6 years ago

@albertbup I tried your suggestion but failed.

ghost commented 6 years ago

In my case, the GPU was not being used because the requirements file listed the CPU version of TensorFlow rather than the GPU version. Changing tensorflow to tensorflow-gpu in requirements.txt and reinstalling the requirements solved the problem. Also, a higher number of hidden neurons tends to increase GPU utilization.
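A quick way to check which of the two packages is present in the current environment is to ask the installed package metadata. This is a minimal sketch using only the standard library (Python 3.8+); the helper name installed_tensorflow_packages is hypothetical.

```python
from importlib import metadata


def installed_tensorflow_packages(names=("tensorflow", "tensorflow-gpu")):
    """Return {package name: version} for each listed package that is installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment; skip it.
            pass
    return found


if __name__ == "__main__":
    packages = installed_tensorflow_packages()
    if not packages:
        print("No TensorFlow package found in this environment.")
    for name, version in packages.items():
        print(f"{name}=={version}")
```

If only tensorflow shows up in an environment like the one described in this thread, the CPU-only build is the likely reason the GPU stays idle.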

mingliking commented 6 years ago

@pallabiiitg Sorry, I am a bit confused by your solution. What do you mean by "installing requirements.txt"? Should I install tensorflow-gpu version 1.0.0? I am currently using tensorflow-gpu 1.3.0.

albertbup commented 6 years ago

I wasn't aware of the tensorflow-gpu package. That's good to know, because the project's requirements use the non-GPU version by default. Given that, I'll need to think of a way to let users choose which version they want to install.

albertbup commented 6 years ago

@mingliking I have pushed a new branch that lists the tensorflow-gpu package in the requirements. As described in the updated README, you can now install my package with:

pip install git+git://github.com/albertbup/deep-belief-network.git@master_gpu

Hope it helps!