CoinCheung opened this issue 6 years ago
This sadly doesn't work out of the box in TensorFlow; you will need to adjust the code quite a bit for this to work. For example, you could start by taking a look at this example. This is not something we are planning to do, though.
It seems that the model only runs on a single GPU no matter how many GPUs are available. If the model takes up more memory than a single GPU provides, there is an OOM error. I can train the model on a single GPU with the default configuration, but once I double the batch size and use two GPUs, I get OOM errors. How can I use multiple GPUs in this case?
I have the same problem. Did you solve it?
@chris20181220 Yes, I reimplemented it with PyTorch, and my implementation supports multi-GPU training.
@CoinCheung If I need to use TF, do you know how to fix it?
@chris20181220 As the author said, it would be quite tedious and a lot of code would need to be modified; I do not think I can do it now. Sorry I cannot help.
@CoinCheung OK, thank you all the same. I will try to modify it.
You should also be aware that there is the question of how to do the triplet mining in the batch: mine on each GPU's sub-batch independently, or gather all batch outputs on one fixed GPU and mine in the full combined batch there. There are trade-offs, and it's not clear which is best.
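For concreteness, here is a minimal, purely illustrative PyTorch sketch of the two mining placements. It is not taken from this repo or from the linked re-implementation; `batch_hard_triplet_loss` is a generic batch-hard formulation with a hard margin, used only as a placeholder for whatever loss you actually train with:

```python
# Illustrative sketch only: two ways to place the batch-hard mining when
# training with torch.distributed (one process per GPU).
import torch
import torch.distributed as dist

def batch_hard_triplet_loss(emb, labels, margin=0.2):
    # Pairwise Euclidean distances between all embeddings in the batch.
    dists = torch.cdist(emb, emb)
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    # Hardest positive: farthest sample sharing the anchor's label.
    pos = (dists * same.float()).max(dim=1).values
    # Hardest negative: closest sample with a different label.
    neg = dists.masked_fill(same, float('inf')).min(dim=1).values
    return torch.clamp(pos - neg + margin, min=0).mean()

# Option A: mine on each GPU's sub-batch independently.
def local_mining_loss(emb, labels):
    return batch_hard_triplet_loss(emb, labels)

# Option B: gather every GPU's embeddings and mine on the combined batch.
def global_mining_loss(emb, labels):
    world = dist.get_world_size()
    emb_all = [torch.zeros_like(emb) for _ in range(world)]
    lab_all = [torch.zeros_like(labels) for _ in range(world)]
    dist.all_gather(emb_all, emb)
    dist.all_gather(lab_all, labels)
    # all_gather output does not carry gradients; put the local, grad-carrying
    # embeddings back in so this GPU's samples still receive gradients.
    emb_all[dist.get_rank()] = emb
    return batch_hard_triplet_loss(torch.cat(emb_all), torch.cat(lab_all))
```

Note the usual caveat with the gathered variant: `all_gather` does not backpropagate into the other GPUs' embeddings, so each GPU only gets gradients for its own samples, although they are mined against the full combined batch.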
Note: I have linked your re-implementation in our README as it could be useful for others. Let me know if you don't want this.
Also keep in mind what happens with the batch normalization. When you split the batch, it could pay off to specifically split it into two P×(K/2) batches instead of two (P/2)×K batches, unless you sync your batch normalization across GPUs.
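To make the two splits concrete, here is a small hypothetical helper (none of these names come from the repo), assuming the batch is ordered identity-major, i.e. the K images of each identity are contiguous:

```python
import numpy as np

def split_pk_batch(indices, P, K, num_gpus=2):
    """Split a PxK batch's indices across GPUs in the two ways discussed above."""
    idx = np.asarray(indices).reshape(P, K)
    # P x (K/num_gpus): every GPU keeps all P identities, with K // num_gpus images each.
    split_along_k = [idx[:, g::num_gpus].reshape(-1) for g in range(num_gpus)]
    # (P/num_gpus) x K: every GPU gets only P // num_gpus identities, with all K images.
    split_along_p = [idx[g::num_gpus, :].reshape(-1) for g in range(num_gpus)]
    return split_along_k, split_along_p
```

With the first split, each GPU's sub-batch still contains every identity, which is what the comment above recommends unless batch normalization is synchronized across GPUs.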