datamllab / rlcard

Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
http://www.rlcard.org
MIT License
2.87k stars 619 forks

How to replace the DMC operating device? #289

Open JocyeI opened 1 year ago

JocyeI commented 1 year ago

I switched the DMC training device to CUDA, and now the network output is all zeros. As a result, the chosen action looks like a random selection. How can I keep the DMC network from outputting all zeros when I train with CUDA?

JocyeI commented 1 year ago

tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], device='cuda:0', grad_fn=)

JocyeI commented 1 year ago

@aetheryang

daochenzha commented 1 year ago

@Walhalla-Summary That is weird. How did you change it to CUDA?

JocyeI commented 1 year ago

That's not the point. When I train with CUDA, every predicted output of the neural network is exactly 0 instead of a varied floating-point value, which seems strange to me. Is there any way to solve this problem, @daochenzha?

daochenzha commented 1 year ago

@Walhalla-Summary How did you change it to CUDA training? More context is needed to reproduce the error.
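
As a rough illustration of the kind of context that helps reproduce a report like this, here is a minimal PyTorch sketch that records which device the parameters, input, and output live on, and whether the output or the weights themselves are all zeros. It is not taken from RLCard: `diagnose_output` is a hypothetical helper, and `net` is a small stand-in network, not the DMC model. The sketch falls back to the CPU when CUDA is unavailable.

```python
import torch
import torch.nn as nn


def diagnose_output(model: nn.Module, obs: torch.Tensor) -> dict:
    """Collect basic facts that help explain an all-zero network output.

    Hypothetical helper for illustration; not part of RLCard.
    """
    params = list(model.parameters())
    with torch.no_grad():
        out = model(obs)
    return {
        # Devices holding the weights; more than one entry means a partial move.
        "param_devices": sorted({str(p.device) for p in params}),
        "input_device": str(obs.device),
        "output_device": str(out.device),
        # True when every predicted value is exactly zero.
        "output_all_zero": bool((out == 0).all()),
        # True when the weights themselves are zero (e.g. never initialized or synced).
        "params_all_zero": all(bool((p == 0).all()) for p in params),
        # False when any weight is NaN or infinite.
        "params_finite": all(bool(torch.isfinite(p).all()) for p in params),
    }


# Use the GPU when present, otherwise fall back to the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
torch.manual_seed(0)

# Stand-in network (an assumption for this sketch), scoring 11 candidate actions.
net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1)).to(device)
obs = torch.randn(11, 8, device=device)

report = diagnose_output(net, obs)
print(report)
```

If `params_all_zero` comes back `True`, the zeros originate in the weights rather than in the forward pass, which points toward how the model was created, loaded, or shared across devices. Posting a report like this together with the exact change made to enable CUDA gives a concrete starting point for reproduction.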