-
Hi Werner, I've really enjoyed tinkering with the codebase as I learn all aspects of MuZero. I see in the MuZero paper they describe how they mask the policy logits to allowable moves in the root nod…
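The masking described in the paper can be sketched in a few lines. This is a minimal NumPy illustration (not the repo's actual code): illegal actions get a logit of `-inf` before the softmax, so the root prior assigns them exactly zero probability. The function name `masked_policy` and the `legal_actions` boolean mask are assumptions for the example.

```python
import numpy as np

def masked_policy(logits, legal_actions):
    """Zero out illegal moves: set their logits to -inf before the softmax,
    so the resulting root prior puts no mass on them."""
    masked = np.where(legal_actions, logits, -np.inf)
    # Subtract the max over legal actions for numerical stability.
    exp = np.exp(masked - masked[legal_actions].max())
    return exp / exp.sum()

logits = np.array([1.0, 2.0, 0.5, 3.0])
legal = np.array([True, False, True, False])
prior = masked_policy(logits, legal)  # prior[1] and prior[3] are exactly 0
```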
-
![image](https://user-images.githubusercontent.com/78921582/124135808-294cbb00-da52-11eb-97b4-1bc20f088567.png)
### What is the problem?
After a CUDA + NVIDIA driver change, I started to see this problem:
…
-
## Description
During training on the GPU I observe:
- a 4 MiB GPU memory leak per epoch (appears constant)
- a duration increase of about 1 min per epoch (appears linear)
### Expected Behavior
no memory…
-
Recently, I finished reading the code in this repo, and I found that the entropy bonus on a state value in SAC is only added at the last output step.
This implementation makes me wonder:
If t…
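To make the distinction concrete, here is a toy n-step target computed both ways: entropy added only at the final bootstrap step (as the issue describes) versus at every step (as in the standard soft value recursion). All numbers, and the choice `gamma = 1`, are made up purely for illustration; this is not the repo's code.

```python
# alpha: entropy temperature, H[t]: per-step policy entropy,
# r[t]: rewards, v: bootstrapped value estimate at the horizon.
alpha, gamma = 0.2, 1.0
r = [1.0, 1.0, 1.0]
H = [0.5, 0.5, 0.5]
v = 2.0

# Entropy bonus only at the last output step:
target_last_only = sum(gamma**t * r[t] for t in range(3)) \
    + gamma**3 * (v + alpha * H[-1])

# Entropy bonus at every step of the rollout:
target_every_step = sum(gamma**t * (r[t] + alpha * H[t]) for t in range(3)) \
    + gamma**3 * v
```

With these numbers the two targets differ (5.1 vs 5.3), and the gap grows with the rollout length, which is why the placement of the bonus matters.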
-
I have been using OpenSpiel for my project, and I have found the string representations of the states helpful, but they do have some limitations. I have been building visualization tools to use with OpenSpiel …
-
### Proposal
Include the Hutter Prize corpus ([enwik9](http://mattmahoney.net/dc/enwik9.zip)) as a "game" for the purpose of sample-efficient reinforcement language modeling.
### Motivation
…
-
There seems to be a bug on Windows 10 with CUDA devices: `torch.nn.DataParallel(model)` will move the model's parameters and buffers to the GPU even if `selfplay_device = 'cpu'`. If you move the model to cp…
-
-
I tried to train a game with a resnet as my network. It was extremely slow on a computer with a 5950X and an RTX 3090 (about 1 step per 2-3 seconds). I then tried decreasing the number of resblocks to 1. It he…
-
As mentioned in the MuZero paper, revisiting past time steps to re-execute the MCTS and update policy 'targets' (child_visits) can help improve sample efficiency. Is there a reason (other than computa…
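The reanalyze idea from the paper can be sketched as a small loop over the replay buffer. This is only an illustration of the mechanism, not the repo's implementation: `rerun_mcts` is a hypothetical stand-in for a fresh search with the latest network, and the buffer layout (`observations`, `child_visits`) is assumed for the example.

```python
import random

def rerun_mcts(observation, network_params):
    # Hypothetical stand-in for re-executing MCTS with the latest network;
    # here it just returns dummy normalized visit counts over 4 actions.
    visits = [random.random() for _ in range(4)]
    total = sum(visits)
    return [v / total for v in visits]

def reanalyze(replay_buffer, network_params, fraction=0.5):
    """Revisit a fraction of stored games and overwrite their stale
    policy targets (child_visits) with targets from a fresh search."""
    n = int(len(replay_buffer) * fraction)
    for game in random.sample(replay_buffer, n):
        for t, obs in enumerate(game["observations"]):
            game["child_visits"][t] = rerun_mcts(obs, network_params)
    return replay_buffer
```

In the paper this trades extra search compute for fresher targets, which is presumably the cost being weighed in the question above.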