-
https://github.com/werner-duvaud/muzero-general/blob/4d541626a2d1ace2e3bdf30d25d9a843e4cb613c/replay_buffer.py#L79
Shouldn't this be:
`position_probs = numpy.array(game_history.priorities[:-1]) / …
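For context, a minimal sketch of what the proposed fix would look like: dropping the last priority so the sampling probabilities line up with the positions that have a target, then normalizing for `numpy.random.choice`. The `priorities` array here is an illustrative stand-in for `game_history.priorities`, not the repo's actual data:

```python
import numpy

# Hypothetical stand-in for game_history.priorities: one priority per stored step.
priorities = numpy.array([0.5, 1.0, 2.0, 0.5])

# Proposed fix: drop the final entry, then normalize so the
# probabilities sum to 1 as numpy.random.choice requires.
position_probs = priorities[:-1] / numpy.sum(priorities[:-1])
position = numpy.random.choice(len(position_probs), p=position_probs)
```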
-
https://github.com/werner-duvaud/muzero-general/blob/ecca75c8d5893048b0acc6c5897a504c6334b871/models.py#L137
Since you are predicting distributions, the first reward should be scalar_to_support(0…
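For reference, a sketch of the kind of `scalar_to_support` transform MuZero uses: two-hot encoding a scalar onto a categorical support over the integers in [-support_size, support_size]. This is an illustrative reimplementation (it omits the paper's h(x) scaling step), not the repo's exact function:

```python
import torch

def scalar_to_support(x, support_size):
    # Two-hot encode a batch of scalars onto a categorical support
    # over the integers [-support_size, support_size].
    x = torch.clamp(x, -support_size, support_size)
    floor = x.floor()
    prob = x - floor  # weight placed on the upper neighbor bin
    support = torch.zeros(x.shape[0], 2 * support_size + 1)
    support.scatter_(1, (floor + support_size).long().unsqueeze(1), (1 - prob).unsqueeze(1))
    # Clamp the upper index so x == support_size does not fall off the support;
    # scatter_add_ keeps the lower-bin weight intact in that edge case.
    upper = torch.clamp(floor + support_size + 1, max=2 * support_size).long()
    support.scatter_add_(1, upper.unsqueeze(1), prob.unsqueeze(1))
    return support
```

With this encoding, `scalar_to_support(torch.tensor([0.0]), size)` puts all mass on the center bin, which is what a zero first reward would look like.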
-
In Appendix G of the muzero paper, they define the priority of a sample as p_i = | nu_i - z_i |, and write "nu is the search value and z the observed n-step return." (I'll use "nu" in place of ν for …
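The Appendix G definition can be sketched directly; `root_values` and `target_values` below are hypothetical names for the per-step search values and n-step returns, not the repo's actual fields:

```python
import numpy

def compute_priorities(root_values, target_values):
    # Appendix G: p_i = |nu_i - z_i|, where nu_i is the MCTS root (search)
    # value at step i and z_i is the observed n-step return for that step.
    return numpy.abs(numpy.array(root_values) - numpy.array(target_values))
```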
-
Hi,
I would like to ask whether TensorFlow will also be supported, as per the initial README (https://github.com/werner-duvaud/muzero-general/commit/d8388353cd37242efcdb7ff36680fc7059ecff6c#diff-04c6e90f…
-
At this line (https://github.com/werner-duvaud/muzero-general/blob/master/models.py#L125), it should be
`next_encoded_state - min_next_encoded_state`
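For context, a sketch of the hidden-state scaling the MuZero appendix describes, with the proposed fix applied (subtracting the per-sample minimum rather than the maximum). Names are illustrative, not the repo's exact code:

```python
import torch

def normalize_encoded_state(encoded_state):
    # Scale each sample's hidden state to [0, 1], as in the MuZero
    # appendix: (s - min(s)) / (max(s) - min(s)).
    flat = encoded_state.view(encoded_state.shape[0], -1)
    s_min = flat.min(dim=1, keepdim=True)[0]
    s_max = flat.max(dim=1, keepdim=True)[0]
    scale = s_max - s_min
    scale[scale < 1e-5] += 1e-5  # avoid division by zero on constant states
    normalized = (flat - s_min) / scale  # the fix: subtract the min, not the max
    return normalized.view_as(encoded_state)
```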
-
Please support Windows, thanks.
-
Thank you very much for a comprehensive implementation.
I ran Breakout with the current configuration, except for changing the number of actors from 350 to 4, since I ran
into memory problems with Ray. I am usin…
-
In https://github.com/werner-duvaud/muzero-general/blob/283e3538485be0e36ef77f402249666f735f5278/self_play.py#L262 you essentially assume actions are taken by players in alternating order for two-play…
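The assumption being questioned can be made concrete: inferring whose turn it is from the step index, which only holds when players strictly alternate. Names here are illustrative, not the repo's actual API:

```python
def to_play_assumed(step_index, num_players=2):
    # Infers the player to move from the step index alone.
    # This only works for strictly alternating turn orders; games where a
    # player can move twice in a row (e.g. after a capture rule) break it,
    # so the environment itself should report the player to move instead.
    return step_index % num_players
```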
-
I think the way you transform the value/reward slightly mismatches the original paper at this line (https://github.com/werner-duvaud/muzero-general/blob/fe791e8651645ea05f5b582157b4892588ee56ca/trai…
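For reference, a sketch of the invertible transform from the paper's appendix, which the line in question should presumably match; this is an illustrative reimplementation, not the repo's code:

```python
import math

def scalar_transform(x, eps=0.001):
    # Paper's value/reward scaling: h(x) = sign(x) * (sqrt(|x| + 1) - 1) + eps * x.
    # The eps * x term keeps the transform invertible and roughly linear near 0.
    return math.copysign(1.0, x) * (math.sqrt(abs(x) + 1) - 1) + eps * x
```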