Open drallensmith opened 7 years ago
The SARSA experiments were run by collaborators, so unfortunately I don't have any information about the SARSA weights, or even the original weights. However, the code is included in the examples, so perhaps you could retrain your own.
Are you referring to a fractured domain in the sense of different states requiring different actions?
Very close-together states requiring significantly different actions; the sort of thing that multi-modular-NEAT and RBF-NEAT are meant to help handle.
I think that it will depend on the HFO scenario. In a 1v0 with low-level states & actions, I personally think similar states will have similar actions. However, with more players I think it's possible to have a much more fractured decision space.
I agree, although of course it depends on what one considers "similar" (using only "kickable" gives a much different result for a ball just within kickable distance vs just outside, for instance). (I found an interesting student paper reporting that in keepaway, HyperNEAT did better with minimal numbers of players, but NEAT did better than HyperNEAT with more players involved.)
One reason I'm interested in using high-level actions to start, however, even though they almost certainly introduce more fractures in the space, is for a fair match vs random as a feasible-infeasible test.
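To make the "kickable" example concrete, here is a minimal sketch of how a threshold feature fractures an otherwise smooth state space. This is not HFO code: `KICKABLE_RADIUS`, the action names, and `best_action` are all hypothetical and chosen only for illustration.

```python
# Hypothetical kickable radius (arbitrary units); not taken from HFO.
KICKABLE_RADIUS = 1.0


def best_action(ball_distance):
    """Toy policy: kick if the ball is within reach, otherwise approach it."""
    return "kick" if ball_distance <= KICKABLE_RADIUS else "approach"


def count_action_changes(distances):
    """Count how often the chosen action flips between neighbouring states."""
    actions = [best_action(d) for d in distances]
    return sum(a != b for a, b in zip(actions, actions[1:]))


# States that differ by only 0.01 but straddle the threshold.
near_threshold = [0.98, 0.99, 1.00, 1.01, 1.02]
# Equally close-together states far from the threshold.
far_from_threshold = [5.00, 5.01, 5.02, 5.03, 5.04]

print(count_action_changes(near_threshold))      # one flip across the boundary
print(count_action_changes(far_from_threshold))  # no flips
```

The two lists are equally "similar" by distance, yet only the first needs a change of action, which is the sense of "fractured" used above.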
Is there any information available about the SARSA weights discussed in the main HFO paper? I'm not necessarily asking whether the weights themselves are available, but rather for any statistical information that would help determine the degree to which HFO is a "fractured" learning domain. (I'm pretty certain it is to some degree, given that keepaway soccer is known to be one, but would like confirmation.)