Closed pablogranolabar closed 2 years ago
I would say that the uses for this library are pretty generic, and it should yield at least some useful output given your discrete input in the substrate. Have you tried it? And what was the outcome for your three integers? I guess you're forcing the output/input to be integers by some kind of rounding?
I've got a toy Gym environment I've been working on: a simple baccarat card game where both the observation and action spaces are discrete-valued (int wagertype and int wageramount for actions; the observation is six integer-valued cards). The agent views the previous hand of six cards, then bets on either player or banker with a corresponding amount. I don't want to retrofit an existing continuous-valued environment for this, as the agent can't make fractional bets. I'm still finishing the gameplay for this custom Gym environment; I can register it, etc. Is that the best path forward? I've been experimenting with some of the other discrete-valued environments with Levy's ES-HyperNEAT library, but it ends up casting the discrete values to floats. So I've been building this around PUREPLES, hopefully.
Thanks for the help!
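For the six-card observation described above, one way to deal with the cast to floats is to do the scaling yourself before the values reach the network. This is only a minimal sketch: `encode_cards` is a hypothetical helper, and the rank range of 1 to 13 is a guess about how the cards are encoded, not something stated in this thread.

```python
def encode_cards(cards, low=1, high=13):
    """Scale integer card ranks linearly into [-1.0, 1.0].

    `low` and `high` are assumed bounds on the card integers
    (hypothetical defaults; adjust to the real encoding).
    """
    span = high - low
    return [2.0 * (card - low) / span - 1.0 for card in cards]


# Six integer cards from the previous hand become six floats.
observation = [1, 7, 13, 4, 10, 7]
print(encode_cards(observation))
```

The environment itself stays integer-valued; only the copy of the observation fed to the network is scaled.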
That sounds like the right way to go, yeah - but I'm still curious about the results. Did you open this issue because you're lacking meaningful results or just as a question?
Hola @ukuleleplayer !
I am back on this project again, using Levy's neat-gym which integrates PUREPLES.
So for example in neat-gym's cartpole config, the substrate is defined as:
[Substrate]
# For (ES-)HyperNEAT
input = [(-1. +(2.*i/3.), -1.) for i in range(4)]
hidden = [[(-0.5, 0.5), (0.5, 0.5)], [(-0.5, -0.5), (0.5, -0.5)]]
output = [(-1., 1.), (1., 1.)]
function = sigmoid
With the outputs being continuous and corresponding to the discrete action space from cartpole:
Actions:
Type: Discrete(2)
Num Action
0 Push cart to the left
1 Push cart to the right
So in this example it's pretty simple: he's just using the min/max to define a Boolean. But how do I abstract this to multiple discrete actions, such as the wager type (player / banker, Discrete(2)) as well as the wager amount (Discrete(1000))? Am I constrained to using binary outputs with masking or something of that nature in order to cast the environment's discrete action (and observation) spaces to continuous?
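As a side note on the substrate itself: the `input` line in the cartpole config spaces four nodes evenly over [-1, 1] along the bottom edge. A sketch of the same layout for an arbitrary number of inputs, e.g. six cards, could look like this. `input_coords` is a hypothetical helper, not part of neat-gym or PUREPLES.

```python
def input_coords(n, y=-1.0):
    """Place n input nodes evenly across x in [-1, 1] at height y.

    For n == 4 this reproduces the cartpole config line
    [(-1. + (2.*i/3.), -1.) for i in range(4)].
    """
    if n == 1:
        return [(0.0, y)]  # a single node sits in the middle
    return [(-1.0 + 2.0 * i / (n - 1), y) for i in range(n)]


# Six input nodes, one per card in the observation.
print(input_coords(6))
```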
Hi again!
I mean, yeah, you just have to use the output floats as if they were integers/Booleans, meaning flooring or ceiling the output. The network doesn't care about its output types; it simply adjusts to your fitness function, e.g. predicting 1, 0 or any other discrete value you wish for. I'm unaware of how Levy's neat-gym project utilizes PUREPLES, but what you're trying should definitely work :)
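To make the flooring idea concrete, here is a minimal sketch of turning two network outputs into the (wager type, wager amount) pair. It assumes both outputs lie in [0, 1] (as with a sigmoid) and that the amounts are indexed 0 to 999 as in Discrete(1000); `to_bin` and `decode_action` are hypothetical helper names, not part of PUREPLES or neat-gym.

```python
def to_bin(x, n):
    """Floor a float in [0, 1] into one of n integer bins (0 .. n-1)."""
    x = min(max(x, 0.0), 1.0)      # clamp in case the output strays
    return min(int(x * n), n - 1)  # x == 1.0 falls into the last bin


def decode_action(outputs, n_types=2, n_amounts=1000):
    """Map two float outputs to (wager_type, wager_amount) integers."""
    type_out, amount_out = outputs
    return to_bin(type_out, n_types), to_bin(amount_out, n_amounts)


# Example: the network emits two floats, the environment gets two ints.
print(decode_action([0.9, 0.25]))  # -> (1, 250)
```

For a small space like the wager type, taking the index of the largest of several outputs is another common option; binning keeps the number of output nodes down for something as large as 1000 amounts.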
Hi!
Very cool project, thanks for making it available. I have a toy project I am working on with Gym for function approximation. It has a discrete-valued observation space consisting of 12 integers; the action space is also discrete-valued, with three integers used to determine the correct agent action based on the sequence of 12 integers.
So does PUREPLES support discrete observation and action spaces, and would the cartpole experiment make for a good starting point for this?
Thanks in advance!