Closed KeeratKG closed 4 years ago
No, you should not turn the spaces into Numpy arrays. This ...
self.observation_space = np.array(spaces.Box(low= np.zeros((s, prop)), high = np.full((s, prop), float('inf')), shape = (s, prop), dtype = np.float32))
#actions are vectors of the form [n1, n2, n3,...nk, r] for k states and r reserved amount of drug
self.action_space = np.array(spaces.Box(low = np.zeros((s+1, ), dtype = int), high = np.array([100]*(s+1)), shape = (s + 1, ), dtype = np.uint8))
... should be something like this (I don't know if other stuff is correct)
self.observation_space = spaces.Box(low= np.zeros((s, prop)), high = np.full((s, prop), float('inf')), shape = (s, prop), dtype = np.float32)
self.action_space = spaces.Box(low = np.zeros((s+1, ), dtype = int), high = np.array([100]*(s+1)), shape = (s + 1, ), dtype = np.uint8)
Note that we do not provide tech support outside stable-baselines issues, so if this fixes the error you may close the issue.
@Miffyli would the .Box() method used alone, without passing it into numpy.array(), still allow me to represent my observation_space and action_space as arrays? Because without passing them into numpy.array() I get an error saying that the observation/action_space are unsubscriptable.
Yes, spaces.Box alone should work; it should not be wrapped in any arrays or lists.
I see the issue though: you are changing the spaces in reset. You cannot change spaces after they are defined in __init__.
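As a minimal sketch of why no wrapping is needed: a Box cannot be indexed directly, but it exposes its bounds and shape as ordinary NumPy arrays and attributes. The sizes s and prop below are made-up values for illustration, not taken from the original code.

```python
import numpy as np
from gym import spaces

# Hypothetical sizes for illustration: s states, prop properties per state.
s, prop = 3, 2

# Define the space directly; do not wrap it in np.array(...).
observation_space = spaces.Box(
    low=0.0, high=np.inf, shape=(s, prop), dtype=np.float32
)

# The space itself is not subscriptable, but its attributes are arrays.
print(observation_space.shape)   # (3, 2)
print(observation_space.low[0])  # lower bounds for the first state

# An observation is a separate NumPy array that must lie inside the space.
obs = np.zeros((s, prop), dtype=np.float32)
print(observation_space.contains(obs))  # True
```

If the "unsubscriptable" error comes from code that indexes observation_space itself, that code most likely wants to index the observation array (or the low/high bounds) instead.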
@Miffyli the purpose of defining the starting values of the spaces in reset is merely to initialise the values from which the exploration-exploitation should start operating and the model begins to train.
Umm, I am not sure if that counts as changing the spaces...?
That is changing the spaces: you should not assign anything to observation_space or action_space after they are initially defined. reset should return the initial values.
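A rough sketch of that pattern, assuming the older gym API used with stable-baselines 2.10 (reset returns only the observation, step returns four values). The class name DrugDistributionEnv, the state attribute, the default sizes, and the placeholder dynamics are all hypothetical, not the original poster's code.

```python
import gym
import numpy as np
from gym import spaces


class DrugDistributionEnv(gym.Env):
    """Hypothetical skeleton; names and dynamics are placeholders."""

    def __init__(self, s=3, prop=2):
        super().__init__()
        self.s, self.prop = s, prop
        # Spaces are defined once here and never reassigned afterwards.
        self.observation_space = spaces.Box(
            low=0.0, high=np.inf, shape=(s, prop), dtype=np.float32
        )
        # Actions have the form [n1, ..., nk, r], each between 0 and 100.
        self.action_space = spaces.Box(
            low=0, high=100, shape=(s + 1,), dtype=np.uint8
        )
        self.state = None

    def reset(self):
        # Keep the starting values in a separate attribute and return them;
        # observation_space itself stays untouched.
        self.state = np.zeros((self.s, self.prop), dtype=np.float32)
        return self.state

    def step(self, action):
        # Placeholder dynamics: the state is unchanged and the episode ends.
        reward = 0.0
        done = True
        return self.state, reward, done, {}


env = DrugDistributionEnv()
obs = env.reset()
print(env.observation_space.contains(obs))  # True
```

The point of the design is that the spaces only describe which values are valid, while the current values live in a separate attribute that reset and step update and return.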
These are questions that are outside stable-baselines and well documented in docs and in OpenAI Gym. I am closing this issue.
Describe the bug
Hi all, I am using stable-baselines for a policy optimisation program pertaining to a drug distribution problem. I have made a custom environment following the gym interface, using the guide at https://colab.research.google.com/github/araffin/rl-tutorial-jnrr19/blob/master/5_custom_gym_env.ipynb#scrollTo=1CcUVatq-P0l, and tried to validate it using the check_env() method. I am unable to understand and fix the error described below.
Code example
This is the code I made:
The error trace is as follows:
System Info
Describe the characteristic of your environment:
%tensorflow_version 1.x
!pip install stable-baselines[mpi]==2.10.0
gives:
TensorFlow 1.x selected.
Additional context
I am not entirely sure here, but the problem may stem from the fact that I wanted my observation_space and action_space to be arrays and so converted the Box type by passing them into the numpy.array() method. I am not sure if that's the right way to do it, and it'll be great if someone could clarify this as well!