@NoemiF99 You can:

- Register the info in the `step` function of your customized environment file, for example `{'number_of_collisions': 10}` (10 is the supposed number of collisions).
- Register a variable such as `_ep_num_colli` in `omnisafe/adapter/onpolicy_adapter.py`, then log it from `info` (since you already record it in step 1) in the methods `_log_value`, `_log_metrics`, and `_reset_log`. Just follow what we do to log reward/cost.
- Suppose you name the number-of-collisions key `Metrics/EpColi`; you can then make the corresponding change in `policy_gradient.py` to log it just like how we log `Metrics/EpRet`. A sketch of these changes is given after this list.
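For reference, a minimal sketch of those three touch points. Only the info key `number_of_collisions`, the variable name `_ep_num_colli`, and the logger key `Metrics/EpColi` come from the steps above; the method bodies are illustrative and should mirror the existing reward/cost code in `onpolicy_adapter.py` and `policy_gradient.py` rather than replace it.

```python
import torch

# 1) In your custom environment's step(): expose the counter through `info`, e.g.
#    info['number_of_collisions'] = self._collision_counter

# 2) In omnisafe/adapter/onpolicy_adapter.py, mirror the reward/cost bookkeeping
#    (sketch only; the real methods also handle rewards, costs, and episode lengths).
class OnPolicyAdapterSketch:
    def _reset_log(self, idx=None):
        if idx is None:
            self._ep_num_colli = torch.zeros(self._env.num_envs)
        else:
            self._ep_num_colli[idx] = 0.0

    def _log_value(self, reward, cost, info):
        self._ep_num_colli += info.get('number_of_collisions', 0)

    def _log_metrics(self, logger, idx):
        # Use the same logger.store(...) call pattern as the existing Metrics/EpRet line.
        logger.store({'Metrics/EpColi': self._ep_num_colli[idx]})

# 3) In policy_gradient.py's _init_log(), register the key so the logger accepts it:
#    self._logger.register_key('Metrics/EpColi')
```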
Thank you so much for the support and advice you provided. They were very helpful and allowed me to input the data as I wanted. You have been very kind.
@tjruan This is because OmniSafe's `gaussian_learning_actor` uses a Gaussian distribution, a common choice for continuous-action tasks in reinforcement learning. The `ActionScale` wrapper only scales the mean of the Gaussian distribution into the specified range; it does not guarantee that actions sampled from the distribution strictly obey the bounds. For example, with a mean of 0.1, a sampled action could still be -0.1. If you need the executed action to satisfy the action-space constraints, consider applying an additional clip operation to the action, as suggested in the official Gymnasium documentation.
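For illustration, a minimal sketch of that extra clipping step, assuming a Gymnasium-style `Box` action space (the helper name is ours, not part of OmniSafe):

```python
import numpy as np

def clip_to_action_space(action: np.ndarray, action_space) -> np.ndarray:
    """Clamp a sampled action into the Box bounds before stepping the environment."""
    return np.clip(action, action_space.low, action_space.high)
```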
Hello, I would like to ask a question. After adding the number of collisions to the `_log_metrics`, `_log_value`, and `_reset_log` functions in `onpolicy_adapter.py`, and registering it in the `_init_log` function of `policy_gradient.py` with `self._logger.register_key('Metrics/EpNumCollisions')`, I don't understand why it reports an average of the number of collisions instead of the integer value for each epoch. Which function should I modify to ensure that the saved value is the integer count? Thank you very much in advance for your help.
Setting `steps_per_epoch` to the same value as the episode length of your environment may help.
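The logged `Metrics/*` entries are averaged over the episodes finished within an epoch, so aligning `steps_per_epoch` with the episode length makes each epoch contain exactly one episode and the average collapses to that episode's integer count. A minimal sketch, assuming an episode length of 1000 steps and reusing the `PCPO`/`Custom1-v0` names from this thread (the step counts are illustrative):

```python
import omnisafe

# Illustrative values: with a 1000-step episode, steps_per_epoch = 1000 means
# one episode per epoch, so the epoch mean of Metrics/EpNumCollisions equals
# that episode's collision count.
custom_cfgs = {
    'train_cfgs': {'total_steps': 100_000},
    'algo_cfgs': {'steps_per_epoch': 1_000},
}
agent = omnisafe.Agent('PCPO', 'Custom1-v0', custom_cfgs=custom_cfgs)
agent.learn()
```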
We need to make decisions at two time scales, for example a 1-time-step decision and a 4-time-step decision. So we need to establish two sets of state, action, and reward functions. There is a coupling relationship between the variables at the two time scales, so it is more convenient to implement them in a single environment class.
How can I set up two state spaces? For example as follows, but this example is wrong.
```python
import numpy as np
from gym.spaces import Box, Dict

self._observation_space = Dict({
    'obs1': Box(low=0, high=1, shape=(5,), dtype=np.float32),
    'obs2': Box(low=0, high=1, shape=(12,), dtype=np.float32),
})
```
Error is "AssertionError: Observation space must be Box". How to set up two state spaces? Thank you.
This issue pertains to multi-agent safety reinforcement learning, which is currently unsupported by OmniSafe. The following code base might be of assistance:
Issue with Video Saving During Training

I am currently using your code and would like to bring to your attention an issue I am experiencing during training. Currently, the video-saving message is displayed correctly in the terminal as follows:

```
##################################################
Saving the replay video to ./runs/PCPO-{Custom1-v0}/seed-000-2024-01-20-16-11-47/video/epoch-100,
and the result to ./runs/PCPO-{Custom1-v0}/seed-000-2024-01-20-16-11-47/video/epoch-100/result.txt.
##################################################
```

However, despite this message, the video is not actually being saved in the designated folder. Instead, only the results are being saved, and I cannot identify the reason for this behavior.
I have checked that the folder structure is correct and there are no obvious errors, but the video is still not appearing. I would like to understand if there is any specific step that could be causing this issue or if there is something I could modify in the code to ensure the proper saving of the video.
Thank you in advance for your help and assistance. I am available to provide further details.
> However, despite this message, the video is not actually being saved in the designated folder. Instead, only the results are being saved, and I cannot identify the reason for this behavior.
@NoemiF99, my team was having a similar issue when running a video-saving script from inside a Docker container. It seemed to be failing silently, as you described. We used the `xvfb` utility to solve this: instead of running `python3 my_file.py`, we ran `xvfb-run -a python3 my_file.py`.
Closed due to no further questions. You're welcome to reopen it if you happen to have more issues.
Required prerequisites
Questions
Question 1: I want to use our own environment, and I found `omnisafe-main/tests/simple_env.py`, so I attempted to modify `simple_env` into our own environment. However, when running `omnisafe-main/examples/train_policy.py`, the `simple_env` could not be loaded. What should I use to train `simple_env.py`?
Question 2: I learned from issues #273, #263, #255 and the GitHub README.md, and found the following. Could you give a specific environment example? Is it the same as `omnisafe-main/tests/simple_env.py`? Thank you very much!
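For what it is worth, one pattern that works with environments written in the style of `omnisafe-main/tests/simple_env.py` is to make sure the module is imported (so its `@env_register` decorator runs) before the agent is created. The env id `'Simple-v0'`, the import path, and the algorithm below are assumptions, so adjust them to whatever your copy of the file actually registers:

```python
import omnisafe

# Importing the module executes its @env_register decorator, which makes the
# environment id visible to omnisafe.Agent.  Adjust the path and the id to
# match your own environment file.
from tests import simple_env  # noqa: F401

agent = omnisafe.Agent('PPOLag', 'Simple-v0')
agent.learn()
```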