The observations and rewards of a forked environment are expected to match those of the original environment when the same action is applied to both, but in practice they can differ.
To Reproduce
Run the test below to reproduce the issue:
import gym
import random
import numpy as np
import compiler_gym
import copy


def test1():
    # Expect the observations and rewards of the forked env to be the same as
    # those of the original env, but they can differ.
    print("------test1------")
    benchmark = "benchmark://opencv-v0/108"  # "benchmark://cbench-v1/qsort"
    with gym.make("llvm-autophase-ic-v0", benchmark=benchmark) as env:
        env.reset()
        features = ["Programl", "IrInstructionCountOz", "IrInstructionCount", "IrSha1"]
        obs_space = [env.observation.spaces[feature_name] for feature_name in features]
        rewards_space = [
            env.reward.spaces["IrInstructionCountOz"],
            env.reward.spaces["IrInstructionCountO3"],
            env.reward.spaces["IrInstructionCount"],
        ]
        # An empty action list should leave the environment unchanged.
        observations, rewards, done, info = env.step(
            action=[], observation_spaces=obs_space, reward_spaces=rewards_space
        )
        assert info["action_had_no_effect"]
        i = 0
        while not done:
            # Fork before stepping so both environments start from the same state.
            forked_env = env.fork()
            action = env.action_space.sample()
            observations, rewards, done, info = env.step(
                action=action, observation_spaces=obs_space, reward_spaces=rewards_space
            )
            print(observations[1:3], rewards, action)
            obs_space2 = [forked_env.observation.spaces[feature_name] for feature_name in features]
            rewards_space2 = [
                forked_env.reward.spaces["IrInstructionCountOz"],
                forked_env.reward.spaces["IrInstructionCountO3"],
                forked_env.reward.spaces["IrInstructionCount"],
            ]
            # Apply the same action to the fork and compare the results.
            observations2, rewards2, done2, info2 = forked_env.step(
                action=action, observation_spaces=obs_space2, reward_spaces=rewards_space2
            )
            print(observations2[1:3], rewards2, action)
            if tuple(observations[1:3] + rewards) != tuple(observations2[1:3] + rewards2):
                print("Error ==========")
                return
            i += 1
            if i > 10000:
                break


if __name__ == "__main__":
    test1()
Expected behavior
The forked environment should produce the same observations/rewards as the original environment when the same action is applied.
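The invariant described here can be written down independently of CompilerGym. The following is only a minimal sketch under stated assumptions: ToyEnv and first_fork_divergence are hypothetical names introduced for illustration, and ToyEnv is a deterministic stand-in, not a CompilerGym environment. It shows the check the reproduction script performs: fork before each step, apply the same action to both environments, and report the first step at which they disagree.

```python
import copy
import random


class ToyEnv:
    """Hypothetical stand-in for a forkable environment (not part of CompilerGym)."""

    def __init__(self, state=0):
        self.state = state

    def fork(self):
        # A correct fork returns an independent copy of the current state.
        return copy.deepcopy(self)

    def step(self, action):
        # Deterministic transition: same state and same action give the same result.
        self.state = (self.state * 31 + action) % 1009
        observation = self.state
        reward = -float(self.state)
        done = False
        return observation, reward, done, {}


def first_fork_divergence(env, actions):
    """Return the index of the first step where a fork disagrees with env, else None."""
    for i, action in enumerate(actions):
        forked = env.fork()  # fork before stepping the original
        obs, reward, done, _ = env.step(action)
        obs2, reward2, done2, _ = forked.step(action)
        if (obs, reward, done) != (obs2, reward2, done2):
            return i
        if done:
            break
    return None


rng = random.Random(0)
actions = [rng.randrange(10) for _ in range(100)]
divergence = first_fork_divergence(ToyEnv(), actions)
print("first divergence:", divergence)
```

For a well-behaved fork the helper returns None. Under the behavior reported in this issue, the equivalent check against the real environment would return a step index instead.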
Environment
Please fill in this checklist:
CompilerGym: 0.2.4
How you installed CompilerGym (pip, source): following CompilerGymExperiments
OS: Ubuntu 20.04.2 LTS
Python version: 3.8.13
Build command you used (if compiling from source): following CompilerGymExperiments
GCC/clang version (if compiling from source):
Versions of any other relevant libraries:
You may use the environment collection script to generate most of this information. You can get the script and run it with:
wget https://raw.githubusercontent.com/facebookresearch/CompilerGym/stable/build_tools/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py