Hi, no, it should not cause any kind of issue. If you encountered a bug, please send me your configuration and the APK that crashes.
It didn't crash. My test case uses BookyMcBookface.apk.
```python
self.action_space = spaces.Box(low=numpy.array([0, 0, 0]),
                               high=numpy.array([self.ACTION_SPACE, len(self.strings) - 1, 1]),
                               dtype=numpy.int64)
```

`self.ACTION_SPACE` is 30. But after `self.check_activity()` in `__init__()`, `self.action_space.high[0] = len(self.views) + self.shift` is set.
So `self.action_space` is changed to `spaces.Box(low=numpy.array([0, 0, 0]), high=numpy.array([4, len(self.strings) - 1, 1]))`.
Then `self.action_space` is used to set the action space of the SAC model.
This causes the wrong ACTION_SPACE in the SAC model. According to your parameters, the correct ACTION_SPACE should be 30.
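To make the initialization order concrete, here is a minimal, self-contained sketch of what I mean (the values 4 and 39 just mirror my test run; this is only an illustration, not ARES's actual code):

```python
import numpy
from gym import spaces

ACTION_SPACE = 30  # intended upper limit on the number of widgets

# 1. The env first declares the "maximum" action space.
action_space = spaces.Box(low=numpy.array([0, 0, 0]),
                          high=numpy.array([ACTION_SPACE, 39, 1]),
                          dtype=numpy.int64)

# 2. check_activity() then shrinks the first bound to the widget count of
#    the *current* activity (4 in my test app) before the model is created.
num_widgets_in_first_activity = 4  # hypothetical value
action_space.high[0] = num_widgets_in_first_activity

# 3. If SAC is constructed after this point, it only ever sees high[0] == 4.
print(action_space.high)  # -> [ 4 39  1]
```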
It is normal to resize the action space during the exploration: it is resized to match the number of widgets that ARES can interact with. In fact, the number of widgets that an activity has may vary over time, and it also changes between activities.
Let's say that the ACTION_SPACE of dimension 30 is the upper limit of widgets that ARES can manage.
I may have misunderstood the init() of the SAC model. I just think that if the SAC model is initialized with ACTION_SPACE = 4, it will never predict an action_number[0] greater than 4. Does the upper limit of the predicted action_number[0] in the SAC model change when the action space is resized during exploration?
If you look at line 173 in RL_application_env.py, I'm setting the action_space to the "maximum_dimension" of 30.
Then at line 178, I call self.check_activity(), which actually resizes the action_space to the correct dimension.
Finally, self.check_activity() is called again each time the GUI changes.
Thanks for your reply. I know the ACTION_SPACE changes as the GUI changes. But the size of the SAC model's prediction does not change when the action_space is resized in self.check_activity(). In my test, the SAC model is initialized with ACTION_SPACE = 4 (line 178, self.check_activity()). Then self.check_activity() is called again each time the GUI changes and updates the ACTION_SPACE in RL_application_env, but it doesn't change the ACTION_SPACE in the SAC model. The model's prediction stays fixed to the ACTION_SPACE = 4 from initialization.
I wasn't talking about the init() in RL_application_env.py. The init() in SAC (from stable_baselines3 import SAC) is fixed with ACTION_SPACE = 4.
Yes, we cannot change the dimension of the output of the ML model (i.e., the action space of the ML model); the library does not allow this kind of modification. I only use the resized dimension to check whether the action generated by the ML model goes out of bounds (a non-valid action) or is valid for the application.
So self.check_activity() (at line 178) will set the wrong action space for the ML model? Because it changes self.action_space = spaces.Box() to the action space of the app's initial GUI, and the ML model is then initialized with the changed action_space.
The ML model checks the dimension only at the beginning, and then it never uses the action_space variable again. This means that the output of the ML model is always in the range 0:30.
But if you have 3 buttons in Activity Something.Main, it is not useful to generate a value greater than 2 (i.e., 0, 1, 2).
So I'm just using the action_space variable to store the actual action space dimension.
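A minimal sketch of the out-of-bound check described above (not ARES's actual code; `current_widget_count` is a hypothetical stand-in for the value that check_activity() stores in action_space.high[0]):

```python
def is_valid_action(action_number: int, current_widget_count: int) -> bool:
    """The model may emit anything in 0..30; only values that map to a
    widget of the current activity are treated as valid actions."""
    return 0 <= action_number < current_widget_count

# Activity Something.Main has 3 buttons -> only 0, 1, 2 are valid.
print(is_valid_action(2, 3))    # True
print(is_valid_action(25, 3))   # False -> handled as a non-valid action
```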
In my test, the ML model checks the dimension at the beginning, and the action_space (30) has already been changed to 4 before the ML model's init(). The self.check_activity() (at line 178) in RL_application_env.py's init() runs before the ML model's init(). That is fine for Activity Something.Main, but it doesn't work for other activities (which have more than 4 buttons).
I'm sorry, now I got the point. I'll fix it asap.
Thank you so much for your patient responses.
I did a preliminary check and it seems to work: the model generates values between 0 and 30. Here is my console output.

```
2022-02-14 15:03:48.583 | DEBUG | rl_interaction.RL_application_env:init:127 - apps/Calculator.apk START
----> line 173 [30, 39, 1]
----> line 178 [4, 39, 1]
Starting training from zero
Using cpu device
Wrapping the env in a DummyVecEnv.
2022-02-14 15:03:59.531 | DEBUG | rl_interaction.RL_application_env:reset:316 - <--- EPISODE RESET --->
Action: [ 4. 12. 0.]
Action: [ 3. 28. 0.]
2022-02-14 15:04:01.700 | DEBUG | rl_interaction.RL_application_env:step2:236 - action: android:id/button1 Activity: calculator.innovit.com.calculatrice.MainActivity
2022-02-14 15:04:14.632 | DEBUG | rl_interaction.RL_application_env:step2:236 - action: abgc Activity: calculator.innovit.com.calculatrice.MainActivity
Action: [24. 23. 1.]
2022-02-14 15:04:25.761 | DEBUG | rl_interaction.RL_application_env:step2:236 - action: Open navigation drawer Activity: calculator.innovit.com.calculatrice.MainActivity
Action: [30. 39. 0.]
2022-02-14 15:04:38.336 | DEBUG | rl_interaction.RL_application_env:step2:236 - action: calculator.innovit.com.calculatrice.MainActivity.android.widget.Button.0 Activity: calculator.innovit.com.calculatrice.MainActivity
Action: [18. 11. 1.]
Action: [8. 0. 1.]
2022-02-14 15:04:48.083 | DEBUG | rl_interaction.RL_application_env:step2:236 - action: calculator.innovit.com.calculatrice.MainActivity.android.widget.Button.9 Activity: calculator.innovit.com.calculatrice.MainActivity
Action: [21. 27. 1.]
```
As you can see, the first call sets the dimension at line 173 and line 178 shrinks it down to 4. But when you print the actions right after the step() function, you also get numbers bigger than 4, so this means that the ML model has the maximum dim = 30.
However, I will investigate the problem in the coming days and keep you posted.
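In the meantime, one quick way to double-check which bound the model was actually built with is to print the action space the model captured at construction time (a sketch: it assumes the stable-baselines3 model keeps that action_space attribute, and `env` is the wrapped RL_application_env instance, instantiated as in the snippets below rather than ARES's exact setup):

```python
from stable_baselines3 import SAC
from stable_baselines3.sac import MlpPolicy

# `env` is the (wrapped) RL application environment; illustrative only.
model = SAC(MlpPolicy, env, learning_rate=3e-4, learning_starts=2000)

# The model stores a copy of the action space it saw at construction time,
# so printing it shows the bound the policy network was built with.
print(model.action_space.low, model.action_space.high)
```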
Hi, I also imitated Fate and created a virtual env. The step() function is:

```python
if action_number > 10 and action_number < 20:
    logger.warning(100.0)
    return self.observation, numpy.array([100.0]), numpy.array(False), {}
else:
    logger.warning(-100.0)
    return self.observation, numpy.array([-100.0]), numpy.array(False), {}
```

But the ML model doesn't learn to predict action_number between 10 and 20. Do you know why? It has bothered me for a long time.
The complete code is:

test.py

```python
# SAC and its MlpPolicy come from stable_baselines3, as mentioned above.
from stable_baselines3 import SAC
from stable_baselines3.sac import MlpPolicy
from rl_interaction.utils.wrapper import TimeFeatureWrapper
from RL_application_env_fate import RLApplicationEnvFate

app = RLApplicationEnvFate()
env = TimeFeatureWrapper(app)
model = SAC(MlpPolicy, env, learning_rate=3e-4, learning_starts=2000)
model.learn(total_timesteps=100000)
print("model end!!!")
```
RL_application_env_fate.py

```python
from gym import Env
import os
import numpy
from loguru import logger
from gym import spaces


class RLApplicationEnvFate(Env):

    def __init__(self):
        self.log_dir = './fate_logs'
        if not os.path.exists(self.log_dir):
            os.mkdir(self.log_dir)
        self.action_num_logger_id = logger.add(os.path.join(self.log_dir, 'action_num_logger.log'),
                                               format="{level} {message}",
                                               filter=lambda record: record["level"].name == "WARNING")
        self.action_space = spaces.Box(low=numpy.array([0]),
                                       high=numpy.array([30]),
                                       dtype=numpy.int64)
        self.observation_space = spaces.Box(low=0, high=1, shape=(300,), dtype=numpy.int32)
        self.observation = numpy.ones(300)

    @logger.catch()
    def step(self, action_number):
        # action_number = action_number.astype(int)
        logger.warning(action_number)
        if action_number > 10 and action_number < 20:
            logger.warning(100.0)
            return self.observation, numpy.array([100.0]), numpy.array(False), {}
        else:
            logger.warning(-100.0)
            return self.observation, numpy.array([-100.0]), numpy.array(False), {}

    def reset(self):
        return self.observation
```
During the training phase, some actions are always taken randomly; to have deterministic behavior, you should complete a training phase with a very high number of timesteps, not 100000 like in your case.
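To illustrate the point, here is a sketch using the standard stable-baselines3 API (the timestep count is only an example, and `env` is the wrapped RLApplicationEnvFate from the snippet above):

```python
from stable_baselines3 import SAC

# During learn(), SAC explores by sampling stochastically from its policy
# (and purely at random for the first `learning_starts` steps), so single
# actions can look unrelated to the reward for a long time.
model = SAC("MlpPolicy", env, learning_rate=3e-4, learning_starts=2000)
model.learn(total_timesteps=1_000_000)  # example: far more than 100000

# After training, ask the policy for its deterministic (mean) action.
obs = env.reset()
action, _ = model.predict(obs, deterministic=True)
print(action)
```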
Hi, I find that self.check_activity() changes the size of self.action_space = spaces.Box() in RL_application_env.py's init(). Will this cause the wrong action_space size to be used when the model is initialized?