ZM-Learn / L2RPN_WCCI_a_Solution

Repository for L2RPN WCCI 2020 competition. One possible solution.

About the testing result #1

Open · ydlu opened this issue 3 years ago

ydlu commented 3 years ago

Hi, I'm very interested in your project and tried to test the agent following your README, but I ran into a few confusing problems.

1. After running make_submission_file.py directly with the folder example_submissions/submission_submitted, I got a different result from yours: score 5.167464, duration 94.482203. I didn't change anything in the project, so I don't understand why this happened.

2. In example_submissions\submission_submitted\agent.py, the hidden-layer parameters of the created agent are different from those of the two trained agents, yet it still runs successfully. I don't understand why:

self.hidden1, self.hidden2, self.hidden3 = 600, 400, 400
self.hidden1, self.hidden2, self.hidden3 = 1000, 1000, 1000

Also, I couldn't find the file Train_Agent_discrete.py in the project, so I trained the two agents separately. Please help me. Thanks a lot. Best wishes.

ZM-Learn commented 3 years ago

Hi buddy, are you using the testing data set or environment of L2RPN NEURIPS track 1?

  1. The agent was trained on the L2RPN WCCI dataset. I noticed it can also be used for L2RPN NEURIPS track 1, and the score is similar to yours, but the training dataset is different. If you are using the L2RPN NEURIPS testing data, you may refer to my newly uploaded folder example_submissions/submission_NEURIPS_1_1.
  2. One agent is defined with "self.hidden1, self.hidden2, self.hidden3 = 600, 400, 400" and the other with "self.hidden1, self.hidden2, self.hidden3 = 1000, 1000, 1000". The two agents are trained separately with "Train_Agent_discrete_strategy1.py" and "Train_Agent_discrete_strategy2.py". You can check the layers defined in those two programs, manually change "self.hidden1, self.hidden2, self.hidden3 = 1000, 1000, 1000" to "self.hidden1, self.hidden2, self.hidden3 = 600, 400, 400", and then replace the file "pypow_wcci_a3c_actor.h5" (see the sketch below). During training, both agents were tuned with multiple training programs, so I can't confidently tell which part works better.
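To make that weight-loading constraint concrete, here is a minimal sketch assuming the actor is a plain Keras feed-forward network; the layer structure, state/action sizes, and helper name below are placeholders for illustration, not the repository's actual agent.py code:

```python
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

def build_actor(state_size, action_size, hidden1, hidden2, hidden3):
    """Hypothetical three-hidden-layer actor with a softmax policy head."""
    inp = Input(shape=(state_size,))
    x = Dense(hidden1, activation="relu")(inp)
    x = Dense(hidden2, activation="relu")(x)
    x = Dense(hidden3, activation="relu")(x)
    out = Dense(action_size, activation="softmax")(x)
    return Model(inputs=inp, outputs=out)

# The hidden sizes declared here must match the checkpoint being loaded,
# otherwise load_weights() raises a shape-mismatch error.
# state_size and action_size are placeholder values for this sketch.
actor = build_actor(state_size=538, action_size=400,
                    hidden1=600, hidden2=400, hidden3=400)
actor.load_weights("pypow_wcci_a3c_actor.h5")
```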

Best wishes.

ydlu commented 3 years ago

Hi buddy. Thanks for your reply; that's really important to me!

To be honest, I'm not sure which testing data and environment I'm using. I looked through the code all day and still couldn't completely figure out how the testing environment is set. I just replaced the data in the l2rpn_data folder with some data from l2rpn_wcci_2020 and ran make_submission_file.py. The results appear to be better than before, but I don't know whether that is the right way to do it.

If you use the L2RPN NEURIPS testing data, I think you may need to make some changes in the utils folder to specify the testing data or environment for your agent? I also don't understand how you set up the testing environment.

Maybe I didn't state my questions clearly. Best wishes!

(screenshot attached)

ZM-Learn commented 3 years ago

Hi, I don't know the details of the dataset. My suggestion is to create a new virtual environment and install the packages for L2RPN WCCI; for example, Grid2Op==0.9.1 is used for WCCI, but 1.3.1 for NEURIPS. Then the program should run correctly. The guide to setting up the environment can be found at this link: https://competitions.codalab.org/competitions/24902#learn_the_details

If you are using the NEURIPS environment, there might be some differences, but I'm not sure exactly why.
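For reference, once the matching Grid2Op version is installed (pip install grid2op==0.9.1 for the WCCI track), a quick way to check that the agent runs against the intended environment is to load it explicitly. This is only a sketch, assuming the public dataset name l2rpn_wcci_2020; it does not show how the repository's utils folder selects the environment:

```python
import grid2op

# Load the public WCCI 2020 environment by name (recent Grid2Op versions can
# download the dataset on first use; behaviour in 0.9.1 may differ slightly).
env = grid2op.make("l2rpn_wcci_2020")

obs = env.reset()
done, steps = False, 0
while not done:
    # Do-nothing baseline, just to verify the environment and data are wired up.
    action = env.action_space({})
    obs, reward, done, info = env.step(action)
    steps += 1
print("Do-nothing agent survived", steps, "time steps")
```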

ydlu commented 3 years ago

Hi, sorry to bother you again; I still have a question. In Data_structure_process.py, I see that you select the actions related to substation 16 according to their simulated rewards: you simulate the reward of each possible action at the first step of the scenario and decide whether to keep it based on that reward. I'm curious about the reasons for this. Is it reasonable? I'm also thinking about how to reduce the action space in a reasonable way.

ZM-Learn commented 3 years ago

Hi. Actually, I made an agent try all possible actions at every step of the game. However, due to the computational burden, I did not wait for the results and just randomly chose some actions and proceeded to RL. Later, the lightsim2grid package appeared, which may accelerate the simulation try-out process. It is not really reasonable to use only my randomly selected actions (you can also check my slides; sometimes all of these actions fail). I only did it for a fast start. If you have time, you can check my newly uploaded file "Agent_Try_action_array.rar" to see whether you can choose some more reasonable actions.
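For anyone who wants to reproduce this try-out idea, below is a rough sketch of the simulate-and-rank pattern built on Grid2Op's obs.simulate(); it is an illustration under assumed names, not the code in Data_structure_process.py, and the restriction to substation 16 is left out. Installing lightsim2grid and passing its LightSimBackend to grid2op.make should speed up the simulate calls, as mentioned above.

```python
import grid2op

# Hypothetical illustration of the "simulate and filter" idea discussed above,
# not the actual code in Data_structure_process.py.
env = grid2op.make("l2rpn_wcci_2020")
obs = env.reset()

# Enumerate unitary "set topology" actions. The full list is large; the
# repository restricts it to one substation, which is omitted here.
candidates = env.action_space.get_all_unitary_topologies_set(env.action_space)

scored = []
for act in candidates[:200]:  # cap the try-out for speed in this sketch
    # obs.simulate() estimates the effect of an action on the forecast for the
    # next step without changing the real environment state.
    sim_obs, sim_reward, sim_done, sim_info = obs.simulate(act)
    if not sim_done:
        scored.append((sim_reward, act))

# Keep the highest-scoring candidates as a reduced action space for RL.
scored.sort(key=lambda pair: pair[0], reverse=True)
reduced_action_space = [act for _, act in scored[:20]]
```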