Engineer1999 / Double-Deep-Q-Learning-for-Resource-Allocation

Reproduce results of the research article "Deep Reinforcement Learning Based Resource Allocation for V2V Communications"
192 stars · 55 forks

Attention: Questions about the four pictures! Please! #3

Open oyy524275383 opened 4 years ago

oyy524275383 commented 4 years ago

Traceback:

    Traceback (most recent call last):
      File "C:\Users\OYY\Desktop\Double-Deep-Q-Learning-for-Resource-Allocation-master的副本\agent.py", line 362, in play
        plt.savefig()
      File "E:\Users\OYY\AppData\Local\Programs\Python\Python37\lib\site-packages\matplotlib\pyplot.py", line 722, in savefig
        res = fig.savefig(*args, **kwargs)
    TypeError: savefig() missing 1 required positional argument: 'fname'

I don't know how to deal with it. Also, if I run all the code, can I get all four pictures?

Engineer1999 commented 4 years ago

You have to provide a figure name (fname) to save it on your local machine. Yes, you can get all the figures. However, the approach for a couple of the figures will be different: you have to run the code multiple times with different vehicle numbers, note the values manually, and then use those values to plot those figures.
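For the figures that need multiple runs, here is a minimal sketch of that manual approach (assumed, not from the repository; every number below is a dummy placeholder to be replaced with the values you record from your own runs):

    import matplotlib.pyplot as plt

    # Dummy placeholders: replace with the values noted from runs of the
    # code with different vehicle numbers.
    n_vehicles = [20, 40, 60, 80, 100]
    mean_v2i_rate = [0, 0, 0, 0, 0]

    plt.plot(n_vehicles, mean_v2i_rate, marker='o')
    plt.xlabel('Number of vehicles')
    plt.ylabel('Mean V2I rate')
    plt.savefig('v2i_rate_vs_vehicles.png')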

oyy524275383 commented 4 years ago

> You have to provide a figure name (fname) to save it on your local machine. […]

Thanks, bro! I will try it later. If I have any questions about it, I will ask you again!

zenghaogithub commented 4 years ago

> > You have to provide a figure name (fname) to save it on your local machine. […]
>
> Thanks, bro! I will try it later. […]

Can you tell me how to provide the figure name and solve this problem? I have also met this problem. Thanks!

Engineer1999 commented 4 years ago

In agent.py, line 362, the call is plt.savefig(). Follow the documentation to name the figure; the link is attached below.
link: https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.savefig.html
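For reference, a minimal sketch of the fix (the plotted data and the file name are only examples):

    import matplotlib.pyplot as plt

    plt.plot([0, 1, 2], [0, 1, 4])  # example data only
    # fname is the required first argument; calling plt.savefig() with no
    # arguments raises the TypeError shown above.
    plt.savefig('figure_1.png')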

zenghaogithub commented 4 years ago

By providing the figure name, I get one picture whose xlabel is "Time left for V2V transmission (s)" and ylabel is "Probability of power selection", but I cannot get the other pictures. As you said, should I note the values manually and then use those values to plot those figures?

Engineer1999 commented 4 years ago

Your figure name should change on every save. For example, let's say that by running the program once you get 100 results that you want to save on your local machine. Rather than naming each one manually, you can do as follows.

Example code:

    import matplotlib.pyplot as plt

    for i in range(100):
        # Generate the data for figure i
        fig_name = 'figure_' + str(i)
        plt.plot('XXXXXXXXXXXX')  # placeholder: plot your actual data here
        plt.savefig(fig_name)     # save under a name that changes each iteration
        plt.clf()                 # clear the figure so curves don't accumulate

Follow this method in agent.py and you will get the figures in the given folder.
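(One note on the loop above: pyplot is stateful, so without plt.clf() between iterations each saved file would also contain the curves from all earlier iterations.)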

zenghaogithub commented 4 years ago

When I debug the function in agent.py, line 126:

    self.training = True
    for k in range(1):
        for i in range(len(self.env.vehicles)):
            for j in range(3):
                state_old = self.get_state([i, j])
                action = self.predict(state_old, self.step)
                self.merge_action([i, j], action)
                self.action_all_with_power_training[i, j, 0] = action % self.RB_number
                self.action_all_with_power_training[i, j, 1] = int(np.floor(action / self.RB_number))
                reward_train = self.env.act_for_training(self.action_all_with_power_training, [i, j])
                state_new = self.get_state([i, j])
                self.observe(state_old, state_new, reward_train, action)
The variable reward_train receives the return value of the function act_for_training in environment.py. But "for j in range(3):" means a vehicle has 3 neighbors (3 V2V links), and each link gets a reward (return t - (self.V2V_limit - time_left)/self.V2V_limit) that already contains 3 parts: the V2V rate, the V2I rate, and a latency penalty. I don't know whether one RB can be used by only one V2V link; if not, the V2I rate has been double-counted. In the code there are 60 vehicles, of which 20 communicate by V2V, and each such vehicle has 3 V2V links. What is the quantitative relation between V2I links and V2V links? I'd also like to plot a picture whose xlabel is step and ylabel is sum reward. What should I do?

zenghaogithub commented 4 years ago

Sorry, I just want to ask: in the simulation, how many V2I links and how many V2V links are there? How many RBs does a V2I link use? How many RBs does a V2V link use?

Engineer1999 commented 4 years ago

The number of RBs is constant. Our objective is to train an agent in such a way that it can accommodate both the V2V links and the V2I links in the given RBs. Initially, the RBs were used only by the V2I links; now we want to use the RBs for the V2V links as well, in such a way that they do not create any interference with the V2I links. The RBs are resources, and we are utilizing them as much as possible without creating interference on either the V2V or the V2I links. I hope this clears up your confusion. As for the plot you mention in the previous comment, right now I am not sure how to do that; I have to look into the code again to give a satisfactory answer. I'll look into it and try to solve your query.
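In the meantime, here is a minimal sketch of one possible approach (assumed, not the repository's own method; step_rewards and the file name are illustrative, and the append would go inside the training loop in agent.py right after reward_train is computed):

    import matplotlib.pyplot as plt

    step_rewards = []  # hypothetical accumulator; inside the training loop do:
                       #     step_rewards.append(reward_train)
                       # (or append the sum over the 3 V2V links per step)

    # After training, plot sum reward against step and save the figure.
    plt.plot(range(len(step_rewards)), step_rewards)
    plt.xlabel('step')
    plt.ylabel('sum reward')
    plt.savefig('sum_reward_vs_step.png')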

zenghaogithub commented 4 years ago

Thanks for your quick reply. I see, but I want to know: in the simulation there are 20 RBs and 60 vehicles, of which 20 vehicles use V2V links to communicate with 3 neighbors each, so there are 60 V2V links; that much is certain, I think. I speculate that every 3 V2V links reuse one RB, which is also occupied by one V2I link. But then how many V2I links are there? In the test phase there are only 20 V2I links, so I am confused about it.

Engineer1999 commented 4 years ago

We have to accommodate the V2V links and V2I links in the given resource blocks. If you observe the figures carefully, you can see that we increase the number of vehicles, which implies that the number of V2V links also increases. We use RL to allocate an appropriate RB to each V2V link and V2I link without creating any interference.

Judithcodes commented 3 years ago

I am trying to replicate Figure 1. [image of the figure attached in the original comment]

From the code, does the mean of the V2I rate refer to the sum rate of the V2I links? What is the equivalent of the "Probability of satisfied users"? How do you determine the time for each selection as stated in the paper?

So far, the code generates the following:

- Mean of V2I rate
- Percent
- Mean of fail percent