Question about training and testing

hongzimao / pensieve

Neural Adaptive Video Streaming with Pensieve (SIGCOMM '17)

http://web.mit.edu/pensieve/

MIT License


Question about training and testing #116

Open InanisV opened 3 years ago

InanisV commented 3 years ago

Hi Zimao, I am learning about your work. When I run python multi_agent.py, the epoch counter increases very quickly and the program keeps printing "Testing model restored." instead of entering the training stage and using the GPU. It seems that the overloaded tf.train is not used. Should I change the code of compute_gradients in a3c.py? I am a bit confused now. Am I misunderstanding your code?

hongzimao commented 3 years ago

The code periodically tests the current model on a held-out test dataset: https://github.com/hongzimao/pensieve/blob/master/sim/multi_agent.py#L192. The first epoch also enters the testing loop. You can comment this out if you are eager to get into the training loop. I don't think a GPU is really required for this code.
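For readers following along, here is a minimal sketch of the pattern described above: a training loop that runs a held-out test pass every fixed number of epochs, including the first one. It is not the code from multi_agent.py. The names `TEST_INTERVAL`, `should_test`, `run`, and `skip_first` are hypothetical, and the interval value is an assumption chosen only for illustration.

```python
TEST_INTERVAL = 100  # hypothetical value; not taken from multi_agent.py


def should_test(epoch, interval=TEST_INTERVAL, skip_first=False):
    """Return True when a held-out test pass should run at this epoch."""
    if skip_first and epoch == 0:
        # Equivalent in spirit to commenting out the initial test pass.
        return False
    return epoch % interval == 0


def run(num_epochs, skip_first=False):
    """Record which phase runs at each epoch (stand-ins for real work)."""
    events = []
    for epoch in range(num_epochs):
        if should_test(epoch, skip_first=skip_first):
            events.append(("test", epoch))   # placeholder for the test pass
        events.append(("train", epoch))      # placeholder for one training step
    return events


events = run(201)
print([e for e in events if e[0] == "test"])
```

With `skip_first=False`, epoch 0 triggers a test pass before any training step, which would match the "Testing model restored." message showing up right at start-up.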

InanisV commented 3 years ago

Thank you for your explanation! :) ~~So do you mean that this project does not need a GPU? Is all the work done by the CPU alone?~~ Uh, I understand now: after checking the log file, I see the code only uses the CPU. Let me spend more time on the code to find out how it works.

InanisV commented 3 years ago

Excuse me, I have another question. After training, when I try to plot my test results, plot_results.py fails with "list index out of range". The error points to bit_rate.append(VIDEO_BIT_RATE[int(parse[6])]). According to bb.py, the 7th field in each line of a log file is the reward:

log_file.write(str(time_stamp / M_IN_K) + '\t' +
                       str(VIDEO_BIT_RATE[bit_rate]) + '\t' +
                       str(buffer_size) + '\t' +
                       str(rebuf) + '\t' +
                       str(video_chunk_size) + '\t' +
                       str(delay) + '\t' +
                       str(reward) + '\n')

Is there anything wrong here? It seems that the line should be bit_rate.append(int(parse[1])), since the bitrate value VIDEO_BIT_RATE[bit_rate] is written in the second field.
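To make the field positions concrete, here is a small sketch that parses one line in the tab-separated layout written by the log_file.write(...) call quoted above. The field names come from that call; the helper `parse_bb_line` and the sample line are made up for illustration and are not real log output.

```python
# Field order follows the log_file.write(...) call quoted from bb.py.
BB_FIELDS = ["time_stamp", "bit_rate", "buffer_size", "rebuf",
             "video_chunk_size", "delay", "reward"]


def parse_bb_line(line):
    """Split one tab-separated log line into a {field name: float} dict."""
    parts = line.split()
    if len(parts) != len(BB_FIELDS):
        raise ValueError("expected %d fields, got %d"
                         % (len(BB_FIELDS), len(parts)))
    return dict(zip(BB_FIELDS, (float(p) for p in parts)))


# Made-up sample line, for illustration only.
sample = "1.5\t750\t4.0\t0.0\t250000\t800.0\t0.75\n"
record = parse_bb_line(sample)
print(record["bit_rate"], record["reward"])
```

In this layout, index 1 holds the bitrate value itself and index 6 holds the reward, so index 6 is not an index into VIDEO_BIT_RATE.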

hongzimao commented 3 years ago

It looks like so. Can you double check that the log parsed by the VIDEO_BIT_RATE line has the same format as the one written by bb.py? When you make the change and plot the result, can you cross-check against our paper and other posts in the issues to make sure it looks sane? Thanks.

InanisV commented 3 years ago

Okay, let me have a check. By the way, what does bw mean in plot_results.py? https://github.com/hongzimao/pensieve/blob/1120bb173958dc9bc9f2ebff1a8fe688b6f4e93c/test/plot_results.py#L41

time_ms = []
bit_rate = []
buff = []
bw = []
reward = []

hongzimao commented 3 years ago

bandwidth

InanisV commented 3 years ago

Sorry, it's my mistake. The log files generated by bb.py and dp.py have different formats. I misunderstood the meaning of the variable SIM_DP and changed it to sim_bb. When I checked the code in plot_results.py, I found a separate branch that handles log files produced by dp.py, where bit_rate.append(VIDEO_BIT_RATE[int(parse[6])]) appends the bitrate correctly. In the end, I plotted the results successfully. Thank you very much! I appreciate your reply.
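For anyone hitting the same error, the distinction can be sketched as a per-scheme branch. This is a simplified illustration, not the code from plot_results.py: the helper `bitrate_from_fields` is hypothetical, the scheme string "sim_dp" and the VIDEO_BIT_RATE values are assumptions, and the sample fields are made up. The only things taken from the thread are that the dp.py branch reads an index from parse[6] and that bb.py writes the bitrate value in the second field.

```python
VIDEO_BIT_RATE = [300, 750, 1200, 1850, 2850, 4300]  # Kbps; assumed values


def bitrate_from_fields(parse, scheme):
    """Return the bitrate for one split log line, depending on who wrote it."""
    if scheme == "sim_dp":
        # dp.py-style log: the 7th field is an index into VIDEO_BIT_RATE.
        return VIDEO_BIT_RATE[int(parse[6])]
    # bb.py-style log: the 2nd field already holds the bitrate value.
    return int(float(parse[1]))


# Made-up lines, for illustration only.
dp_fields = "0 0 0 0 0 0 2".split()
bb_fields = "1.5 750 4.0 0.0 250000 800.0 0.75".split()

print(bitrate_from_fields(dp_fields, "sim_dp"))
print(bitrate_from_fields(bb_fields, "sim_bb"))
```

Pointing the dp.py branch at a log written by bb.py makes it treat the reward field as a bitrate index, which is why leaving SIM_DP unchanged matters.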