hongzimao / pensieve

Neural Adaptive Video Streaming with Pensieve (SIGCOMM '17)
http://web.mit.edu/pensieve/
MIT License
518 stars, 281 forks

Some questions about run run-exp/run_all_traces. #13

Closed tylercross closed 6 years ago

tylercross commented 6 years ago

I can run ./testing/rl_no_training.py and plot the results (although some of the other algorithms have problems). But when I execute ./run_exp/run_all_traces.py as described in the README,

it hangs at this line: run_all_traces.py(67): proc_RL = subprocess.Popen(command_RL, stdout=subprocess.PIPE, shell=True)

It seems that this subprocess runs into an error, so I checked the script it calls, run_traces.py, and ran only the BB algorithm from a new .py file. That run also hangs, at py(29): proc_BB = subprocess.Popen(command_BB, stdout=subprocess.PIPE, shell=True)  # create a new subprocess

tylercross commented 6 years ago

Thank you very much. I'll try it at once. I have already submitted my questions on GitHub.

(Sent as an email reply to Hongzi Mao's message of Friday, November 24, 2017, 11:50 AM: "Hope the solution in this issue can help you: #10")


hongzimao commented 6 years ago

Please try to set up a testing environment the same as ours (e.g., Ubuntu 16.04, Tensorflow v1.1.0, TFLearn v0.3.1 and Selenium v2.39.0) for the purpose of reproducing the results. Specifically, get started with python setup.py in main repo directory. Also, this issue may provide some hints for solving your problem: https://github.com/hongzimao/pensieve/issues/10. Hope it helps.
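One way to confirm that an environment matches the versions listed above is to compare what is installed against the expected versions. The sketch below is only an illustration: the PyPI package names, the `EXPECTED` table, and both helper functions are assumptions introduced here and are not part of the Pensieve repository.

```python
# Versions quoted in the comment above. The package names are assumed to be
# the ones used on PyPI; adjust them if your environment differs.
EXPECTED = {"tensorflow": "1.1.0", "tflearn": "0.3.1", "selenium": "2.39.0"}


def installed_versions(names):
    """Map each package name to its installed version, or None if absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Older interpreters (e.g. Python 2.7): fall back to setuptools.
        import pkg_resources
        PackageNotFoundError = pkg_resources.DistributionNotFound

        def version(name):
            return pkg_resources.get_distribution(name).version

    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(installed, expected=EXPECTED):
    """Return {name: (expected, found)} for every package that differs."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


if __name__ == "__main__":
    mismatches = find_mismatches(installed_versions(EXPECTED))
    for name, (want, got) in sorted(mismatches.items()):
        print("%s: expected %s, found %s" % (name, want, got))
```

Run as a script, it prints one line per package whose installed version differs from the expected one, and nothing if everything matches.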

tylercross commented 6 years ago

Sorry, the problem is still not solved. I reinstalled Ubuntu and the same versions of TensorFlow, TFLearn, and Selenium, and ran setup.py again. My goal is to reproduce your simulation, so I use the trained model you provided directly. I can run test/rl_no_training and plot the RL results with plot_results.py. But when I run run_exp/run_all_traces, the problem is the same as before. My questions are: does run_all_traces need any other steps before it can work? And is running run_exp a prerequisite for plotting the results of all algorithms in the test folder? (In the test part I can only guarantee that RL produces results; the other algorithms all have some minor problems, so I am skipping them for now.)

hongzimao commented 6 years ago

Which line of code exactly threw the error? Or does it get stuck somewhere indefinitely? Please narrow the problem down as specifically as you can. Thanks!
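To tell "throws an error" apart from "gets stuck indefinitely", one option is to run the suspect command under a timeout and capture its stderr. The following is a minimal sketch, not code from the repository: `run_with_timeout` is a hypothetical helper, and it relies on the `timeout` argument of `communicate()`, which exists in Python 3.3+ but not in the Python 2.7 `subprocess` module used in this thread.

```python
import subprocess
import sys


def run_with_timeout(args, timeout_s):
    """Run a command and return (status, stdout, stderr).

    status is the exit code, or the string "timeout" if the command did
    not finish within timeout_s seconds (it is killed in that case).
    """
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        out, err = proc.communicate(timeout=timeout_s)
        return proc.returncode, out, err
    except subprocess.TimeoutExpired:
        proc.kill()
        # Collect whatever the child managed to print before it was killed.
        out, err = proc.communicate()
        return "timeout", out, err


if __name__ == "__main__":
    # Hypothetical stand-in for the real command: a child that exits with code 3.
    status, out, err = run_with_timeout(
        [sys.executable, "-c", "import sys; sys.exit(3)"], 10)
    print(status, err)
```

A non-zero exit code together with the captured stderr points at an error in the child, while a "timeout" status means the child really is hanging.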

tylercross commented 6 years ago

When I run python -m trace --trace run_all_traces.py | tee 4.txt, it gets stuck at run_all_traces.py(64): proc_RL = subprocess.Popen(command_RL, stdout=subprocess.PIPE, shell=True)

run_all_traces.py(64): proc_RL = subprocess.Popen(command_RL, stdout=subprocess.PIPE, shell=True)
--- modulename: subprocess, funcname: __init__
subprocess.py(657): _cleanup()
--- modulename: subprocess, funcname: _cleanup
subprocess.py(459): for inst in _active[:]:
subprocess.py(659): if not isinstance(bufsize, (int, long)):
subprocess.py(662): if mswindows:
subprocess.py(672): if startupinfo is not None:
subprocess.py(675): if creationflags != 0:
subprocess.py(679): self.stdin = None
subprocess.py(680): self.stdout = None
subprocess.py(681): self.stderr = None
subprocess.py(682): self.pid = None
subprocess.py(683): self.returncode = None
subprocess.py(684): self.universal_newlines = universal_newlines
subprocess.py(703): errread, errwrite), to_close = self._get_handles(stdin, stdout, stderr)
--- modulename: subprocess, funcname: _get_handles
subprocess.py(1112): to_close = set()
subprocess.py(1113): p2cread, p2cwrite = None, None
subprocess.py(1114): c2pread, c2pwrite = None, None
subprocess.py(1115): errread, errwrite = None, None
subprocess.py(1117): if stdin is None:
subprocess.py(1118): pass
subprocess.py(1128): if stdout is None:
subprocess.py(1130): elif stdout == PIPE:
subprocess.py(1131): c2pread, c2pwrite = self.pipe_cloexec()
--- modulename: subprocess, funcname: pipe_cloexec
subprocess.py(1179): r, w = os.pipe()
subprocess.py(1180): self._set_cloexec_flag(r)
--- modulename: subprocess, funcname: _set_cloexec_flag
subprocess.py(1161): try:
subprocess.py(1162): cloexec_flag = fcntl.FD_CLOEXEC
subprocess.py(1166): old = fcntl.fcntl(fd, fcntl.F_GETFD)
subprocess.py(1167): if cloexec:
subprocess.py(1168): fcntl.fcntl(fd, fcntl.F_SETFD, old | cloexec_flag)
subprocess.py(1181): self._set_cloexec_flag(w)
--- modulename: subprocess, funcname: _set_cloexec_flag
subprocess.py(1161): try:
subprocess.py(1162): cloexec_flag = fcntl.FD_CLOEXEC
subprocess.py(1166): old = fcntl.fcntl(fd, fcntl.F_GETFD)
subprocess.py(1167): if cloexec:
subprocess.py(1168): fcntl.fcntl(fd, fcntl.F_SETFD, old | cloexec_flag)
subprocess.py(1182): return r, w
subprocess.py(1132): to_close.update((c2pread, c2pwrite))
subprocess.py(1139): if stderr is None:
subprocess.py(1140): pass
subprocess.py(1155): return (p2cread, p2cwrite,
subprocess.py(1156): c2pread, c2pwrite,
subprocess.py(1157): errread, errwrite), to_close
subprocess.py(705): try:
subprocess.py(706): self._execute_child(args, executable, preexec_fn, close_fds,
subprocess.py(707): cwd, env, universal_newlines,
subprocess.py(708): startupinfo, creationflags, shell, to_close,
subprocess.py(709): p2cread, p2cwrite,
subprocess.py(710): c2pread, c2pwrite,
subprocess.py(711): errread, errwrite)
--- modulename: subprocess, funcname: _execute_child
subprocess.py(1207): if isinstance(args, types.StringTypes):
subprocess.py(1208): args = [args]
subprocess.py(1212): if shell:
subprocess.py(1213): args = ["/bin/sh", "-c"] + args
subprocess.py(1214): if executable:
subprocess.py(1217): if executable is None:
subprocess.py(1218): executable = args[0]
subprocess.py(1220): def _close_in_parent(fd):
subprocess.py(1227): errpipe_read, errpipe_write = self.pipe_cloexec()
--- modulename: subprocess, funcname: pipe_cloexec
subprocess.py(1179): r, w = os.pipe()
subprocess.py(1180): self._set_cloexec_flag(r)
--- modulename: subprocess, funcname: _set_cloexec_flag
subprocess.py(1161): try:
subprocess.py(1162): cloexec_flag = fcntl.FD_CLOEXEC
subprocess.py(1166): old = fcntl.fcntl(fd, fcntl.F_GETFD)
subprocess.py(1167): if cloexec:
subprocess.py(1168): fcntl.fcntl(fd, fcntl.F_SETFD, old | cloexec_flag)
subprocess.py(1181): self._set_cloexec_flag(w)
--- modulename: subprocess, funcname: _set_cloexec_flag
subprocess.py(1161): try:
subprocess.py(1162): cloexec_flag = fcntl.FD_CLOEXEC
subprocess.py(1166): old = fcntl.fcntl(fd, fcntl.F_GETFD)
subprocess.py(1167): if cloexec:
subprocess.py(1168): fcntl.fcntl(fd, fcntl.F_SETFD, old | cloexec_flag)
subprocess.py(1182): return r, w
subprocess.py(1228): try:
subprocess.py(1229): try:
subprocess.py(1230): gc_was_enabled = gc.isenabled()
subprocess.py(1233): gc.disable()
subprocess.py(1234): try:
subprocess.py(1235): self.pid = os.fork()
subprocess.py(1240): self._child_created = True
subprocess.py(1241): if self.pid == 0:
subprocess.py(1243): try:
subprocess.py(1245): if p2cwrite is not None:
subprocess.py(1247): if c2pread is not None:
subprocess.py(1248): os.close(c2pread)
subprocess.py(1249): if errread is not None:
subprocess.py(1251): os.close(errpipe_read)
subprocess.py(1256): if c2pwrite == 0:
subprocess.py(1258): if errwrite == 0 or errwrite == 1:
subprocess.py(1262): def _dup2(a, b):
subprocess.py(1270): _dup2(p2cread, 0)
--- modulename: subprocess, funcname: _dup2
subprocess.py(1266): if a == b:
subproexec_flag
subprocess.py(1161): try:
subprocess.py(1162): cloexec_flag = fcntl.FD_CLOEXEC
subprocess.py(1166): old = fcntl.fcntl(fd, fcntl.F_GETFD)
subprocess.py(1167): if cloexec:
subprocess.py(1168): fcntl.fcntl(fd, fcntl.F_SETFD, old | cloexec_flag)
subprocess.py(1182): return r, w
subprocess.py(1132): to_close.update((c2pread, c2pwrite))
subprocess.py(1139): if stderr is None:
subprocess.py(1140): pass
subprocess.py(1155): return (p2cread, p2cwrite,
subprocess.py(1156): c2pread, c2pwrite,
subprocess.py(1157): errread, errwrite), to_close
subprocess.py(705): try:
subprocess.py(706): self._execute_child(args, executable, preexec_fn, close_fds,
subprocess.py(707): cwd, env, universal_newlines,
subprocess.py(708): startupinfo, creationflags, shell, to_close,
subprocess.py(709): p2cread, p2cwrite,
subprocess.py(710): c2pread, c2pwrite,
subprocess.py(711): errread, errwrite)
--- modulename: subprocess, funcname: _execute_child
subprocess.py(1207): if isinstance(args, types.StringTypes):
subprocess.py(1208): args = [args]
subprocess.py(1212): if shell:
subprocess.py(1213): args = ["/bin/sh", "-c"] + args
subprocess.py(1214): if executable:
subprocess.py(1217): if executable is None:
subprocess.py(1218): executable = args[0]
subprocess.py(1220): def _close_in_parent(fd):
subprocess.py(1227): errpipe_read, errpipe_write = self.pipe_cloexec()
--- modulename: subprocess, funcname: pipe_cloexec
subprocess.py(1179): r, w = os.pipe()
subprocess.py(1180): self._set_cloexec_flag(r)
--- modulename: subprocess, funcname: _set_cloexec_flag
subprocess.py(1161): try:
subprocess.py(1162): cloexec_flag = fcntl.FD_CLOEXEC
subprocess.py(1166): old = fcntl.fcntl(fd, fcntl.F_GETFD)
subprocess.py(1167): if cloexec:
subprocess.py(1168): fcntl.fcntl(fd, fcntl.F_SETFD, old | cloexec_flag)
subprocess.py(1181): self._set_cloexec_flag(w)
--- modulename: subprocess, funcname: _set_cloexec_flag
subprocess.py(1161): try:
subprocess.py(1162): cloexec_flag = fcntl.FD_CLOEXEC
subprocess.py(1166): old = fcntl.fcntl(fd, fcntl.F_GETFD)
subprocess.py(1167): if cloexec:
subprocess.py(1168): fcntl.fcntl(fd, fcntl.F_SETFD, old | cloexec_flag)
subprocess.py(1182): return r, w
subprocess.py(1228): try:
subprocess.py(1229): try:
subprocess.py(1230): gc_was_enabled = gc.isenabled()
subprocess.py(1233): gc.disable()
subprocess.py(1234): try:
subprocess.py(1235): self.pid = os.fork()
subprocess.py(1240): self._child_created = True
subprocess.py(1241): if self.pid == 0:
subprocess.py(1312): if gc_was_enabled:
subprocess.py(1313): gc.enable()
subprocess.py(1316): os.close(errpipe_write)
subprocess.py(1319): data = _eintr_retry_call(os.read, errpipe_read, 1048576)
--- modulename: subprocess, funcname: _eintr_retry_call
subprocess.py(474): while True:
subprocess.py(475): try:
subprocess.py(476): return func(*args)
subprocess.py(1320): pickle_bits = []
subprocess.py(1321): while data:
subprocess.py(1324): data = "".join(pickle_bits)
subprocess.py(1326): if p2cread is not None and p2cwrite is not None:
subprocess.py(1328): if c2pwrite is not None and c2pread is not None:

When I run the command without tee, python -m trace --trace run_all_traces.py, it gets stuck at run_all_traces.py(74): proc_RL.wait()

subprocess.py(1329): _close_in_parent(c2pwrite)
--- modulename: subprocess, funcname: _close_in_parent
subprocess.py(1221): os.close(fd)
subprocess.py(1222): to_close.remove(fd)
subprocess.py(1330): if errwrite is not None and errread is not None:
subprocess.py(1334): os.close(errpipe_read)
subprocess.py(1336): if data != "":
subprocess.py(727): if mswindows:
subprocess.py(735): if p2cwrite is not None:
subprocess.py(737): if c2pread is not None:
subprocess.py(738): if universal_newlines:
subprocess.py(741): self.stdout = os.fdopen(c2pread, 'rb', bufsize)
subprocess.py(742): if errread is not None:
run_all_traces.py(65): time.sleep(0.1)
run_all_traces.py(74): proc_RL.wait()
--- modulename: subprocess, funcname: wait
subprocess.py(1390): while self.returncode is None:
subprocess.py(1391): try:
subprocess.py(1392): pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
--- modulename: subprocess, funcname: _eintr_retry_call
subprocess.py(474): while True:
subprocess.py(475): try:
subprocess.py(476): return func(*args)

I pasted the content into 5.txt.

I have sent two files to your email.
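The second trace above ends in proc_RL.wait() on a process that was started with stdout=subprocess.PIPE. The Python documentation warns that wait() can deadlock in that setup if the child writes more output than the OS pipe buffer holds, and recommends communicate() instead. Whether that is what happens in this thread is not established; the sketch below only illustrates the pattern. `CHILD` and `run_and_drain` are hypothetical names, and the child is a stand-in for the real command.

```python
import subprocess
import sys

# Stand-in child that prints far more than a typical OS pipe buffer holds.
CHILD = "import sys; sys.stdout.write('x' * 1000000)"


def run_and_drain(args):
    """Start a child with a piped stdout and read all of its output."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    # communicate() reads stdout until EOF and then waits for the child,
    # so the child can never block on a full pipe.
    out, _ = proc.communicate()
    return proc.returncode, out


if __name__ == "__main__":
    returncode, out = run_and_drain([sys.executable, "-c", CHILD])
    print(returncode, len(out))
```

If the child's output is not needed at all, leaving stdout unpiped avoids the issue in a different way, because nothing accumulates in a pipe.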

tylercross commented 6 years ago

I should add that the chrome retry log contains the errors shown below. Why is my URL invalid? I can now run real_exp and generate data.

chrome retry log

BOLA_test_fcc_trace_5554_http---www.amazon.com_60 Message: u'unknown error: unhandled inspector error: {"code":-32603,"message":"Cannot navigate to invalid URL"}\n (Session info: chrome=54.0.2840.90)\n (Driver info: chromedriver=2.27.440175 (9bc1d90b8bfa4dd181fbbf769a5eb5e575574320),platform=Linux 4.10.0-40-generic x86_64)'

FIXED_test_fcc_trace_5554_http---www.amazon.com_60 Message: u'unknown error: unhandled inspector error: {"code":-32603,"message":"Cannot navigate to invalid URL"}\n (Session info: chrome=54.0.2840.90)\n (Driver info: chromedriver=2.27.440175 (9bc1d90b8bfa4dd181fbbf769a5eb5e575574320),platform=Linux 4.10.0-40-generic x86_64)'

hongzimao commented 6 years ago

Can you find out what URL it is getting?

tylercross commented 6 years ago

The URL is url = 'http://' + ip + '/' + 'myindex_' + abr_algo + '.html' (ip should be an IPv4 address). Regarding the invalid URL: I found that run_exp/run_all_traces.py calls os.system('sudo sysctl -w net.ipv4.ip_forward=1'), which applies to IPv4, while Ubuntu prefers IPv6 by default, so I turned off the default IPv6 settings. Unfortunately, the chrome_retry_log now shows a new message, and the files in the results folder are still all blank, as before.

chrome retry log

FIXED_test_fcc_trace_5554_http---www.amazon.com_60 Message: u'timeout: cannot determine loading status\nfrom timeout: Timed out receiving message from renderer: -0.031\n (Session info: chrome=54.0.2840.90)\n (Driver info: chromedriver=2.27.440175 (9bc1d90b8bfa4dd181fbbf769a5eb5e575574320),platform=Linux 4.10.0-40-generic x86_64)'

RL_test_fcc_trace_5554_http---www.amazon.com_60 Message: u'timeout: cannot determine loading status\nfrom timeout: Timed out receiving message from renderer: -0.032\n (Session info: chrome=54.0.2840.90)\n (Driver info: chromedriver=2.27.440175 (9bc1d90b8bfa4dd181fbbf769a5eb5e575574320),platform=Linux 4.10.0-40-generic x86_64)'

This looks like a timeout problem. According to a Google search, it can be caused by a chromedriver version that is too old, so I upgraded from chromedriver 2.27 + Chrome 54 to chromedriver 2.33 + Chrome 62. Now chrome_retry_log contains only the line "chrome retry log" and nothing else. In the meantime, I ran run_exp.py in real_exp; its chrome_retry_log is shown below:

chrome retry log

RL_0 Message: u'timeout\n (Session info: chrome=62.0.3202.94)\n (Driver info: chromedriver=2.33.506092 (733a02544d189eeb751fe0d7ddca79a0ee28cce4),platform=Linux 4.10.0-40-generic x86_64)'

RL_0 Message: u'timeout\n (Session info: chrome=62.0.3202.94)\n (Driver info: chromedriver=2.33.506092 (733a02544d189eeb751fe0d7ddca79a0ee28cce4),platform=Linux 4.10.0-40-generic x86_64)'

Although the "done" flag for each trace file is not printed, results are still generated normally in the real_exp/results folder, as before. Can you tell me which chromedriver and Chrome versions you are using, or point out where the problem might be? Thank you for your attention.

hongzimao commented 6 years ago

Could you print out exactly what URL it is getting?

The experiment works by connecting across a Mahimahi shell. I really can't see the problem from these massive logs. We have to narrow it down.
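For printing and sanity-checking the URL, a small sketch along these lines might help. It reuses the concatenation quoted earlier in the thread; `build_url` and `describe_url` are hypothetical helpers, the address 10.0.0.1 is a made-up example, and the checks cover only two obvious failure modes (an empty host and stray whitespace), not everything Chrome might reject.

```python
from urllib.parse import urlsplit  # Python 3; Python 2 has this in `urlparse`


def build_url(ip, abr_algo):
    # Same concatenation as quoted in the thread.
    return 'http://' + ip + '/' + 'myindex_' + abr_algo + '.html'


def describe_url(url):
    """Return a list of obvious problems with url (empty if none found)."""
    problems = []
    if any(ch.isspace() for ch in url):
        problems.append("contains whitespace")
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https"):
        problems.append("unexpected scheme")
    if not parts.hostname:
        problems.append("missing host")
    return problems


if __name__ == "__main__":
    url = build_url("10.0.0.1", "RL")  # made-up address for illustration
    # repr() makes hidden characters such as a trailing newline visible.
    print(repr(url), describe_url(url))
```

Printing with repr() is the main point: an ip value read from a command's output often carries a trailing newline that is invisible in a plain print.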

hongzimao commented 6 years ago

We used selenium-2.39.0 (https://github.com/hongzimao/pensieve/blob/master/setup.py#L15) and the stable version of chrome on July 14, 2017 (https://github.com/hongzimao/pensieve/blob/master/setup.py#L26-L28).

ZhangYi19941217 commented 6 years ago

@tylercross I have run into the same problem as you. I can run ./test/rl_no_training.py and plot the results, but running the other algorithms produces some errors. Also, when I execute ./run_exp/run_all_traces.py, I get the same errors as you.

Have you solved the problem?

hongzimao commented 6 years ago

The person on this issue https://github.com/hongzimao/pensieve/issues/23 seems to have recently run our code with no problem and might give you some insights.

SiChen-cuc commented 6 years ago

@tylercross I have run into the same problem as you. Have you solved it? Could you tell me the solution?

hongzimao commented 6 years ago

Not sure if you managed to make this work. You might find this issue useful: https://github.com/hongzimao/pensieve/issues/45

SuperStoneliu commented 6 years ago

I have run into a similar problem. When I run run_all_traces.py, it stays in the following state:

net.ipv4.ip_forward = 1

When I press Ctrl+C to stop it, it prints the following:

net.ipv4.ip_forward = 1
^CTraceback (most recent call last):
File "run_traces.py", line 49, in <module>
main()
File "run_traces.py", line 36, in main
Traceback (most recent call last):
(out, err) = proc.communicate()
File "/usr/lib/python2.7/subprocess.py", line 800, in communicate
File "run_traces.py", line 49, in <module>
Traceback (most recent call last):
File "run_all_traces.py", line 77, in <module>
main()
File "run_traces.py", line 36, in main
proc_BB.wait()
(out, err) = proc.communicate()
File "/usr/lib/python2.7/subprocess.py", line 1392, in wait
File "/usr/lib/python2.7/subprocess.py", line 800, in communicate
return self._communicate(input)
File "/usr/lib/python2.7/subprocess.py", line 1417, in _communicate
Traceback (most recent call last):
File "run_traces.py", line 49, in <module>
return self._communicate(input)
main()
File "/usr/lib/python2.7/subprocess.py", line 1417, in _communicate
File "run_traces.py", line 34, in main
stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
File "/usr/lib/python2.7/subprocess.py", line 476, in _eintr_retry_call
stdout, stderr = self._communicate_with_poll(input)
Traceback (most recent call last):
File "/usr/lib/python2.7/subprocess.py", line 1471, in _communicate_with_poll
File "run_traces.py", line 49, in <module>
errread, errwrite)
return func(*args)
File "/usr/lib/python2.7/subprocess.py", line 1319, in _execute_child
KeyboardInterrupt
Traceback (most recent call last):
stdout, stderr = self._communicate_with_poll(input)
File "/usr/lib/python2.7/subprocess.py", line 1471, in _communicate_with_poll
File "run_traces.py", line 49, in <module>
main()
File "run_traces.py", line 36, in main
(out, err) = proc.communicate()
File "/usr/lib/python2.7/subprocess.py", line 800, in communicate
ready = poller.poll()
data = _eintr_retry_call(os.read, errpipe_read, 1048576)
KeyboardInterrupt
File "/usr/lib/python2.7/subprocess.py", line 476, in _eintr_retry_call
return self._communicate(input)
File "/usr/lib/python2.7/subprocess.py", line 1417, in _communicate
return func(*args)
ready = poller.poll()
KeyboardInterrupt
KeyboardInterrupt
main()
File "run_traces.py", line 34, in main
stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
Traceback (most recent call last):
File "run_traces.py", line 49, in <module>
main()
File "run_traces.py", line 36, in main
(out, err) = proc.communicate()
stdout, stderr = self._communicate_with_poll(input)
File "/usr/lib/python2.7/subprocess.py", line 800, in communicate
File "/usr/lib/python2.7/subprocess.py", line 1471, in _communicate_with_poll
return self._communicate(input)
File "/usr/lib/python2.7/subprocess.py", line 1417, in _communicate
ready = poller.poll()
KeyboardInterrupt
stdout, stderr = self._communicate_with_poll(input)
File "/usr/lib/python2.7/subprocess.py", line 1471, in _communicate_with_poll
ready = poller.poll()
KeyboardInterrupt
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1319, in _execute_child
Traceback (most recent call last):
File "run_traces.py", line 49, in <module>
main()
File "run_traces.py", line 36, in main
(out, err) = proc.communicate()
File "/usr/lib/python2.7/subprocess.py", line 800, in communicate
return self._communicate(input)
File "/usr/lib/python2.7/subprocess.py", line 1417, in _communicate
stdout, stderr = self._communicate_with_poll(input)
File "/usr/lib/python2.7/subprocess.py", line 1471, in _communicate_with_poll
data = _eintr_retry_call(os.read, errpipe_read, 1048576)
File "/usr/lib/python2.7/subprocess.py", line 476, in _eintr_retry_call
return func(*args)
KeyboardInterrupt
ready = poller.poll()
Traceback (most recent call last):
KeyboardInterrupt
File "run_traces.py", line 49, in <module>
main()
File "run_traces.py", line 34, in main
stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1319, in _execute_child
data = _eintr_retry_call(os.read, errpipe_read, 1048576)
File "/usr/lib/python2.7/subprocess.py", line 476, in _eintr_retry_call
return func(*args)
KeyboardInterrupt

I have been trying to solve this problem for a long time without success. Can you help me? @hongzimao @tylercross @ZhangYi19941217 @SiChen-cuc @ravinet

amrebaid commented 5 years ago

Have you folks figured this out? I am having the same problem and cannot run the experiments. Please, I need help urgently.

hongzimao commented 5 years ago

Can you make sure you followed the instructions and installed the right Selenium version? A few people have successfully reproduced our results recently. The most recent one (a few days ago) posted his experience in #60 and #63. You might find it useful.