Closed tylercross closed 6 years ago
Thank you very much. I'll try it at once. And I've already submitted my questions on GitHub.
------------------ Original ------------------
From: Hongzi Mao
Date: Fri, Nov 24, 2017, 11:50 AM
Subject: Re: [hongzimao/pensieve] Some questions about runrun-exp/run_all_traces. (#13)
Hope the solution in this issue can help you: #10
Please try to set up a testing environment the same as ours (e.g., Ubuntu 16.04, TensorFlow v1.1.0, TFLearn v0.3.1 and Selenium v2.39.0) in order to reproduce the results. Specifically, get started with `python setup.py` in the main repo directory. Also, this issue may provide some hints for solving your problem: https://github.com/hongzimao/pensieve/issues/10. Hope it helps.
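A quick way to compare an installed environment against the versions listed above is sketched below. It is not part of the pensieve repo: the helper name `check_versions` is hypothetical, and the package names are assumed to match the names used on PyPI.

```python
try:
    # Python 3.8+
    from importlib.metadata import version, PackageNotFoundError
except ImportError:
    # Python 2.7 fallback (requires setuptools)
    from pkg_resources import get_distribution
    from pkg_resources import DistributionNotFound as PackageNotFoundError

    def version(name):
        return get_distribution(name).version

# Versions quoted in the comment above; the distribution names are an assumption.
EXPECTED = {"tensorflow": "1.1.0", "tflearn": "0.3.1", "selenium": "2.39.0"}


def check_versions(expected):
    """Return {name: (installed_or_None, wanted, matches)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # the package is not installed at all
        report[name] = (installed, wanted, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, wanted, ok) in sorted(check_versions(EXPECTED).items()):
        status = "OK" if ok else "MISMATCH"
        print("{}: installed={} expected={} [{}]".format(name, installed, wanted, status))
```

Any line marked MISMATCH points at a package whose version differs from the listed environment, which is worth fixing before debugging anything else.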
Sorry, the problem is still not solved. I reinstalled Ubuntu and the same versions of TensorFlow, TFLearn, and Selenium, and ran setup.py again. My goal is to reproduce your simulation, so I use the trained model you provided directly. I can execute test/rl_no_training.py and plot a series of figures for rl with plot_results.py. But when I execute run_exp/run_all_traces.py, the problem is still the same. My questions are: Are any other steps required before running run_all_traces.py? And is running run_exp a prerequisite for plotting the results of all the algorithms in the test folder? (In the test part I can only get rl to produce results; the other algorithms all have minor problems of one kind or another, so I am skipping them for now.)
Which line of code exactly threw the error? Or does it get stuck somewhere indefinitely? Please narrow the problem down as specifically as you can. Thanks!
When I run `python -m trace --trace run_all_traces.py | tee 4.txt`, it gets stuck at
run_all_traces.py(64): proc_RL = subprocess.Popen(command_RL, stdout=subprocess.PIPE, shell=True)
run_all_traces.py(64): proc_RL = subprocess.Popen(command_RL, stdout=subprocess.PIPE, shell=True) --- modulename: subprocess, funcname: __init__ subprocess.py(657): _cleanup() --- modulename: subprocess, funcname: _cleanup subprocess.py(459): for inst in _active[:]: subprocess.py(659): if not isinstance(bufsize, (int, long)): subprocess.py(662): if mswindows: subprocess.py(672): if startupinfo is not None: subprocess.py(675): if creationflags != 0: subprocess.py(679): self.stdin = None subprocess.py(680): self.stdout = None subprocess.py(681): self.stderr = None subprocess.py(682): self.pid = None subprocess.py(683): self.returncode = None subprocess.py(684): self.universal_newlines = universal_newlines subprocess.py(703): errread, errwrite), to_close = self._get_handles(stdin, stdout, stderr) --- modulename: subprocess, funcname: _get_handles subprocess.py(1112): to_close = set() subprocess.py(1113): p2cread, p2cwrite = None, None subprocess.py(1114): c2pread, c2pwrite = None, None subprocess.py(1115): errread, errwrite = None, None subprocess.py(1117): if stdin is None: subprocess.py(1118): pass subprocess.py(1128): if stdout is None: subprocess.py(1130): elif stdout == PIPE: subprocess.py(1131): c2pread, c2pwrite = self.pipe_cloexec() --- modulename: subprocess, funcname: pipe_cloexec subprocess.py(1179): r, w = os.pipe() subprocess.py(1180): self._set_cloexec_flag(r) --- modulename: subprocess, funcname: _set_cloexec_flag subprocess.py(1161): try: subprocess.py(1162): cloexec_flag = fcntl.FD_CLOEXEC subprocess.py(1166): old = fcntl.fcntl(fd, fcntl.F_GETFD) subprocess.py(1167): if cloexec: subprocess.py(1168): fcntl.fcntl(fd, fcntl.F_SETFD, old | cloexec_flag) subprocess.py(1181): self._set_cloexec_flag(w) --- modulename: subprocess, funcname: _set_cloexec_flag subprocess.py(1161): try: subprocess.py(1162): cloexec_flag = fcntl.FD_CLOEXEC subprocess.py(1166): old = fcntl.fcntl(fd, fcntl.F_GETFD) subprocess.py(1167): if cloexec: subprocess.py(1168): 
fcntl.fcntl(fd, fcntl.F_SETFD, old | cloexec_flag) subprocess.py(1182): return r, w subprocess.py(1132): to_close.update((c2pread, c2pwrite)) subprocess.py(1139): if stderr is None: subprocess.py(1140): pass subprocess.py(1155): return (p2cread, p2cwrite, subprocess.py(1156): c2pread, c2pwrite, subprocess.py(1157): errread, errwrite), to_close subprocess.py(705): try: subprocess.py(706): self._execute_child(args, executable, preexec_fn, close_fds, subprocess.py(707): cwd, env, universal_newlines, subprocess.py(708): startupinfo, creationflags, shell, to_close, subprocess.py(709): p2cread, p2cwrite, subprocess.py(710): c2pread, c2pwrite, subprocess.py(711): errread, errwrite) --- modulename: subprocess, funcname: _execute_child subprocess.py(1207): if isinstance(args, types.StringTypes): subprocess.py(1208): args = [args] subprocess.py(1212): if shell: subprocess.py(1213): args = ["/bin/sh", "-c"] + args subprocess.py(1214): if executable: subprocess.py(1217): if executable is None: subprocess.py(1218): executable = args[0] subprocess.py(1220): def _close_in_parent(fd): subprocess.py(1227): errpipe_read, errpipe_write = self.pipe_cloexec() --- modulename: subprocess, funcname: pipe_cloexec subprocess.py(1179): r, w = os.pipe() subprocess.py(1180): self._set_cloexec_flag(r) --- modulename: subprocess, funcname: _set_cloexec_flag subprocess.py(1161): try: subprocess.py(1162): cloexec_flag = fcntl.FD_CLOEXEC subprocess.py(1166): old = fcntl.fcntl(fd, fcntl.F_GETFD) subprocess.py(1167): if cloexec: subprocess.py(1168): fcntl.fcntl(fd, fcntl.F_SETFD, old | cloexec_flag) subprocess.py(1181): self._set_cloexec_flag(w) --- modulename: subprocess, funcname: _set_cloexec_flag subprocess.py(1161): try: subprocess.py(1162): cloexec_flag = fcntl.FD_CLOEXEC subprocess.py(1166): old = fcntl.fcntl(fd, fcntl.F_GETFD) subprocess.py(1167): if cloexec: subprocess.py(1168): fcntl.fcntl(fd, fcntl.F_SETFD, old | cloexec_flag) subprocess.py(1182): return r, w subprocess.py(1228): try: 
subprocess.py(1229): try: subprocess.py(1230): gc_was_enabled = gc.isenabled() subprocess.py(1233): gc.disable() subprocess.py(1234): try: subprocess.py(1235): self.pid = os.fork() subprocess.py(1240): self._child_created = True subprocess.py(1241): if self.pid == 0: subprocess.py(1243): try: subprocess.py(1245): if p2cwrite is not None: subprocess.py(1247): if c2pread is not None: subprocess.py(1248): os.close(c2pread) subprocess.py(1249): if errread is not None: subprocess.py(1251): os.close(errpipe_read) subprocess.py(1256): if c2pwrite == 0: subprocess.py(1258): if errwrite == 0 or errwrite == 1: subprocess.py(1262): def _dup2(a, b): subprocess.py(1270): _dup2(p2cread, 0) --- modulename: subprocess, funcname: _dup2 subprocess.py(1266): if a == b: subproexec_flag subprocess.py(1161): try: subprocess.py(1162): cloexec_flag = fcntl.FD_CLOEXEC subprocess.py(1166): old = fcntl.fcntl(fd, fcntl.F_GETFD) subprocess.py(1167): if cloexec: subprocess.py(1168): fcntl.fcntl(fd, fcntl.F_SETFD, old | cloexec_flag) subprocess.py(1182): return r, w subprocess.py(1132): to_close.update((c2pread, c2pwrite)) subprocess.py(1139): if stderr is None: subprocess.py(1140): pass subprocess.py(1155): return (p2cread, p2cwrite, subprocess.py(1156): c2pread, c2pwrite, subprocess.py(1157): errread, errwrite), to_close subprocess.py(705): try: subprocess.py(706): self._execute_child(args, executable, preexec_fn, close_fds, subprocess.py(707): cwd, env, universal_newlines, subprocess.py(708): startupinfo, creationflags, shell, to_close, subprocess.py(709): p2cread, p2cwrite, subprocess.py(710): c2pread, c2pwrite, subprocess.py(711): errread, errwrite) --- modulename: subprocess, funcname: _execute_child subprocess.py(1207): if isinstance(args, types.StringTypes): subprocess.py(1208): args = [args] subprocess.py(1212): if shell: subprocess.py(1213): args = ["/bin/sh", "-c"] + args subprocess.py(1214): if executable: subprocess.py(1217): if executable is None: subprocess.py(1218): executable = 
args[0] subprocess.py(1220): def _close_in_parent(fd): subprocess.py(1227): errpipe_read, errpipe_write = self.pipe_cloexec() --- modulename: subprocess, funcname: pipe_cloexec subprocess.py(1179): r, w = os.pipe() subprocess.py(1180): self._set_cloexec_flag(r) --- modulename: subprocess, funcname: _set_cloexec_flag subprocess.py(1161): try: subprocess.py(1162): cloexec_flag = fcntl.FD_CLOEXEC subprocess.py(1166): old = fcntl.fcntl(fd, fcntl.F_GETFD) subprocess.py(1167): if cloexec: subprocess.py(1168): fcntl.fcntl(fd, fcntl.F_SETFD, old | cloexec_flag) subprocess.py(1181): self._set_cloexec_flag(w) --- modulename: subprocess, funcname: _set_cloexec_flag subprocess.py(1161): try: subprocess.py(1162): cloexec_flag = fcntl.FD_CLOEXEC subprocess.py(1166): old = fcntl.fcntl(fd, fcntl.F_GETFD) subprocess.py(1167): if cloexec: subprocess.py(1168): fcntl.fcntl(fd, fcntl.F_SETFD, old | cloexec_flag) subprocess.py(1182): return r, w subprocess.py(1228): try: subprocess.py(1229): try: subprocess.py(1230): gc_was_enabled = gc.isenabled() subprocess.py(1233): gc.disable() subprocess.py(1234): try: subprocess.py(1235): self.pid = os.fork() subprocess.py(1240): self._child_created = True subprocess.py(1241): if self.pid == 0: subprocess.py(1312): if gc_was_enabled: subprocess.py(1313): gc.enable() subprocess.py(1316): os.close(errpipe_write) subprocess.py(1319): data = _eintr_retry_call(os.read, errpipe_read, 1048576) --- modulename: subprocess, funcname: _eintr_retry_call subprocess.py(474): while True: subprocess.py(475): try: subprocess.py(476): return func(*args) subprocess.py(1320): pickle_bits = [] subprocess.py(1321): while data: subprocess.py(1324): data = "".join(pickle_bits) subprocess.py(1326): if p2cread is not None and p2cwrite is not None: subprocess.py(1328): if c2pwrite is not None and c2pread is not None:
When I run the command without tee, `python -m trace --trace run_all_traces.py`, it gets stuck at
run_all_traces.py(74): proc_RL.wait()
subprocess.py(1329): _close_in_parent(c2pwrite) --- modulename: subprocess, funcname: _close_in_parent subprocess.py(1221): os.close(fd) subprocess.py(1222): to_close.remove(fd) subprocess.py(1330): if errwrite is not None and errread is not None: subprocess.py(1334): os.close(errpipe_read) subprocess.py(1336): if data != "": subprocess.py(727): if mswindows: subprocess.py(735): if p2cwrite is not None: subprocess.py(737): if c2pread is not None: subprocess.py(738): if universal_newlines: subprocess.py(741): self.stdout = os.fdopen(c2pread, 'rb', bufsize) subprocess.py(742): if errread is not None: run_all_traces.py(65): time.sleep(0.1) run_all_traces.py(74): proc_RL.wait() --- modulename: subprocess, funcname: wait subprocess.py(1390): while self.returncode is None: subprocess.py(1391): try: subprocess.py(1392): pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0) --- modulename: subprocess, funcname: _eintr_retry_call subprocess.py(474): while True: subprocess.py(475): try: subprocess.py(476): return func(*args)
I pasted the content into 5.txt.
I have sent two files to your email.
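The second trace stops inside `proc_RL.wait()` on a process started with `stdout=subprocess.PIPE`. The Python documentation warns that `wait()` can deadlock in this setup if the child writes more output than the OS pipe buffer holds; whether that is what happens here is not established in this thread. The sketch below shows one way to rule it out and to avoid waiting forever. It is written for Python 3 (the `timeout` argument does not exist in the Python 2.7 shown in the trace), the helper name `run_with_timeout` is hypothetical, and it takes an argument list rather than the `shell=True` string used in run_all_traces.py.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run argv, capture stdout and stderr, and kill the child on timeout.

    Returns (returncode, stdout, stderr); returncode is None after a timeout.
    """
    proc = subprocess.Popen(argv,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            universal_newlines=True)
    try:
        # communicate() keeps draining both pipes, unlike a bare wait().
        out, err = proc.communicate(timeout=timeout_s)
        return proc.returncode, out, err
    except subprocess.TimeoutExpired:
        proc.kill()
        # Collect whatever the child wrote before it was killed.
        out, err = proc.communicate()
        return None, out, err


if __name__ == "__main__":
    # Stand-in child process; replace with the real command as a list.
    code, out, err = run_with_timeout(
        [sys.executable, "-c", "print('hello from child')"], 10)
    print(code, repr(out), repr(err))
```

If the call returns `None`, the captured stdout and stderr show how far the child got before it stalled.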
I should add that the chrome retry log shows the errors below. Why is my URL invalid? I can now run real_exp and generate data.
chrome retry log
BOLA_test_fcc_trace_5554_http---www.amazon.com_60 Message: u'unknown error: unhandled inspector error: {"code":-32603,"message":"Cannot navigate to invalid URL"}\n (Session info: chrome=54.0.2840.90)\n (Driver info: chromedriver=2.27.440175 (9bc1d90b8bfa4dd181fbbf769a5eb5e575574320),platform=Linux 4.10.0-40-generic x86_64)'
FIXED_test_fcc_trace_5554_http---www.amazon.com_60 Message: u'unknown error: unhandled inspector error: {"code":-32603,"message":"Cannot navigate to invalid URL"}\n (Session info: chrome=54.0.2840.90)\n (Driver info: chromedriver=2.27.440175 (9bc1d90b8bfa4dd181fbbf769a5eb5e575574320),platform=Linux 4.10.0-40-generic x86_64)'
Can you find out what URL it is getting?
The URL is url = 'http://' + ip + '/' + 'myindex_' + abr_algo + '.html' (ip might be an IPv4 address). Regarding the invalid URL, I found that run_exp/run_all_traces.py calls os.system('sudo sysctl -w net.ipv4.ip_forward=1'), which applies to IPv4, while Ubuntu prefers IPv6 by default, so I turned off the default IPv6 settings. Unfortunately, there is now a new message in chrome_retry_log, and the files in the results folder are all blank, as before.
`chrome retry log
FIXED_test_fcc_trace_5554_http---www.amazon.com_60 Message: u'timeout: can not determine loading status\nfrom timeout: Timed out receiving message from renderer: -0.031\n (Session info: chrome=54.0.2840.90)\n (Driver info: chromedriver=2.27.440175 (9bc1d90b8bfa4dd181fbbf769a5eb5e575574320) platform=Linux 4.10.0-40-generic x86_64)
RL_test_fcc_trace_5554_http---www.amazon.com_60 Message: u'timeout: can not determine loading status\nfrom timeout: Timed out receiving message from renderer: -0.032\n (Session info: chrome=54.0.2840.90)\n (Driver info: chromedriver=2.27.440175 (9bc1d90b8bfa4dd181fbbf769a5eb5e575574320) platform=Linux 4.10.0-40-generic x86_64)`
This looks like a timeout problem. A Google search suggested that the chromedriver version was too old, so I upgraded from chromedriver 2.27 + Chrome 54 to chromedriver 2.33 + Chrome 62. Now chrome_retry_log contains only the header line "chrome retry log" and nothing else. In the meantime, I ran run_exp.py in real_exp; its chrome_retry_log is shown below:
`chrome retry log
RL_0 Message: u'timeout\n (Session info: chrome=62.0.3202.94)\n (Driver info: chromedriver=2.33.506092 (733a02544d189eeb751fe0d7ddca79a0ee28cce4), platform=Linux 4.10.0-40-generic x86_64)'
RL_0 Message: u'timeout\n (Session info: chrome=62.0.3202.94)\n (Driver info: chromedriver=2.33.506092 (733a02544d189eeb751fe0d7ddca79a0ee28cce4), platform=Linux 4.10.0-40-generic x86_64)'`
Although the "done" flag is not printed for each trace file, results are still generated normally in the real_exp/results folder, as before. Can you tell me which chromedriver and Chrome versions you are using, or point out where the problem might be? Thank you for your attention.
Could you print out exactly what URL it is getting?
The way this experiment works is that it has to connect across the Mahimahi shell. I really can't see the problem from these massive logs. We have to narrow down the problem.
We used selenium-2.39.0 (https://github.com/hongzimao/pensieve/blob/master/setup.py#L15) and the stable version of Chrome as of July 14, 2017 (https://github.com/hongzimao/pensieve/blob/master/setup.py#L26-L28).
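To answer the question about the exact URL, one option is to print its `repr` and run a rough sanity check before handing it to the browser. The sketch below reuses the concatenation quoted earlier in this thread. The helper names `build_url` and `looks_navigable` are hypothetical, the sample addresses are placeholders, and the two failure modes it checks (an empty `ip`, or stray whitespace such as a trailing newline) are only guesses at what might make Chrome report an invalid URL.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2.7


def build_url(ip, abr_algo):
    # Same concatenation as quoted in the thread.
    return 'http://' + ip + '/' + 'myindex_' + abr_algo + '.html'


def looks_navigable(url):
    """Rough check: http(s) scheme, non-empty host, no whitespace anywhere."""
    if any(ch.isspace() for ch in url):
        return False
    parts = urlparse(url)
    return parts.scheme in ('http', 'https') and bool(parts.hostname)


if __name__ == "__main__":
    # Placeholder values: a normal address, an empty one, one with a newline.
    for ip in ['192.0.2.1', '', '192.0.2.1\n']:
        url = build_url(ip, 'RL')
        # repr() makes hidden characters such as '\n' visible in the log.
        print(repr(url), looks_navigable(url))
```

Printing with `repr` matters because a trailing newline picked up from command output is invisible in a plain `print`.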
@tylercross I have run into the same problem as you. I can run ./test/rl_no_training.py and plot figures, but there are some errors when running the other algorithms. Besides, when I execute ./run_exp/run_all_traces.py, I get the same errors as you.
Have you solved the problem?
The person on this issue https://github.com/hongzimao/pensieve/issues/23 seems to have recently run our code with no problem and might be able to give you some insight.
@tylercross I have run into the same problem as you. Have you solved it? Could you tell me the solution?
Not sure whether you managed to make this work. You might find this issue useful: https://github.com/hongzimao/pensieve/issues/45
I ran into a similar problem when running run_all_traces.py. It stays stuck in the following state:
net.ipv4.ip_forward = 1
and when I press Ctrl+C to stop it, it shows the following:
*
net.ipv4.ip_forward = 1
^CTraceback (most recent call last):
File "run_traces.py", line 49, in <module>
return self._communicate(input)
File "/usr/lib/python2.7/subprocess.py", line 1417, in _communicate return func(*args) ready = poller.poll() KeyboardInterruptKeyboardInterrupt
main()
File "run_traces.py", line 34, in main
stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
Traceback (most recent call last):
File "run_traces.py", line 49, in <module>
Have you folks figured this out? I am having the same problem and cannot run the experiments. Please, I need help urgently.
Can you make sure you followed the instructions and installed the right Selenium version? A few people have successfully reproduced our results recently. The most recent one (a few days ago) posted his experience in #60 and #63. You might find it useful.
Of course. I can run ./testing/rl_no_training.py and plot a lot of figures (the other algorithms also have some problems), but when executing ./run_exp/run_all_traces.py according to the README,
it stays at this line: run_all_traces.py(67): proc_RL = subprocess.Popen(command_RL, stdout=subprocess.PIPE, shell=True)
It seems that this subprocess has some error, so I checked run_traces.py, which it calls, and ran just the BB algorithm from a new .py file. The problem is that it stays at py(29): proc_BB = subprocess.Popen(command_BB, stdout=subprocess.PIPE, shell=True)  # create a new subprocess
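When a child process appears to hang at `subprocess.Popen(..., stdout=subprocess.PIPE)`, echoing its output line by line as it arrives can show the last thing it did before stalling. The sketch below is a generic illustration, not code from the repo: the helper name `stream_command` is hypothetical, it merges stderr into stdout so error messages are not lost, and it takes an argument list instead of a `shell=True` string.

```python
import subprocess
import sys


def stream_command(argv):
    """Run argv, echoing each output line as it arrives.

    Returns (returncode, lines), where lines holds everything the child wrote.
    """
    proc = subprocess.Popen(argv,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,  # merge stderr into stdout
                            universal_newlines=True)
    lines = []
    # readline() returns '' only at end of file, which ends the loop.
    for line in iter(proc.stdout.readline, ''):
        line = line.rstrip('\n')
        print(line)
        sys.stdout.flush()  # show progress immediately, even when piped to tee
        lines.append(line)
    proc.stdout.close()
    return proc.wait(), lines


if __name__ == "__main__":
    # Stand-in child process; replace with the real command as a list.
    rc, lines = stream_command([sys.executable, "-c", "print('step 1'); print('step 2')"])
    print("exit code:", rc)
```

The last line printed before the output stops narrows down where the child is blocked, which is the kind of detail asked for earlier in this thread.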