catchpoint / WebPageTest.agent

Cross-platform WebPageTest agent

Running concurrent instances of wptagent #516

Closed ShurayukiHime closed 2 years ago

ShurayukiHime commented 2 years ago

Hi, I am running a local instance of the WPT agent on an Ubuntu server. I tried to run parallel instances of the agent to test several sites concurrently, but the agent does not seem to work properly. The agent starts, but I get intermittent errors, e.g. when it tries to bind to addresses like 127.0.0.1:8888. Another example is the following:

Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/clockwork/wptagent/internal/health_check_server.py", line 102, in run
    application.listen(self.server_port, '0.0.0.0')
  File "/home/clockwork/wptagent/lib/python3.6/site-packages/tornado/web.py", line 2109, in listen
    server.listen(port, address)
  File "/home/clockwork/wptagent/lib/python3.6/site-packages/tornado/tcpserver.py", line 151, in listen
    sockets = bind_sockets(port, address=address)
  File "/home/clockwork/wptagent/lib/python3.6/site-packages/tornado/netutil.py", line 161, in bind_sockets
    sock.bind(sockaddr)
OSError: [Errno 98] Address already in use

It also seems that there are issues when saving the result files:

17:15:42.188 - Unhandled exception in test run: [Errno 2] No such file or directory: '/home/clockwork/wptagent/work/cloaking-10.0.3.3/YEXWAO5VDBQTWAD5U64Y.1.0/1_progress.csv.gz'
Traceback (most recent call last):
  File "/home/clockwork/wptagent/wptagent.py", line 306, in run_single_test
    self.browser.run_task(self.task)
  File "/home/clockwork/wptagent/internal/chrome_desktop.py", line 205, in run_task
    DevtoolsBrowser.run_task(self, task)
  File "/home/clockwork/wptagent/internal/devtools_browser.py", line 277, in run_task
    self.on_stop_recording(task)
  File "/home/clockwork/wptagent/internal/chrome_desktop.py", line 271, in on_stop_recording
    DesktopBrowser.on_stop_recording(self, task)
  File "/home/clockwork/wptagent/internal/desktop_browser.py", line 649, in on_stop_recording
    gzfile = gzip.open(file_path, GZIP_TEXT, 7)
  File "/usr/lib/python3.6/gzip.py", line 53, in open
    binary_file = GzipFile(filename, gz_mode, compresslevel)
  File "/usr/lib/python3.6/gzip.py", line 163, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/home/clockwork/wptagent/work/cloaking-10.0.3.3/YEXWAO5VDBQTWAD5U64Y.1.0/1_progress.csv.gz'

How I invoke the agent: python3 wptagent.py -vvvv --xvfb --shaper none --testurl {testurl} --browser Chrome --testout json --testsoutdir "{testoutdir}" --server {serverurl} --testspec {testspec}

Can this be fixed? My goal is to have around 10 or 20 parallel instances. Thank you.

sammeboy635 commented 2 years ago

For the first problem, it looks like a previous wptagent run didn't exit cleanly, so there is probably another instance still running in the background. Run "ps aux | grep python3" and, if you see any instances of wptagent, run "sudo kill -9 PID" on them.

For the second problem, it seems that --testsoutdir is not implemented anywhere in the code, so if you are just looking for pagedata.json, here is a quick change you can make to get it.

image

For the third problem I need a bit more information, but it looks like a directory somewhere is not being created. If it's still a problem, run the following and send me outcmd.txt; maybe something else is going wrong further up: python3 wptagent.py -vvvv (OTHER ARGS) &> outcmd.txt
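As a rough illustration of the "directory not being created" theory, one way to guard against the FileNotFoundError in the traceback is to create the parent directory before opening the gzip file. This is only a sketch and not wptagent's actual code: open_gzip_for_write is a hypothetical helper, and the "wt" mode and the example path are assumptions.

```python
import gzip
import os
import tempfile


def open_gzip_for_write(file_path, compresslevel=7):
    """Open a gzip text file for writing, creating missing parent directories first.

    Hypothetical helper for illustration; not part of wptagent.
    """
    parent = os.path.dirname(file_path)
    if parent:
        # exist_ok=True makes this safe if the directory already exists.
        os.makedirs(parent, exist_ok=True)
    return gzip.open(file_path, "wt", compresslevel)


if __name__ == "__main__":
    # Example path under a temporary directory (an assumption, not a real job path).
    work_dir = tempfile.mkdtemp()
    path = os.path.join(work_dir, "example-job.1.0", "1_progress.csv.gz")
    with open_gzip_for_write(path) as gz:
        gz.write("example,row\n")
    print(os.path.exists(path))
```

Note that this would only hide the symptom. If two agents share the same work directory and one cleans it up while the other is still writing, the results can still be lost or mixed.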

ShurayukiHime commented 2 years ago

Thanks for your suggestions. My point was exactly to have multiple instances running at the same time, e.g., spawning them as subprocesses. I fixed testsoutdir and killed the other background instances, but the first problem kept appearing. Could the cause be that the agent exchanges messages on a fixed port on localhost (maybe even different ports depending on the service), so that concurrent instances of the agent cannot run because they all try to bind to the same (address, port)?
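The suspected port conflict can be reproduced outside the agent. The sketch below is not wptagent code: try_bind is a hypothetical helper that binds a plain TCP socket, and the second call targets the same (address, port) while the first socket is still listening, which is the situation that produces "[Errno 98] Address already in use" on Linux.

```python
import errno
import socket


def try_bind(port, address="127.0.0.1"):
    """Return a listening socket bound to (address, port), or None if it is taken.

    Hypothetical helper for illustration; not part of wptagent.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((address, port))
        sock.listen(1)
        return sock
    except OSError as err:
        sock.close()
        if err.errno == errno.EADDRINUSE:
            return None
        raise


if __name__ == "__main__":
    first = try_bind(0)  # port 0 lets the OS pick a free port
    port = first.getsockname()[1]
    second = try_bind(port)  # same (address, port) while the first is still open
    print("second bind succeeded:", second is not None)
    first.close()
```

Any two processes that hard-code the same listening port hit the same conflict, regardless of which one starts first.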

sammeboy635 commented 2 years ago

I skimmed over the part about running in parallel on the same server. Yes, you are right that there are some hard-coded addresses and ports. Personally, I have not seen a use case for running this code in parallel on the same machine, since running multiple instances of wptagent would likely change the results for each website tested.

pmeenan commented 2 years ago

Running multiple instances of the agent in the same VM/OS isn't supported. Traffic shaping is applied globally, and other assumptions are baked in (like the back-channel server). There would also be CPU contention between the agents.

The only supported way to run multiple instances on the same hardware is within VMs.

Docker containers sort of work as a lighter way to run in parallel, but without traffic-shaping support.