Closed acreskeyMoz closed 5 years ago
Here are browsertime logs for a session in which it looks like tsproxy got into a bad state (it was logging Tick Time: 0.001
continuously). I restarted tsproxy after iteration 5, and you can see the performance metrics return to normal.
browsertime_logs_restart_tsproxy_after_iteration_5.txt
I've found a simpler method to reproduce this issue (the method doesn't require Browsertime).
1. Start tsproxy on your host machine, e.g. python tsproxy.py -b {host_ip} --rtt=200 -vvvv
2. Configure Chrome for Android to point to tsproxy on your host machine, e.g. --proxy-server=socks://{host_ip}:1080
3. Run the following bash script to repeatedly clear and launch Chrome, loading the provided website:
i=0
while [ $i -le 40 ]
do
  adb shell am force-stop com.android.chrome
  adb shell pm clear com.android.chrome
  adb shell am start -n com.android.chrome/com.google.android.apps.chrome.Main -a android.intent.action.VIEW -d "https://www.bbc.com/news/business-47245877"
  echo Number: $i
  ((i++))
  sleep 10s
done
4. Notice that initially the page will have time to load.
After a number of iterations, `tsproxy` will begin to output something like:
11:51:06.587 - Tick Time: 0.001
11:51:06.589 - Tick Time: 0.001
11:51:06.590 - Tick Time: 0.001
and at this point Chrome will no longer load the page within the 10 seconds given.
If the bash script is cancelled at that point, Chrome will generally hang mid-load.
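To spot the bad state without watching the console, something like the following could scan a saved verbose log for a long unbroken run of very short ticks. This is only a rough sketch: the line format is taken from the excerpt above, while the function name `find_tick_storm` and both thresholds (`max_tick`, `min_run`) are made up for illustration and would need tuning, since a healthy tsproxy may also log short ticks while traffic is flowing.

```python
import re

# Matches entries such as "11:51:06.587 - Tick Time: 0.001"
# (format copied from the log excerpt in this issue).
TICK_RE = re.compile(r"(\d{2}:\d{2}:\d{2}\.\d{3}) - Tick Time: (\d+(?:\.\d+)?)")


def find_tick_storm(lines, max_tick=0.002, min_run=100):
    """Return the timestamp where the first run of `min_run` consecutive
    ticks of at most `max_tick` seconds starts, or None if no such run exists.

    Both thresholds are arbitrary illustrative values, not tsproxy settings.
    """
    run_start = None
    run_len = 0
    for line in lines:
        # finditer copes with several entries ending up on one line
        for match in TICK_RE.finditer(line):
            timestamp, tick = match.group(1), float(match.group(2))
            if tick <= max_tick:
                if run_len == 0:
                    run_start = timestamp
                run_len += 1
                if run_len >= min_run:
                    return run_start
            else:
                # a long tick breaks the run
                run_len = 0
    return None
```

Passing an open log file (for example the attached tsproxy log) as `lines` should work, since the function only iterates over it line by line.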
No promises on a fix but I'm looking at it now. Managed to reproduce it with a simple puppeteer script:
const puppeteer = require('puppeteer');
(async () => {
  while (true) {
    const browser = await puppeteer.launch({
      headless: false,
      args: ['--proxy-server=socks://127.0.0.1:1080'],
      timeout: 10000
    });
    const page = await browser.newPage();
    await page.goto('https://www.bbc.com/news/business-47245877');
    await browser.close();
  }
})();
I'm hoping it's a clock problem with a packet getting stuck in one of the queues with an invalid time and maybe it will be a simple check to detect when that happens.
Fix is inbound (just letting it test for a while to make sure it is stable first). Looks like the task_done in the queue was somehow getting called more often than there were queue items, and it would start throwing an exception. I don't see how that's possible since it's paired 1:1 with pulling items from the queue, but since we don't join() the packet queue we don't actually need to know the task status, so I just removed it.
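For anyone unfamiliar with that failure mode, here is a minimal standalone illustration using Python's standard library queue. It is not tsproxy's code, and it assumes the packet queue behaves like a standard `Queue`; it only shows that an unbalanced `task_done()` raises `ValueError`.

```python
import queue  # the module is named "Queue" on Python 2

q = queue.Queue()
q.put("packet")
q.get()
q.task_done()  # balanced: one task_done() for the one get()

try:
    q.task_done()  # one call more than there were items
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
```

`task_done()` only matters to callers that block on `join()`, so a queue that is never joined can skip it entirely, which matches the reasoning behind the fix described above.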
So far it has been running for ~30 minutes with a test every 10 seconds with no issues.
Thank you @pmeenan
With the latest commit, tsproxy is passing both my shell script test as well as long-running browsertime runs.
Nicely done!
When using tsproxy for repeated browsertime page loads we intermittently see a huge drop off in performance after a few minutes.
We are using browsertime to drive Chrome for Android, with tsproxy as the network throttler. Generally WebPageReplay is used as the http archive.
Once the performance has dropped, it will persist in that state until the tsproxy process is restarted.
I believe the problem is coming from tsproxy because when I remove WebPageReplay from the equation we can still reproduce the issue (i.e. when using tsproxy to throttle access to live sites we still see the problem).
Similarly, we've not seen this issue when using browsertime and Chrome directly against live sites.
I'm seeing this behaviour on macOS (High Sierra) and my colleague is seeing it on Ubuntu Linux.
I've attached both the tsproxy logs (verbose) as well as browsertime's.
In this case the problem surfaced in slightly under 5 minutes, at around 10:57:52 in the logs. You can see the browsertime backendTime metric jump from ~2100ms to ~18000ms from iteration 4 onwards. I noticed that from that point tsproxy's logs contain no more long tick times (e.g. 1 second), and you can see a continuous spew of 0.001 Tick Times.
tsproxy is started as follows (verbose logging added here), in this case bound and directed to the host machine's local ip where WebPageReplay is running:
python tsproxy.py -b 192.168.86.244 -d 192.168.86.244 -m '80:8080,443:8081' --rtt=200 -vvvvv
For reasons unknown, we see this problem more frequently with the following urls,
We're on the latest commit of tsproxy: 6ccc0eb5a8d308608d155545d265bd425765d724
Let me know if there is additional information I can provide.
tsproxy_logs.txt webpagereplay logs.txt