Closed gunesacar closed 2 years ago
Another data point: the problem goes away if I comment out isMobile and hasTouch from MOBILE_VIEWPORT, while keeping the page.setViewport call.
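For anyone who wants to repeat that experiment without editing the constant by hand, here is a minimal sketch. The viewport values and the helper name withoutMobileFlags are assumptions made for illustration; they are not taken from the crawler's source.

```javascript
// Hypothetical values; the crawler's actual MOBILE_VIEWPORT may differ.
const SAMPLE_MOBILE_VIEWPORT = {
    width: 412,
    height: 691,
    deviceScaleFactor: 2,
    isMobile: true,
    hasTouch: true
};

/**
 * Returns a copy of the viewport with isMobile and hasTouch dropped,
 * leaving the size-related fields untouched. The input is not mutated.
 * (Illustrative helper, not part of tracker-radar-collector.)
 */
function withoutMobileFlags(viewport) {
    // Pull the two flags out and keep everything else.
    const {isMobile, hasTouch, ...rest} = viewport;
    return rest;
}

// Intended use inside the crawler, where `page` is a puppeteer Page:
// await page.setViewport(withoutMobileFlags(SAMPLE_MOBILE_VIEWPORT));
```

If the crash disappears with the stripped viewport but returns with the full one, that points at the two emulation flags rather than at the viewport size.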
Hey Gunes, thanks for the report! I gave it a quick look and it seems like an upstream issue (chromium/puppeteer) to me. The workaround is to set all mobile options on browser launch:
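function openBrowser(log, proxyHost) {
    const args = {
        // Apply the full mobile viewport (including isMobile and hasTouch) at launch
        defaultViewport: MOBILE_VIEWPORT
    };
    // ... rest of openBrowser unchanged
}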
and comment out
// page.setViewport(emulateMobile ? MOBILE_VIEWPORT : DEFAULT_VIEWPORT);
I'll land a proper fix at some point, but please use the workaround for now.
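A rough sketch of how the workaround could keep desktop crawls working too, by choosing the viewport at launch time instead of hardcoding the mobile one. The viewport values, the buildLaunchOptions helper, and its emulateMobile parameter are assumptions for illustration only; defaultViewport itself is a standard puppeteer.launch() option.

```javascript
// Hypothetical viewport values; the real MOBILE_VIEWPORT and DEFAULT_VIEWPORT
// constants in the crawler may differ.
const MOBILE_VIEWPORT = {
    width: 412,
    height: 691,
    deviceScaleFactor: 2,
    isMobile: true,
    hasTouch: true
};
const DEFAULT_VIEWPORT = {
    width: 1440,
    height: 812
};

/**
 * Builds the options object handed to puppeteer.launch().
 * `emulateMobile` is an assumed parameter; the openBrowser function in the
 * crawler does not necessarily receive it.
 */
function buildLaunchOptions(emulateMobile) {
    return {
        // Setting the viewport at launch avoids the later page.setViewport call.
        defaultViewport: emulateMobile ? MOBILE_VIEWPORT : DEFAULT_VIEWPORT
    };
}

// Intended use (requires puppeteer to be installed):
// const puppeteer = require('puppeteer');
// const browser = await puppeteer.launch(buildLaunchOptions(true));
```

With this approach every page opened by the browser inherits the chosen viewport, so the per-page page.setViewport call can stay commented out.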
BTW Congrats on https://arxiv.org/pdf/2102.09301.pdf , well done 👏 Please feel free to reach out to me directly (konrad at duckduckgo.com) if you have any thoughts about the crawler or would like to use Tracker Radar data in your research (we are crawling over 150k pages on a regular basis and can adjust the crawler to collect more data if needed).
@kdzwinel Thanks so much for promptly addressing this. It makes sense that this is an upstream issue.
> BTW Congrats on https://arxiv.org/pdf/2102.09301.pdf , well done 👏
Thank you! Much of the credit goes to @ydimova. For the record, our experience using tracker-radar-collector for the study was just great. I especially appreciated how easy it is to add new instrumentation, since your method based on Runtime.evaluate is so generic (and novel). Also, the tool is super easy to start with, and was quite stable handling tens of thousands of sites without any hiccups. I am certain that tracker-radar-collector will be a popular tool (along with OpenWPM) within the research community not long from now.
> Please feel free to reach out to me directly (konrad at duckduckgo.com) if you have any thoughts about the crawler or would like to use Tracker Radar data in your research (we are crawling over 150k pages on a regular basis and can adjust the crawler to collect more data if needed).
I'll be more than happy to reach out. We have other projects that are based on tracker-radar-collector and I think it'd be useful to keep a channel open.
The -m, --mobile option seems to be causing tracker-radar-collector to fail during page load. This crawl gives me an error:
$ npm run crawl -- -u "https://duck.com" -o /tmp/ -v -f -d "requests" --mobile
The same crawl without the --mobile option runs just fine:
$ npm run crawl -- -u "https://duck.com" -o /tmp/ -v -f -d "requests"
@asumansenol and I could reliably reproduce this error on a few different machines using the latest from the main branch.
The same error in a parallel crawl (e.g. with c=4) includes some error log about the browser being disconnected.
Since the page.setViewport call is one of the main differences between the desktop and the mobile crawl, I commented that line out and reran a mobile crawl. I didn't get any errors!
Let me know if you need any other information from me to help you solve the issue.