mozilla / overscripted

Repository for the Mozilla Overscripted Data Mining Challenge
Mozilla Public License 2.0

Geolocation and user language extraction analysis: issue #37 #100

Open sanchittechnogeek opened 5 years ago

sanchittechnogeek commented 5 years ago

Analysis:

My analysis focuses on finding which websites in this dataset, and what percentage of them, track users' location and language preferences in order to serve customized content based on those preferences (e.g. location, language).

Dataset used: the 10 percent sample
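A minimal sketch of how such a percentage could be computed, using only the standard library. It assumes each record carries a `location` field (the page URL) and a `symbol` field (the JavaScript property or function accessed). The symbol names, the helper names `classify` and `share_of_sites`, and the sample rows are all illustrative assumptions, not values taken from the dataset.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Assumed symbol names for location and language access (not verified
# against the dataset; adjust to whatever the crawl actually recorded).
GEO_PREFIX = "window.navigator.geolocation"
LANG_SYMBOLS = {"window.navigator.language", "window.navigator.languages"}


def classify(symbol):
    """Return 'geolocation', 'language', or None for a recorded symbol."""
    if symbol.startswith(GEO_PREFIX):
        return "geolocation"
    if symbol in LANG_SYMBOLS:
        return "language"
    return None


def share_of_sites(rows):
    """Percentage of distinct page hosts with at least one access of each kind."""
    all_sites = set()
    sites_by_kind = defaultdict(set)
    for row in rows:
        host = urlparse(row["location"]).netloc
        all_sites.add(host)
        kind = classify(row["symbol"])
        if kind:
            sites_by_kind[kind].add(host)
    if not all_sites:
        return {}
    return {
        kind: 100.0 * len(hosts) / len(all_sites)
        for kind, hosts in sites_by_kind.items()
    }


# Hypothetical rows, for illustration only.
rows = [
    {"location": "https://a.example/", "symbol": "window.navigator.language"},
    {"location": "https://a.example/page", "symbol": "window.document.cookie"},
    {"location": "https://b.example/",
     "symbol": "window.navigator.geolocation.getCurrentPosition"},
    {"location": "https://c.example/", "symbol": "window.navigator.userAgent"},
    {"location": "https://d.example/", "symbol": "window.navigator.languages"},
]

shares = share_of_sites(rows)
print(shares)
```

Sites are grouped by hostname here, so several pages on the same host count once. Grouping by registered domain instead would be a reasonable alternative, depending on how "website" is defined for the analysis.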

sanchittechnogeek commented 5 years ago
  • One final thought - if you're trying to look at differences in content delivery is this the dataset to do it? If not, why not? What would you change about the data collection? What other data would you like?

I was looking for getCurrentPosition() function calls, but the crawler did not detect them properly, apart from one location where it did. To change the data collection, I would run a crawl dedicated to detecting these function calls. I would also run crawlers from different locations simultaneously, to find which scripts are run in, or served from, a particular region only.
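The multi-region comparison could look roughly like the sketch below. It assumes one set of observed script URLs per crawl region. The region labels, the URLs, and the helper name `region_exclusive_scripts` are hypothetical, since no such multi-region crawl exists in this dataset.

```python
def region_exclusive_scripts(scripts_by_region):
    """Map each region to the script URLs observed in that region only."""
    exclusive = {}
    for region, scripts in scripts_by_region.items():
        # Union of the scripts seen in every other region.
        others = set().union(
            *(s for r, s in scripts_by_region.items() if r != region)
        )
        exclusive[region] = set(scripts) - others
    return exclusive


# Hypothetical crawl results, for illustration only.
crawls = {
    "eu": {"https://cdn.example/a.js", "https://cdn.example/consent.js"},
    "us": {"https://cdn.example/a.js", "https://ads.example/us.js"},
    "in": {"https://cdn.example/a.js"},
}

exclusive = region_exclusive_scripts(crawls)
for region in sorted(exclusive):
    print(region, sorted(exclusive[region]))
```

Running the crawls at the same time matters for this comparison: otherwise a script that appears in only one region might just reflect a site change between crawls rather than region-specific delivery.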