aeametal closed 7 years ago
It appears that you haven't installed phantomjs, or that it's not in your PATH.
Here's the SSCCE for phantomjs. If this doesn't run, the script won't run.
which phantomjs
python3 -c 'from selenium import webdriver; driver = webdriver.PhantomJS(); driver.get("https://github.com"); print(driver.title); driver.quit()'
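The same PATH check can be done from inside Python before the driver is started. This is only a sketch using the standard library's `shutil.which`; the helper name `executable_on_path` is made up for illustration and is not part of the script.

```python
import shutil


def executable_on_path(name):
    """Return the full path of `name` if it is found on PATH, else None."""
    return shutil.which(name)


# Rough equivalent of `which phantomjs` in the shell.
path = executable_on_path('phantomjs')
print(path if path else 'phantomjs not found in PATH')
```

If this prints the "not found" message, the selenium one-liner above is expected to fail as well.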
Thanks for your quick response.
Hmmm. Looks like the current_url method is puking on your box. What happens when you do this?
python3 -c 'from selenium import webdriver; driver = webdriver.PhantomJS(); driver.get("https://github.com"); print(driver.title); print(driver.current_url); driver.quit()'
This throws an error for me. I'll wrap this part in a try … except.
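A minimal sketch of what such a wrapper might look like, not the actual change made to the script. The helper `safe_current_url`, the fallback string, and the two stand-in session classes are all hypothetical; the stand-ins only exist so the example runs without selenium or phantomjs installed.

```python
def safe_current_url(session, fallback='<unknown url>'):
    """Return session.current_url, or `fallback` if reading it raises."""
    try:
        return session.current_url
    except Exception as e:
        # The driver may raise (e.g. a JSON decoding error); report and move on.
        print('current_url failed: {}'.format(e))
        return fallback


class BrokenSession:
    """Stand-in for a driver whose current_url lookup raises."""
    @property
    def current_url(self):
        raise ValueError('Expecting value: line 1 column 1 (char 0)')


class WorkingSession:
    """Stand-in for a driver that behaves normally."""
    current_url = 'https://github.com/'


print(safe_current_url(WorkingSession()))
print(safe_current_url(BrokenSession()))
```

Catching a broad `Exception` is a deliberate choice here, since the value is only used for a debug message and any failure should leave the main loop running.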
A suggestion: Let users add a list.csv containing URLs of their choosing to pollute the history with non-random searches. Clustering techniques will have trouble isolating clean data if part of the non-random list is different among users.
There is a list of non-random searches at the start of the script. I'll think about looking for a specific file to add. In the meantime, it would just take a moment to fork the code and add these URLs to the non-random list.
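For anyone forking the code, reading an optional list.csv could look roughly like this. It is a sketch under assumptions: one URL in the first column of each row, and the names `load_user_urls` and `builtin_urls` are invented for the example rather than taken from the script.

```python
import csv
import os


def load_user_urls(path='list.csv'):
    """Return the URLs in the first column of `path`, or [] if the file is absent."""
    if not os.path.isfile(path):
        return []
    with open(path, newline='') as f:
        # Skip blank rows and strip stray whitespace around each URL.
        return [row[0].strip() for row in csv.reader(f) if row and row[0].strip()]


# Hypothetical built-in list, extended with whatever the user supplied.
builtin_urls = ['https://github.com']
urls = builtin_urls + load_user_urls()
print('{:d} non-random URLs loaded'.format(len(urls)))
```

A missing file is treated as an empty list, so the script would behave as before for users who don't supply one.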
I get the following output after installing and running the script:

Seeding with search for 'catfish'...
Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "isp_data_pollution.py", line 483, in <module>
    ISPDataPollution(debug=True)
  File "isp_data_pollution.py", line 126, in __init__
    self.pollute_forever()
  File "isp_data_pollution.py", line 211, in pollute_forever
    self.seed_links()
  File "isp_data_pollution.py", line 246, in seed_links
    self.get_websearch(word)
  File "isp_data_pollution.py", line 367, in get_websearch
    if len(self.links) < self.max_links_cached: self.add_url_links(new_links)
  File "isp_data_pollution.py", line 423, in add_url_links
    if self.debug: print('Added {:d} links, {:d} total at url \'{}\'.'.format(k,len(self.links),self.session.current_url))
  File "/usr/lib/python3/dist-packages/selenium/webdriver/remote/webdriver.py", line 454, in current_url
    return self.execute(Command.GET_CURRENT_URL)['value']
  File "/usr/lib/python3/dist-packages/selenium/webdriver/remote/webdriver.py", line 201, in execute
    self.error_handler.check_response(response)
  File "/usr/lib/python3/dist-packages/selenium/webdriver/remote/errorhandler.py", line 102, in check_response
    value = json.loads(value_json)
  File "/usr/lib/python3.5/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.5/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
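The final error comes from `json.loads` being handed text that does not start with a JSON value. The snippet below reproduces the same message with an empty string; that input is an assumption, since the traceback does not show what the driver actually returned.

```python
import json

try:
    # An empty body is one input that is not valid JSON.
    json.loads('')
except json.JSONDecodeError as err:
    message = str(err)
    print(message)  # Expecting value: line 1 column 1 (char 0)
```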