kevinsartiano / airline-price-discrimination-project

Collection of scripts to gather data for a project in price discrimination in the airline market.

nordvpn #1

Open ghost opened 3 years ago

ghost commented 3 years ago

Hi! I am trying to run your code, but I keep getting the errors in the log below. Any hint on how to fix this?

Many thanks!

2021-09-21 14:39:06,081 INFO: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2021-09-21 14:39:06,081 INFO: ALITALIA SCRAPING SESSION STARTED
2021-09-21 14:39:06,084 ERROR: Error: [Errno 2] No such file or directory: 'nordvpn'
2021-09-21 14:39:06,085 ERROR: Error: [Errno 2] No such file or directory: 'nordvpn'
2021-09-21 14:39:06,087 ERROR: Error: [Errno 2] No such file or directory: 'nordvpn'
2021-09-21 14:39:06,088 ERROR: Error: [Errno 2] No such file or directory: 'nordvpn'
2021-09-21 14:39:06,088 INFO: Alitalia scraping session completed in 0:00:00
2021-09-21 14:39:06,088 INFO: - - - - - - - - - - - - - - - - - - - - - -
2021-09-21 14:39:11,093 INFO: RYANAIR SCRAPING SESSION STARTED
2021-09-21 14:39:11,102 ERROR: Error: [Errno 2] No such file or directory: 'nordvpn'
2021-09-21 14:39:11,110 ERROR: Error: [Errno 2] No such file or directory: 'nordvpn'
2021-09-21 14:39:11,112 ERROR: Error: [Errno 2] No such file or directory: 'nordvpn'
2021-09-21 14:39:11,113 ERROR: Error: [Errno 2] No such file or directory: 'nordvpn'
2021-09-21 14:39:11,114 INFO: Ryanair scraping session completed in 0:00:00
2021-09-21 14:39:11,114 INFO: - - - - - - - - - - - - - - - - - - - - - -
2021-09-21 14:39:16,119 INFO: LUFTHANSA SCRAPING SESSION STARTED
2021-09-21 14:39:16,128 ERROR: Error: [Errno 2] No such file or directory: 'nordvpn'
2021-09-21 14:39:16,136 ERROR: Error: [Errno 2] No such file or directory: 'nordvpn'
2021-09-21 14:39:16,144 ERROR: Error: [Errno 2] No such file or directory: 'nordvpn'
2021-09-21 14:39:16,152 ERROR: Error: [Errno 2] No such file or directory: 'nordvpn'
2021-09-21 14:39:16,153 INFO: Lufthansa scraping session completed in 0:00:00
2021-09-21 14:39:16,153 INFO: - - - - - - - - - - - - - - - - - - - - - -
2021-09-21 14:39:21,159 INFO: Total scraping completed in 0:00:15
2021-09-21 14:39:21,159 INFO: <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

kevinsartiano commented 3 years ago

Hi @micaelamaria,

TL;DR: You need a NordVPN subscription and the NordVPN terminal application on a Linux distribution.

FULL ANSWER: Before each scraping session, the scraper tries to establish a VPN connection.

This happens in run_scrapers.py:

subprocess.run(['nordvpn', 'connect', f'{user["vpn_server"]}'])

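In this case the provider is NordVPN, but you could in principle change this line to connect to a VPN provider of your choice (as long as you can connect to it from the terminal). Because the `nordvpn` executable is not installed on your machine, this call fails with the `[Errno 2] No such file or directory: 'nordvpn'` error shown in your log.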
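To make the failure mode easier to spot, here is a minimal sketch (not the code in the repository) that checks whether the VPN client is on the PATH before trying to connect. The function name `connect_vpn` and the `cli` parameter are hypothetical; only the `nordvpn connect <server>` call itself comes from run_scrapers.py.

```python
import shutil
import subprocess


def connect_vpn(server: str, cli: str = "nordvpn") -> bool:
    """Try to connect to `server` with the VPN command-line client `cli`.

    Hypothetical helper: returns False instead of raising
    FileNotFoundError ([Errno 2]) when the client is not installed.
    """
    if shutil.which(cli) is None:
        print(f"'{cli}' was not found on PATH; install it or change the command.")
        return False
    # Same call as in run_scrapers.py, with the executable name configurable.
    result = subprocess.run([cli, "connect", server])
    return result.returncode == 0
```

With a check like this, a missing client produces one clear message instead of a repeated `[Errno 2]` error for every user profile.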

Once the connection is established, the scraper checks that the VPN IP address corresponds to the one set in the user profile config file (so if you end up changing the VPN provider, you should adapt this config file).

This check also happens in run_scrapers.py:

if ip_address != user["ip_address"]:
    logger.error(f'{user["user"]}: IP address should be {user["ip_address"]} '
                 f'instead of {ip_address}!')

This is done to ensure that all the data scraped under a specific user is always scraped from that user's IP address.
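If you switch to another VPN provider, a self-contained version of this check could look roughly like the sketch below. It is an illustration under assumptions, not the repository's code: the helper names `fetch_public_ip` and `ip_matches_profile` are hypothetical, and `https://api.ipify.org` is just one public service that returns your IP address as plain text. The profile keys `user` and `ip_address` are the ones used in the snippet above.

```python
import logging
import urllib.request

logger = logging.getLogger(__name__)


def fetch_public_ip(url: str = "https://api.ipify.org") -> str:
    """Return the current public IP address as text (assumed lookup service)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode().strip()


def ip_matches_profile(user: dict, ip_address: str) -> bool:
    """Compare the observed IP address with the one stored in the user profile."""
    if ip_address != user["ip_address"]:
        logger.error(f'{user["user"]}: IP address should be {user["ip_address"]} '
                     f'instead of {ip_address}!')
        return False
    return True
```

For example, `ip_matches_profile(profile, fetch_public_ip())` could be called right after the VPN connects, skipping the scraping session for that user when it returns `False`.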

I hope that helps!

ghost commented 3 years ago

Thank you! So, basically, if I install nordvpn it should all work fine.

kevinsartiano commented 3 years ago

Yes, if you run the script on a Linux distribution with the NordVPN terminal app, it should work. Bear in mind, though, that NordVPN is not a free service; you would need a subscription.