Open mattyjacks opened 1 month ago
@mattyjacks it is nice that it works for you but I have a few points here:
(1) For security reasons, the webapp is NOT DESIGNED (at the moment) to be publicly available. I HIGHLY recommend you IMMEDIATELY allow ONLY your IP to access the tool until an authentication system is in place.
(2) I think it's easier to run it via a docker container.
Thank you very much for trying this on AWS.
Thank you for the quick response.
1: I'll be shutting down the tool as soon as this scrape-job is finished (to save money), and when I revive it a new IP address will be assigned from Amazon anyways. I wasn't planning on sharing the IP address that would let others access it.
In response to 2: Yeah, probably. I've never used docker before, tho.
I'm overall very satisfied with the result. One huge advantage of the AWS EC2 approach is it's not tying my IP address to the scraping activity in Google's eyes. Pretty paranoid about getting banned from Google.
Even if you do not share the IP, this is still not safe. People might break into your server.
I recommend configuring the firewall to allow connections only from your IP address.
Additionally, you might consider using proxies if you want to mask your IP address.
In any case the tool is for educational purposes only.
I managed to get this thing working via AWS EC2! YAY! I decided to write a little guide on it.
First, launch an Ubuntu Server 24.04 LTS instance. I use a t2.xlarge ($0.18 per hour; you can stop it when you're not using it to save money) with 25 GB of storage.
Then connect to the instance. Using EC2 Instance Connect with the default username is fine.
Here are the commands you have to run:
git clone https://github.com/gosom/google-maps-scraper.git
sudo apt-get update
sudo apt install golang-go
sudo apt-get install libatk1.0-0 libatk-bridge2.0-0 libcups2 libatspi2.0-0 libxcomposite1 libxdamage1 libxfixes3 libxrandr2 libgbm1 libpango-1.0-0 libcairo2 liboss4-salsa-asound2
sudo apt-get update
sudo apt-get upgrade
sudo apt install nodejs npm
sudo npm install -g playwright
sudo apt-get install libasound2 libasound2-plugins
rm -rf ~/.cache/ms-playwright
playwright install
sudo npx playwright install-deps
uname -m
npx playwright install firefox
npx playwright install webkit
cd google-maps-scraper
go mod download
go build
(Set the number after -c to one less than the number of cores on your EC2 instance. The instance I chose has 4 cores, so I use 3.)
./google-maps-scraper -web -c 3
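If you would rather not work out the core count by hand, the "one less than the number of cores" rule can be sketched in shell. This is only an illustration: the helper name concurrency_for_cores is made up here, and it assumes nproc (GNU coreutils, present on Ubuntu) reports the cores you want to count. It prints the command instead of running it so you can check the value first.

```shell
# Hypothetical helper: one less than the core count, but never below 1.
concurrency_for_cores() {
  if [ "$1" -gt 1 ]; then
    echo $(( $1 - 1 ))
  else
    echo 1
  fi
}

# nproc prints the number of processing units available on this machine.
C="$(concurrency_for_cores "$(nproc)")"

# Print the resulting command for review rather than launching the scraper.
echo "./google-maps-scraper -web -c $C"
```

On the 4-core instance from this guide, the helper yields 3, which matches the command above.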
Edit the inbound security group rules of the EC2 instance to allow TCP port 8080 from anywhere.
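The maintainer's advice in this thread is to let only your own IP reach the tool. A rough sketch of what that rule could look like with the AWS CLI follows. The security group ID and the IP address are placeholders, and cidr_for_ip is a made-up helper, not part of any tool. The script only prints the aws command so you can review it before running it yourself.

```shell
# Hypothetical helper: turn one IPv4 address into a /32 CIDR, reject anything else.
# The check is a simple shape check (four dot-separated groups of 1-3 digits).
cidr_for_ip() {
  if printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
    echo "$1/32"
  else
    echo "not an IPv4 address: $1" >&2
    return 1
  fi
}

# Placeholders (assumptions): replace with your own security group and IP.
SG_ID="sg-0123456789abcdef0"
MY_IP="203.0.113.7"   # you can look yours up with: curl -s https://checkip.amazonaws.com

# Print the command for review instead of changing the security group directly.
echo aws ec2 authorize-security-group-ingress \
  --group-id "$SG_ID" --protocol tcp --port 8080 \
  --cidr "$(cidr_for_ip "$MY_IP")"
```

A /32 CIDR matches exactly one address, so only that machine can open port 8080. If your home IP changes, the rule has to be updated.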
Visit port 8080 of the EC2 instance's public IP address, like 54.147.206.100:8080. Be sure to use HTTP instead of HTTPS or it won't connect.
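Before opening a browser, a quick command-line check can confirm the port is reachable over plain HTTP. This is a sketch only: web_ui_url and check_web_ui are made-up helper names, and the address is the example one from this guide, not a live server.

```shell
# Hypothetical helper: build the plain-HTTP URL for the web UI on port 8080.
web_ui_url() {
  echo "http://$1:8080/"
}

# Hypothetical helper: print the HTTP status code, giving up after 5 seconds.
# curl prints 000 when it cannot connect (for example, if the port is blocked).
check_web_ui() {
  curl -s -o /dev/null -m 5 -w '%{http_code}\n' "$(web_ui_url "$1")"
}

# Example address from the guide; use your instance's current public IP.
HOST="54.147.206.100"
echo "Web UI URL: $(web_ui_url "$HOST")"
```

Running check_web_ui "$HOST" afterwards should print a status code such as 200 if the scraper is up and the security group allows your connection.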
Above is what the scraper looks like in action.
THANK YOU @gosom FOR YOUR WONDERFUL TOOL!