claffin / cloudproxy

Hide your scraper's IP behind the cloud. Provision proxy servers across different cloud providers to improve your scraping success.
https://cloudproxy.io/
MIT License
1.4k stars, 79 forks

Add Google Cloud provider #38

Closed. dusancz closed this 3 years ago

dusancz commented 3 years ago

This Pull Request adds Google Cloud provider support for cloudproxy and solves #35.

Known issues:

Future work:

claffin commented 3 years ago

@dusancz how are you loading in the GCP service account key? I find the JSON content doesn't load nicely as an environment variable.

I've looked at making a change where the key is base64-encoded and then decoded just before it is used, but I'm curious what your approach is.

Otherwise, I've just tested it and it works well. Many thanks for your contribution!
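
For reference, the base64 idea mentioned above could look roughly like the sketch below. It is only an illustration, not what cloudproxy does: the helper names `encode_key` and `decode_key` are hypothetical, the key file is a throwaway stand-in, and the application would need its own matching decode step before using the value.

```shell
#!/bin/bash
# Hypothetical helpers; the names are illustrative and not part of cloudproxy.

# Encode a key file as a single line of base64 (line wraps are stripped
# so the value contains no newlines).
encode_key() {
    base64 < "$1" | tr -d '\n'
}

# Decode a base64 string back to the original JSON text.
decode_key() {
    printf '%s' "$1" | base64 --decode
}

# Demo with a throwaway file standing in for a real service account key.
tmp_key=$(mktemp)
printf '{\n  "type": "service_account",\n  "project_id": "demo"\n}\n' > "$tmp_key"

expected=$(cat "$tmp_key")
encoded=$(encode_key "$tmp_key")
decoded=$(decode_key "$encoded")
rm -f "$tmp_key"
```

The encoded value is a single line with no quotes or newlines, so it can be passed with `-e` or stored in an env file without special handling. The trade-off is that the receiving side has to decode it before use.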

dusancz commented 3 years ago

Hi @claffin! I have a simple bash script for loading it. Feel free to add it to the PR as a separate .sh file if you wish:

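 #!/bin/bash
 # Read the service account JSON into a variable.
 sa=$(cat /path/to/service_account.json)

 # Pass the key and other settings as environment variables.
 # Quoting "$sa" keeps the multi-line JSON as a single value.
 docker run \
     -e USERNAME='username' \
     -e PASSWORD='password' \
     -e AGE_LIMIT='3000' \
     -e GCP_ENABLED=True \
     -e GCP_PROJECT='gcp-project-name' \
     -e GCP_SERVICE_ACCOUNT_KEY="$sa" \
     -it -p 8000:8000 cloudproxy:latest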
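
If the key path is wrong, `cat` fails but the script still starts the container with an empty key. A small guard along these lines would stop early instead. This is a sketch only, and `read_key` is a hypothetical helper name, not something in the PR:

```shell
#!/bin/bash
# Hypothetical helper: print the key file's contents, or fail with a
# message on stderr if the file is missing or empty.
read_key() {
    if [ ! -s "$1" ]; then
        echo "service account key not found or empty: $1" >&2
        return 1
    fi
    cat "$1"
}
```

It would be used as `sa=$(read_key /path/to/service_account.json) || exit 1` ahead of the `docker run` call.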

claffin commented 3 years ago

Thanks for the clarification, @dusancz. I've added additional guidance to cover this instead.

I will merge this now and release it. Many thanks again for your contribution!

One thing that I may look at doing in the future is the automatic creation of the firewall rules, as that would make it even easier to set up. I will raise an issue.

On the topic of the remove button, it does work for me, but I notice it doesn't work on my mobile, so I'm not sure how big an issue it is. If you're able to investigate further and see where the problem may be, feel free to raise an issue and I will take a closer look.

In terms of refactoring, I agree: the code has a lot of duplication. In particular, the main.py files for each provider are near duplicates with only minor differences. There's no reason these couldn't all live in one file and be handled in a for loop. I will raise an issue.