DedSecInside / gotor

This program provides efficient web scraping services for Tor and non-Tor sites. The program has both a CLI and REST API.
GNU General Public License v3.0

GoTor - HTTP REST API and Web Crawling Tool with TOR Integration

This repository contains an HTTP REST API and a command-line program designed for efficient data gathering and analysis through web crawling using the TOR network. While the program is primarily designed to work seamlessly with TorBot, the API and CLI can also operate independently.

Features and Options

Main Arguments

TOR Integration

The program routes its crawling traffic through the TOR network for enhanced privacy and security. TOR settings can be configured using environment variables; CLI flags, when provided, override those values.

REST API

Other options

Available Crawling Mechanisms

  1. Building Relationship Tree of Links: Generates a hierarchical tree of links, with child nodes representing links found on a website.
  2. Getting Tor Client IP: Retrieves the IP address of the current TOR client.
  3. Retrieving Phone Numbers: Collects phone numbers found on websites.
  4. Retrieving Emails: Gathers email addresses found on websites.
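
The first mechanism above, the relationship tree of links, can be sketched as a depth-limited recursive build. This is an assumption-laden illustration rather than GoTor's implementation: `LinkNode`, `LinkFetcher`, `BuildTree`, and `Render` are hypothetical names, and the fetcher is an in-memory stand-in for a real HTTP request plus HTML parsing.

```go
package main

import (
	"fmt"
	"strings"
)

// LinkNode is one page in the tree; Children are links found on that page.
type LinkNode struct {
	URL      string
	Children []*LinkNode
}

// LinkFetcher returns the links found on a page. A real crawler would fetch
// and parse the page; here it is a hypothetical hook so the sketch runs offline.
type LinkFetcher func(url string) []string

// BuildTree builds a tree rooted at url, following links at most depth levels
// down. visited ensures each URL appears only once in the tree.
func BuildTree(url string, depth int, fetch LinkFetcher, visited map[string]bool) *LinkNode {
	node := &LinkNode{URL: url}
	visited[url] = true
	if depth <= 0 {
		return node
	}
	for _, link := range fetch(url) {
		if visited[link] {
			continue
		}
		node.Children = append(node.Children, BuildTree(link, depth-1, fetch, visited))
	}
	return node
}

// Render writes the tree as an indented outline, two spaces per level.
func Render(n *LinkNode, indent int, sb *strings.Builder) {
	sb.WriteString(strings.Repeat("  ", indent) + n.URL + "\n")
	for _, c := range n.Children {
		Render(c, indent+1, sb)
	}
}

func main() {
	// In-memory stand-in for real pages and the links they contain.
	pages := map[string][]string{
		"https://example.com":   {"https://example.com/a", "https://example.com/b"},
		"https://example.com/a": {"https://example.com/b", "https://example.com"},
	}
	fetch := func(u string) []string { return pages[u] }

	root := BuildTree("https://example.com", 2, fetch, map[string]bool{})
	var sb strings.Builder
	Render(root, 0, &sb)
	fmt.Print(sb.String())
}
```

Because the build is depth-first and each URL is recorded only once, a link shared by several pages is attached to the first page on which it is reached. A breadth-first build would attach it to the shallowest page instead.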

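Email retrieval can be approximated with a regular expression over the page text. The sketch below is only an illustration under stated assumptions: `ExtractEmails` and `emailPattern` are hypothetical names, and the pattern is deliberately simple rather than a complete address grammar, so it may miss unusual addresses or match strings that are not real mailboxes.

```go
package main

import (
	"fmt"
	"regexp"
)

// emailPattern is a simple approximation of an email address,
// not a full RFC 5322 matcher.
var emailPattern = regexp.MustCompile(`[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`)

// ExtractEmails returns the unique email-like strings found in text,
// in the order they first appear.
func ExtractEmails(text string) []string {
	seen := map[string]bool{}
	var out []string
	for _, m := range emailPattern.FindAllString(text, -1) {
		if !seen[m] {
			seen[m] = true
			out = append(out, m)
		}
	}
	return out
}

func main() {
	page := `<p>Contact <a href="mailto:admin@example.com">admin@example.com</a> or sales@example.org.</p>`
	fmt.Println(ExtractEmails(page))
}
```

Phone numbers could be handled the same way with a different pattern, although formats vary much more between countries.
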
Example Usage

To start the HTTP server and initiate crawling, use the following command:

go run cmd/main/gotor.go -s

To use an alternate host and port for the server and the SOCKS5 proxy:

go run cmd/main/gotor.go -s -server-host 192.6.8.124 -server-port 8088 -socks5-host 127.0.0.1 -socks5-port 9051

To crawl directly using the CLI and output the results to an Excel file, use the following command:

go run cmd/main/gotor.go -url https://example.com -depth 2 -d

Running with Docker

To run the server using Docker, use the provided convenience script build.sh. The script builds a Docker network service for Tor and connects it to the "gotor" Docker container. It uses the port given by SOCKS5_PORT, so make sure no other service is already using that port.

To build and start the Docker containers:

./scripts/build.sh

To stop and destroy the Docker containers:

./scripts/destroy.sh

Documentation

This project includes comprehensive code comments to facilitate documentation generation with godoc. To generate and access documentation, use the following command:

godoc -v -http=:6060

This will make the documentation available at http://127.0.0.1:6060.

License

This project is licensed under the GNU General Public License v3.0.

Feel free to contribute, report issues, or suggest improvements!