vfedotovs / sslv_web_scraper

An ss.lv web scraping app that automates scraping and filtering of classifieds, emails the results, and stores the scraped data in a database.
GNU General Public License v3.0

SS.LV Web Scraper

Test, Build and Deploy CI (https://github.com/vfedotovs/sslv_web_scraper/actions/workflows/CI.yml)
Explore historical data on apartments for sale in Ogre city at http://propertydata.lv/

About the application:

Purpose: This application scrapes the ss.lv website daily, collecting listings from the apartments-for-sale category for a city of your choice. It stores the scraped data in a PostgreSQL database and sends a daily email report.
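
The filter-and-report part of that flow can be sketched roughly as follows. This is only an illustration, not the project's actual code: the `Listing` record, its field names, and both helper functions are hypothetical, and the URLs in any sample data would be placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    """Hypothetical record for one scraped ad; the field names are assumptions."""
    city: str
    rooms: int
    price_eur: int
    url: str


def filter_by_city(listings, city):
    """Keep only the listings located in the requested city (case-insensitive)."""
    wanted = city.strip().lower()
    return [item for item in listings if item.city.strip().lower() == wanted]


def format_report(listings):
    """Render listings as a plain-text email body, cheapest first."""
    lines = [f"Apartments found: {len(listings)}"]
    for item in sorted(listings, key=lambda x: x.price_eur):
        lines.append(f"{item.rooms} rooms, {item.price_eur} EUR, {item.url}")
    return "\n".join(lines)
```

In a real run the list of `Listing` objects would come from the scraper, and the string returned by `format_report` would be handed to whatever component sends the email.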

Requirements

# docker -v                                                                 
Docker version 20.10.11, build dea9396

# docker-compose -v                                                                  
Docker Compose version v2.2.1
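
If you want to check your own installation against the versions shown above, a small helper along these lines may be enough. It is a sketch under one assumption the README does not state: that the versions listed here are treated as a minimum. The function names `parse_version` and `meets_minimum` are made up for this example.

```python
import re


def parse_version(text):
    """Extract the first dotted version number from CLI output as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(output, minimum):
    """Return True when the version found in `output` is at least `minimum`.

    Treating the README versions as a minimum is an assumption of this sketch.
    """
    return parse_version(output) >= minimum
```

The `output` string can be obtained with `subprocess.run(["docker", "-v"], capture_output=True, text=True).stdout` and compared with `(20, 10, 11)`; for Docker Compose the corresponding tuple would be `(2, 2, 1)`.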

How to use the application:

  1. Clone repository
  2. Create database.ini; here is an example:
    
    [postgresql]
    host=<your docker db hostname>
    database=<your db name>
    user=<your db username>
    password=<your db password>
  3. Create .env.prod file for docker compose

```bash
# ws_worker container envs
DEST_EMAIL=user@example.com
SENDGRID_API_KEY=<Your SENDGRID API Key>
SRC_EMAIL=user@example.com
POSTGRES_PASSWORD=<Your DB Password>
```

  4. Run docker-compose --env-file .env.prod up -d
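
Before starting the containers it can help to confirm that both configuration files are complete. The sketch below reads the `[postgresql]` section of database.ini with the standard-library `configparser` and reports which of the variables from the .env.prod example are unset. The function names and the `REQUIRED_ENV_VARS` tuple are assumptions made for this example; how the application itself loads these files is not shown in this README.

```python
import os
from configparser import ConfigParser

# Variable names taken from the .env.prod example; treating all of them
# as required is an assumption of this sketch.
REQUIRED_ENV_VARS = ("DEST_EMAIL", "SENDGRID_API_KEY", "SRC_EMAIL", "POSTGRES_PASSWORD")


def load_db_config(path="database.ini", section="postgresql"):
    """Return the key/value pairs of one section of an ini file as a dict."""
    # Interpolation is disabled so that a password containing '%' is read as-is.
    parser = ConfigParser(interpolation=None)
    if not parser.read(path):
        raise FileNotFoundError(f"could not read {path}")
    if not parser.has_section(section):
        raise KeyError(f"section [{section}] not found in {path}")
    return dict(parser.items(section))


def missing_env_vars(environ=None):
    """List the required variables that are absent or empty in `environ`."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]
```

`load_db_config()` returns a dict with the keys `host`, `database`, `user` and `password`, which could then be passed to a PostgreSQL driver as connection parameters. An empty list from `missing_env_vars()` means every variable from the example is set in the current environment.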

Use make

make                                                                          
help                 💬 This help message
all                  runs setup, build and up targets
setup                gets database.ini and .env.prod and downloads the latest DB backup file
build                builds all containers
up                   starts all containers
down                 stops all containers
clean                removes setup and DB files and folders
lt                   lists table sizes in the postgres container, to check that the DB dump was restored correctly

Currently available features

Work in progress: