Illustrated Technical Books PDF scraper and downloader
- Download the dependencies:
pip install beautifulsoup4 requests
- Run the pdf_books_scrapper.py file and the download will start:
python3 pdf_books_scrapper.py
- Output PDFs will be stored in
output/books_pdf/
folder.
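To confirm that the download step produced files, a small check like the one below can be run from the repository root. This is only a sketch: the helper name `list_outputs` is hypothetical and not part of the repository; the only detail taken from this README is the `output/books_pdf/` path.

```python
from pathlib import Path


def list_outputs(folder, suffix=".pdf"):
    """Return the sorted names of files in `folder` whose extension matches `suffix`.

    Hypothetical helper for illustration; it is not part of the scraper itself.
    """
    directory = Path(folder)
    if not directory.is_dir():
        # Nothing downloaded yet, or the check was run from a different directory.
        return []
    return sorted(
        p.name
        for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    )


if __name__ == "__main__":
    books = list_outputs("output/books_pdf")
    print(f"{len(books)} PDF file(s) found")
    for name in books:
        print(" -", name)
```

The same helper can be pointed at another output folder by changing the path and, for non-PDF output, the `suffix` argument.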
crops_webpages_pdf_scrapper
This is an under-development scraper repository where I am using Python to scrape information from this link.
Steps to run the scraper:
For PDF Output
- Open terminal and run this command to install all the required dependencies:
pip install -r requirements.txt
- Run main.py
python3 main.py
- Check the output PDF files for every crop in the
output/rabi_crops_pdf
folder
For JSON Output
- Put your OpenAI API key in a .env file
- Run rabi_crops_scrapper.py
python3 rabi_crops_scrapper.py
- Check the output JSON files for every crop in the
output/rabi_crops_json
folder
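This README does not say which variable name the script expects in the .env file, or how the file is loaded. The sketch below assumes a line of the form `OPENAI_API_KEY=...` (the name the official OpenAI Python client reads from the environment by default) and parses it with the standard library only. The helpers `load_env_file` and `get_api_key` are hypothetical, and the actual script may load the file differently, for example through a dotenv package.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict.

    Hypothetical helper: blank lines, comments, and lines without '=' are skipped.
    """
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(path=".env", name="OPENAI_API_KEY"):
    """Prefer a real environment variable, then fall back to the .env file.

    The variable name is an assumption; adjust it to whatever the script reads.
    """
    return os.environ.get(name) or load_env_file(path).get(name)


if __name__ == "__main__":
    if get_api_key() is None:
        print("No API key found; add it to .env before running the JSON scraper.")
    else:
        print("API key found.")
```

Running this before the JSON step is a quick way to catch a missing or misnamed key without making any API call. Keep the .env file out of version control so the key is not published with the repository.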