ZNClub-PA-ML-AI / Scrapy-Spiders

Web Crawling using Scrapy

  sSSs    sSSs   .S_sSSs     .S_SSSs     .S_sSSs     .S S.     sSSs   .S_sSSs     .S   .S_sSSs      sSSs   .S_sSSs      sSSs  
 d%%SP   d%%SP  .SS~YS%%b   .SS~SSSSS   .SS~YS%%b   .SS SS.   d%%SP  .SS~YS%%b   .SS  .SS~YS%%b    d%%SP  .SS~YS%%b    d%%SP  
d%S'    d%S'    S%S   `S%b  S%S   SSSS  S%S   `S%b  S%S S%S  d%S'    S%S   `S%b  S%S  S%S   `S%b  d%S'    S%S   `S%b  d%S'    
S%|     S%S     S%S    S%S  S%S    S%S  S%S    S%S  S%S S%S  S%|     S%S    S%S  S%S  S%S    S%S  S%S     S%S    S%S  S%|     
S&S     S&S     S%S    d*S  S%S SSSS%S  S%S    d*S  S%S S%S  S&S     S%S    d*S  S&S  S%S    S&S  S&S     S%S    d*S  S&S     
Y&Ss    S&S     S&S   .S*S  S&S  SSS%S  S&S   .S*S   SS SS   Y&Ss    S&S   .S*S  S&S  S&S    S&S  S&S_Ss  S&S   .S*S  Y&Ss    
`S&&S   S&S     S&S_sdSSS   S&S    S&S  S&S_sdSSS     S S    `S&&S   S&S_sdSSS   S&S  S&S    S&S  S&S~SP  S&S_sdSSS   `S&&S   
  `S*S  S&S     S&S~YSY%b   S&S    S&S  S&S~YSSY      SSS      `S*S  S&S~YSSY    S&S  S&S    S&S  S&S     S&S~YSY%b     `S*S  
   l*S  S*b     S*S   `S%b  S*S    S&S  S*S           S*S       l*S  S*S         S*S  S*S    d*S  S*b     S*S   `S%b     l*S  
  .S*P  S*S.    S*S    S%S  S*S    S*S  S*S           S*S      .S*P  S*S         S*S  S*S   .S*S  S*S.    S*S    S%S    .S*P  
sSS*S    SSSbs  S*S    S&S  S*S    S*S  S*S           S*S    sSS*S   S*S         S*S  S*S_sdSSS    SSSbs  S*S    S&S  sSS*S   
YSS'      YSSP  S*S    SSS  SSS    S*S  S*S           S*S    YSS'    S*S         S*S  SSS~YSSY      YSSP  S*S    SSS  YSS'    
                SP                 SP   SP            SP             SP          SP                       SP                  
                Y                  Y    Y             Y              Y           Y                        Y                   

Scrapy-Spiders

Scrapy module - Web Crawling

Introduction

Scrapy is an application framework for crawling websites and extracting structured data, which can be used for a wide range of applications such as data mining, information processing, or historical archival.

Although Scrapy was originally designed for web scraping, it can also be used to extract data through APIs (such as Amazon Associates Web Services) or as a general-purpose web crawler.

Architecture

Visit here for more

Project Structure

tutorial/
    scrapy.cfg            # deploy configuration file

    tutorial/             # project's Python module, you'll import your code from here
        __init__.py

        items.py          # project items file

        pipelines.py      # project pipelines file

        settings.py       # project settings file

        spiders/          # a directory where you'll later put your spiders
            __init__.py
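
To make the roles of items.py and pipelines.py more concrete, here is a minimal sketch of what the two files might contain. The names `NewsItem` and `CleanTextPipeline` are hypothetical and do not come from this repository; the sketch also assumes Scrapy 2.2 or later, which accepts dataclass objects as items.

```python
from dataclasses import dataclass


# items.py: describes the shape of one scraped record.
# NewsItem is a hypothetical example, not a class from this repository.
@dataclass
class NewsItem:
    title: str = ""
    url: str = ""


# pipelines.py: Scrapy calls process_item() once for every item a spider yields.
# CleanTextPipeline is likewise a hypothetical name.
class CleanTextPipeline:
    def process_item(self, item, spider):
        # Collapse runs of whitespace left over from the HTML source.
        item.title = " ".join(item.title.split())
        # Returning the item passes it on to the next pipeline or the exporter.
        return item
```

A pipeline only runs once it is enabled in settings.py, for example with `ITEM_PIPELINES = {"tutorial.pipelines.CleanTextPipeline": 300}`, where the number sets the order in which pipelines run (lower values first).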

Features

Getting Started with Scrapy

Commands

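scrapy startproject project_name # create a new project skeleton
scrapy crawl spider_name # run the spider whose name attribute is spider_name
scrapy crawl spider_name -o file.csv # export items as CSV (format is inferred from the file extension; -t is deprecated in recent Scrapy releases)
scrapy crawl spider_name -o file.json # export items as JSON
scrapy shell "url" # fetch the URL and open an interactive shell for testing selectors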

Environment

Using conda environments

conda env list # list all environments
conda create --name Scrapy python # if Scrapy is NOT listed (python is included so the environment gets its own pip)
conda activate Scrapy # if Scrapy is listed
pip install -r requirements.txt # install dependencies into the active environment
conda list # all packages in env
conda deactivate # deactivate current environment

Resources