claromes / volleystats

šŸ Command-line tool to scrape volleyball statistics from Data Project Web Competition websites
https://pypi.org/project/volleystats
GNU General Public License v3.0
data-project data-volley python scraping scrapy sports-data volleyball

Volley Stats


Command-line tool to scrape volleyball statistics from Data Project Web Competition websites.

Volley Stats exports data from volleyball matches and competitions to CSV, for events organized by entities that use Data Project WCM. The tool collects individual matches and competition match lists, and it automates the retrieval of individual match data from a competition's match list.

The documentation also describes the URL structure of Web Competition websites, which simplifies finding the identifiers (mID, ID, PID), and lists acronyms for the main entities that use Data Project Management.

This tool is not affiliated with Genius Sports Italy.

Installation

Requirement

pip install volleystats

Documentation

Extracted Data

Usage

volleystats [--help] --fed FED (--match MATCH | --comp COMP | --batch CSV_FILE_PATH) [--pid PID] [--log]
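The synopsis above says that --fed is always required, that exactly one of --match, --comp, or --batch must be given, and that --pid and --log are optional. As a reading aid, here is a minimal argparse sketch of that grammar. It is an assumption-based illustration built only from the synopsis, not the tool's actual source; the function name build_parser and the help text are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented synopsis; not volleystats' own code."""
    parser = argparse.ArgumentParser(prog="volleystats")
    parser.add_argument("--fed", required=True, metavar="FED",
                        help="federation acronym")
    # Exactly one of --match, --comp, --batch must be supplied.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--match", metavar="MATCH")
    mode.add_argument("--comp", metavar="COMP")
    mode.add_argument("--batch", metavar="CSV_FILE_PATH")
    # Optional arguments shown in square brackets in the synopsis.
    parser.add_argument("--pid", metavar="PID")
    parser.add_argument("--log", action="store_true")
    return parser


args = build_parser().parse_args(["--fed", "cbv", "--match", "1623"])
print(args.fed, args.match, args.comp, args.log)  # prints: cbv 1623 None False
```

With this grammar, passing both --match and --comp, or none of the three modes, is rejected by argparse with a usage error.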

Match

volleystats --fed FED --match MATCH

Examples

Competition Matches

volleystats --fed FED --comp COMP

Example

Competition Matches with PID

In some competitions, the PID distinguishes between seasons, such as the regular season and the playoffs. In those cases, pass the PID to obtain the statistics for each one separately.

volleystats --fed FED --comp COMP --pid PID

Examples

Matches via Competition Matches file

volleystats --fed FED --batch CSV_FILE_PATH

Example

Help

volleystats --help

Log

volleystats --fed FED (--match MATCH | --comp COMP | --batch CSV_FILE_PATH) --log

Output messages

                    .
                    |`.
                    |  `.
                    |-_  `.
                    |  -_  `._
____________________|____-_ _|_______________,
',                         -_|                ',
  ',                         |                  ',
    ',                       |                    ',
      ',_____________________|______________________',

volleystats: started
volleystats: data/cbv-1623-22-10-28-home-fluminense.csv file was created
volleystats: data/cbv-1623-22-10-28-guest-baruerivolleyballclub.csv file was created
volleystats: finished
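The two file names in the sample output suggest a naming pattern: federation acronym, match ID, three two-digit groups, the side (home or guest), and a team name. The sketch below splits a path along that apparent pattern. The pattern is inferred from these two samples only, so treat it as an assumption; in particular, reading the three two-digit groups as a date is a guess, and parse_output_name is a hypothetical helper, not part of volleystats.

```python
import re
from pathlib import PurePosixPath
from typing import Dict, Optional

# Pattern inferred from the two sample file names only; other naming schemes may exist.
FILENAME_RE = re.compile(
    r"^(?P<fed>[^-]+)-(?P<match_id>\d+)-(?P<date>\d{2}-\d{2}-\d{2})"
    r"-(?P<side>home|guest)-(?P<team>.+)\.csv$"
)


def parse_output_name(path: str) -> Optional[Dict[str, str]]:
    """Split an output path into its apparent parts, or return None if it does not match."""
    match = FILENAME_RE.match(PurePosixPath(path).name)
    return match.groupdict() if match else None


info = parse_output_name("data/cbv-1623-22-10-28-home-fluminense.csv")
print(info)
```

Names that do not follow the inferred pattern yield None, so callers can skip them rather than misread them.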

Data Project Web Competition URLs structure

Federations, Confederations and Leagues Acronyms

European Volleyball

South American Volleyball

Troubleshooting

Match files collected from batch file

In some cases, empty files are returned, usually named <fed_acronym>-<match_id>-guest_stats.csv and <fed_acronym>-<match_id>-home_stats.csv. This happens when a match is hidden from the competition listing, either because it was canceled or because it was entered incorrectly. The match is hidden from view but remains accessible in the HTML, so the tool returns an empty file. In such cases, simply ignore and delete the file.

The data may also be available only in PDF, which makes scraping impossible.
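After a batch run, the empty files described above can be located with a short script before deleting them. The sketch below is only an illustration under stated assumptions: it treats a file as "empty" when it has zero bytes or at most one non-blank row (assumed to be a header), and it looks in the data directory shown in the output messages. The helper names is_empty_csv and find_empty_csvs are hypothetical.

```python
import csv
from pathlib import Path
from typing import List


def is_empty_csv(path: Path) -> bool:
    """True if the file has no bytes or at most one non-blank row (assumed header)."""
    if path.stat().st_size == 0:
        return True
    with path.open(newline="", encoding="utf-8", errors="replace") as handle:
        rows = [row for row in csv.reader(handle) if any(cell.strip() for cell in row)]
    return len(rows) <= 1


def find_empty_csvs(data_dir: str = "data") -> List[Path]:
    """Return the CSV files directly under data_dir that look empty, sorted by name."""
    directory = Path(data_dir)
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.glob("*.csv") if is_empty_csv(p))


if __name__ == "__main__":
    for path in find_empty_csvs():
        print(f"empty: {path}")
        # path.unlink()  # uncomment to delete after reviewing the list
```

The deletion line is left commented out so that the list can be reviewed first.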

Development

$ git clone git@github.com:claromes/volleystats.git

$ cd volleystats

$ pip install -r requirements.txt

$ pip install --editable .

Author

Claromes