Open srossross opened 1 month ago
summary
author
license
requires
dev_status
first_release
last_release
releases
monthly_downloads_pypi
name
version
summary
author
license
maintainer
maintainer_email
requires
dev_status
first_release
last_release
releases
monthly_downloads_pypi
monthly_downloads_conda
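As a rough illustration of how the field list above could be enforced on scraped records, here is a minimal sketch. The names `COLUMNS` and `normalize_record` are hypothetical and not part of `score`; the sketch also assumes that a field missing from a scraped page is stored as `None`.

```python
# Column names taken from the field list above; the order is an assumption.
COLUMNS = (
    "name",
    "version",
    "summary",
    "author",
    "license",
    "maintainer",
    "maintainer_email",
    "requires",
    "dev_status",
    "first_release",
    "last_release",
    "releases",
    "monthly_downloads_pypi",
    "monthly_downloads_conda",
)


def normalize_record(scraped: dict) -> dict:
    """Return a record with exactly the expected columns.

    Hypothetical helper: unknown keys are dropped and missing
    fields are filled with None so every row has the same shape.
    """
    return {column: scraped.get(column) for column in COLUMNS}
```

Normalizing every row this way keeps the output dataset's columns stable even when a package page omits some fields.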
python -m score.cli scrape-pypi-web --letter 0-9
Here scrape-pypi-web invokes the web scraper, and the optional --letter argument specifies the range of letters that you would like to scrape.
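How the `--letter` value is parsed is not shown in this thread. The sketch below is one possible interpretation, assuming a value is either a single character or an inclusive `start-end` range such as `0-9`; `expand_letter_range` is a hypothetical helper, not a function from `score`.

```python
def expand_letter_range(spec: str) -> list[str]:
    """Expand a spec such as "0-9" or "a-c" into individual characters.

    Assumption: a spec is a single character or an inclusive
    "start-end" range of single characters.
    """
    if len(spec) == 1:
        return [spec]
    start, sep, end = spec.partition("-")
    if sep != "-" or len(start) != 1 or len(end) != 1 or ord(start) > ord(end):
        raise ValueError(f"invalid letter range: {spec!r}")
    # Walk the code points from start to end, inclusive.
    return [chr(code) for code in range(ord(start), ord(end) + 1)]
```

Under that assumption, `--letter 0-9` would cover the ten characters `0` through `9`, one output partition per character.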
The output destination would be as follows:
./score/output/web/letter={letter}/pypi_packages.parquet
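The path template above can be sketched with `pathlib`. The function name `output_path` and its `root` parameter are hypothetical; only the directory layout and the file name come from the path shown above.

```python
from pathlib import Path


def output_path(letter: str, root: Path = Path("./score/output/web")) -> Path:
    """Build the parquet path for one letter partition.

    Hypothetical helper mirroring the layout
    ./score/output/web/letter={letter}/pypi_packages.parquet
    """
    return root / f"letter={letter}" / "pypi_packages.parquet"
```

The `letter={letter}` directory name follows the common `key=value` partitioning convention, so tools that understand partitioned parquet datasets can usually recover `letter` as a column when reading the whole `web` directory.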
Scrape PyPI for all the information needed in the research phase that cannot be obtained from the PyPI JSON API.
The dataset output format and location should be documented.
This should be a new command, like:
python -m score.cli pypi-web