sethmlarson / pypi-data

Data about packages and maintainers on PyPI
Apache License 2.0

Latest release doesn't seem to match the schemas in main.py? #12

Closed by jcrist 2 years ago

jcrist commented 2 years ago

I'm not sure if an old file was uploaded, or the code in main.py wasn't used, but the schemas for the tables in https://github.com/sethmlarson/pypi-data/releases/download/2022.09.12/pypi.db.gz match the ones from before #11.

Browsing the latest release:

$ sqlite3 pypi.db 
SQLite version 3.39.0 2022-06-25 14:57:57
Enter ".help" for usage hints.
sqlite> .schema packages
CREATE TABLE packages (
    name STRING,
    version STRING,
    requires_python STRING,
    yanked BOOLEAN DEFAULT FALSE,
    has_binary_wheel BOOLEAN,
    uploaded_at TIMESTAMP,
    recorded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    downloads INTEGER,
    scorecard_overall FLOAT,
    PRIMARY KEY (name)
  );
CREATE INDEX idx_packages_name ON packages (name);

Based on the code in main.py, I'd expect columns such as name and version to now be of type TEXT rather than STRING (which isn't a recognized SQLite type name, so those columns get NUMERIC affinity).
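To illustrate why the declared type matters, here is a minimal sketch using Python's standard sqlite3 module. The table names t_string and t_text are hypothetical and exist only for this demonstration; they are not part of pypi.db. A column declared STRING falls through SQLite's affinity rules to NUMERIC, so a version-like value such as "1.0" is stored as a number, while a TEXT column keeps it as text.

```python
import sqlite3

# In-memory database with two hypothetical one-column tables that differ
# only in the declared column type.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_string (version STRING)")  # NUMERIC affinity
conn.execute("CREATE TABLE t_text (version TEXT)")      # TEXT affinity

# Insert the same Python string into both tables.
conn.execute("INSERT INTO t_string VALUES (?)", ("1.0",))
conn.execute("INSERT INTO t_text VALUES (?)", ("1.0",))

# typeof() reports the storage class SQLite actually used for each value.
row_string = conn.execute(
    "SELECT version, typeof(version) FROM t_string"
).fetchone()
row_text = conn.execute(
    "SELECT version, typeof(version) FROM t_text"
).fetchone()

print(row_string)  # value was coerced to a number
print(row_text)    # value is kept as the original text
```

Values that do not look like numbers (for example "1.0.1") are stored as text in either table, so the coercion only affects strings that parse as integer or real literals.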

sethmlarson commented 2 years ago

Maybe I forgot to pull the latest changes. I'll look into this tomorrow.

sethmlarson commented 2 years ago

A new release that matches the schema is available: https://github.com/sethmlarson/pypi-data/releases/tag/2022.10.12
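One way to check a downloaded release for leftover STRING columns is to read the declared types back from SQLite. The sketch below is not from the repository: the helper name columns_with_declared_type is hypothetical, and it runs against a small in-memory stand-in table rather than the real pypi.db. It assumes SQLite 3.16 or newer, which provides the pragma_table_info table-valued function.

```python
import sqlite3


def columns_with_declared_type(conn, table, declared_type="STRING"):
    """Return names of columns in `table` whose declared type matches."""
    rows = conn.execute(
        "SELECT name, type FROM pragma_table_info(?)", (table,)
    ).fetchall()
    return [name for name, col_type in rows if col_type.upper() == declared_type]


# Stand-in table for demonstration only; to inspect a real release, open the
# decompressed file instead, e.g. sqlite3.connect("pypi.db").
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE packages ("
    "name STRING, version TEXT, downloads INTEGER, PRIMARY KEY (name))"
)

stale = columns_with_declared_type(conn, "packages")
print(stale)  # columns still declared as STRING
```

An empty list for every table would suggest the file was built with the updated schema.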