Since anyone can upload a package to PyPI, malicious users might upload malware, which would then harm users. To mitigate this risk, PSF previously obtained funding to add some malware detection in Warehouse in late 2019, but the goals for the relevant milestone were more ambitious than funding allowed for. The malware detection system is currently in limbo: an interesting prototype with limited practical impact because of the astounding number of false-positives. To protect users from malware, we still need to:
Make Malware Verdicts Auditable - Right now, verdicts are removed once the associated project/package is removed. We need to change the backend to retain past verdicts so we can evaluate and improve the efficacy of this system.
Implement a more robust malware detector - One current check relies on simple pattern matching with YARA. A better approach requires parsing the package code into an AST.
Release a canonical dataset for developers to write their checks against, and iterate on to improve detection accuracy.
We also want to set up a partnership with VirusTotal or a similar third-party virus checking service during the check development to scan every uploaded package. Integration with a third-party virus scanner is low-hanging fruit that could move the needle on PyPI package security.
Funding would be used for backend development, security engineering, project management, system administration, and publicity to stakeholders. Ideally, AV integrations would be donated by the vendors.
Since anyone can upload a package to PyPI, malicious users might upload malware, which would then harm users. To mitigate this risk, PSF previously obtained funding to add some malware detection in Warehouse in late 2019, but the goals for the relevant milestone were more ambitious than funding allowed for. The malware detection system is currently in limbo: an interesting prototype with limited practical impact because of the astounding number of false-positives. To protect users from malware, we still need to:
We also want to set up a partnership with VirusTotal or a similar third-party virus checking service during the check development to scan every uploaded package. Integration with a third-party virus scanner is low-hanging fruit that could move the needle on PyPI package security.
Funding would be used for backend development, security engineering, project management, system administration, and publicity to stakeholders. Ideally, AV integrations would be donated by the vendors.