webrecorder / cdxj-indexer

CDXJ Indexing of WARC/ARCs
Apache License 2.0

'cgi' is deprecated and slated for removal in Python 3.13 #23

Closed by benoit74 1 week ago

benoit74 commented 8 months ago

The codebase needs to be adapted because the cgi module has been deprecated since Python 3.11 and is slated for removal in 3.13.

benoit74 commented 8 months ago

cgi is only used for handling POST/PUT requests.

The documentation at https://docs.python.org/3.11/library/cgi.html suggests looking at the multipart package on PyPI (https://pypi.org/project/multipart/):

Deprecated since version 3.11, will be removed in version 3.13: The cgi module is deprecated (see PEP 594 for details and alternatives).

The FieldStorage class can typically be replaced with urllib.parse.parse_qsl() for GET and HEAD requests, and the email.message module or multipart for POST and PUT. Most utility functions have replacements.
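For reference, the stdlib route mentioned in that note could look roughly like the sketch below. This is not what cdxj-indexer does today, and the helper names `parse_urlencoded` and `parse_multipart` are made up for illustration. It assumes the request body is already fully read into memory, and that all form fields are plain text (a binary file part would come back from `get_content()` as bytes instead of str).

```python
from email import message_from_bytes
from email.policy import default
from urllib.parse import parse_qsl


def parse_urlencoded(body: bytes) -> list[tuple[str, str]]:
    """Parse an application/x-www-form-urlencoded body into (name, value) pairs."""
    # keep_blank_values=True keeps fields such as "b=" instead of dropping them.
    return parse_qsl(body.decode("utf-8"), keep_blank_values=True)


def parse_multipart(content_type: str, body: bytes) -> list[tuple[str, str]]:
    """Parse a multipart/form-data body using the stdlib email parser.

    content_type is the request's Content-Type header value, including the
    boundary parameter.
    """
    # The email parser expects headers first, so prepend the Content-Type
    # header (which carries the boundary) to the raw body.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    msg = message_from_bytes(raw, policy=default)

    fields = []
    for part in msg.iter_parts():
        # The field name lives in the part's Content-Disposition header.
        name = part.get_param("name", header="content-disposition")
        fields.append((name, part.get_content()))
    return fields
```

Unlike the multipart package, this approach has no size limits and does not spill large uploads to disk, so it is probably only suitable when bodies are known to be small.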

https://github.com/defnull/multipart looks promising:

Features

  • Parses multipart/form-data and application/x-www-form-urlencoded.
  • Produces useful error messages in 'strict'-mode.
  • Gracefully handle uploads of unknown size (missing Content-Length header).
  • Fast memory mapped files (io.BytesIO) for small uploads.
  • Temporary files on disk for big uploads.
  • Memory and disk resource limits to prevent DOS attacks.
  • Fixes many shortcomings and bugs of cgi.FieldStorage.
  • 100% test coverage.