webrecorder / pywb

Core Python Web Archiving Toolkit for replay and recording of web archives
https://pypi.python.org/pypi/pywb
GNU General Public License v3.0
1.34k stars, 207 forks

Canonicalize non-GET URLs with native JSON values #859

Open tw4l opened 11 months ago

tw4l commented 11 months ago

When pywb canonicalizes POST (and other non-GET) requests, it writes Python literals such as True, False, and None into the rewritten URL. Ideally it should emit the valid JSON values true, false, and null instead.
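
To illustrate the difference, here is a minimal sketch, not pywb's actual implementation. It assumes the request body is a flat JSON object that gets folded into a query string; `flatten_json_body` and its `native_json` flag are hypothetical names used only for this example.

```python
import json
from urllib.parse import urlencode


def flatten_json_body(body: bytes, native_json: bool = True) -> str:
    """Hypothetical helper: fold a flat JSON object into a query string."""
    data = json.loads(body)
    pairs = []
    for key, value in data.items():
        if isinstance(value, str):
            # Strings are used as-is in both modes.
            text = value
        elif native_json:
            # json.dumps keeps JSON spellings: true, false, null.
            text = json.dumps(value)
        else:
            # str() leaks Python spellings: True, False, None.
            text = str(value)
        pairs.append((key, text))
    return urlencode(pairs)


body = b'{"active": true, "deleted": false, "parent": null, "count": 3}'

print(flatten_json_body(body, native_json=False))
# active=True&deleted=False&parent=None&count=3

print(flatten_json_body(body, native_json=True))
# active=true&deleted=false&parent=null&count=3
```

The only change between the two modes is how non-string values are serialized, which is why tools that do not share Python's literal spellings would produce a different canonical URL for the same request.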

This is part of making POST canonicalization consistent across Webrecorder tools. Related to https://github.com/webrecorder/specs/issues/141

tw4l commented 11 months ago

We will also need to ensure that fuzzy matching in pywb and wabac.js still works with URLs previously created by pywb, or develop a process to reindex as necessary.
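
One possible shape for such a compatibility step, sketched under assumptions: legacy query values that are exactly `True`, `False`, or `None` are mapped to their JSON spellings before comparison or reindexing. `normalize_legacy_query` and `LEGACY_TO_JSON` are hypothetical names, not part of pywb or wabac.js. A string value that was literally "True" in the original body would also be rewritten, so this mapping is lossy and would need care.

```python
from urllib.parse import parse_qsl, urlencode

# Assumed mapping from Python literal spellings to JSON spellings.
LEGACY_TO_JSON = {"True": "true", "False": "false", "None": "null"}


def normalize_legacy_query(query: str) -> str:
    """Hypothetical helper: rewrite exact-match legacy values in a query string."""
    pairs = parse_qsl(query, keep_blank_values=True)
    # Only values that match a legacy literal exactly are replaced.
    return urlencode([(key, LEGACY_TO_JSON.get(value, value)) for key, value in pairs])


print(normalize_legacy_query("active=True&parent=None&name=True+Story"))
# active=true&parent=null&name=True+Story
```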