webrecorder / pywb

Core Python Web Archiving Toolkit for replay and recording of web archives
https://pypi.python.org/pypi/pywb
GNU General Public License v3.0
1.42k stars 217 forks source link

Use native JSON values in query string for POST canonicalization #893

Closed tw4l closed 2 weeks ago

tw4l commented 8 months ago

Fixes #859

Description

Modifies the output of POST canonicalization query strings in pywb to use native JSON values for booleans, numbers, and null, rather than a string representation of their Python values.

This commit also adds a more complicated JSON test case that is also in warcio.js to ensure parity.

We now handle numbers like JavaScript's Number.prototype.toString() by dropping decimal from floats if they represent whole number, to ensure consistency between pywb and warcio.js.

Motivation and context

This is part of a cross-repo effort to standardize how POST canonicalization works in Webrecorder tools, and document this in a Webrecorder specfiication.

Testing notes

Seems to be working well with fuzzy matching on replay, but could use some additional testing.

Types of changes

Checklist: