Closed davidism closed 3 weeks ago
https://sourcegraph.com/search?q=context:global+lang:Python+werkzeug+OrderedMultiDict+-file:venv+-file:site-packages+-file:werkzeug+-repo:pallets/werkzeug+&patternType=keyword&sm=0 shows ~24 repos using OrderedMultiDict
. Browsing through them, most uses do not seem necessary.
I did see a couple cases where knowing the exact order that request.args
or request.form
was parsed in was important. But in that case, it's more straightforward to compare the exact data sent rather than the parsed version:
request.query_string
is the exact value from the URL that gets parsed into request.args
request.get_data()
called before accessing request.form
returns the exact request body body@ThiefMaster Indico uses it in a few places: https://github.com/search?q=repo%3Aindico%2Findico%20orderedmultidict&type=code, thoughts? It's not clear that it's needed.
I think parameter_storage_class = ImmutableOrderedMultiDict
(on the Request subclass) is the most interesting one, because I added that one due to how crappy the PayPal webhook validation is (they require you to send back data you received in the exact same order).
IIRC there was either something in the docs on on some SO post recommending this, so there may be more people using it when they have a paypal integration in their flask app.
request.get_data()
called before accessing request.form returns the exact request body body
FWIW depending on the architecture of an application (especially when plugins are involved) it may be hard to ensure that gets called before anything else accesses the form data.
The other two places where it's used are most likely leftovers from before dict
was guaranteed to be ordered by insertion order, so if the standard MultiDict already does that now as well it's enough.
PayPal IPN was one use case I saw, seems like a weird API design. The other use case I found seemed to be trying to check that a URL was canonical based not only on the query args but their exact order in the URL.
I know it's not great to have to change it, but doing something like the following would probably be enough for the PayPal case, if an app can't guarantee that the data is cached first. The query args case could probably just look at query_string
or parse_qsl(query_string)
instead.
class StoredImmutableMultiDict(ImmutableMultiDict):
def __init__(self, mapping):
assert isinstance(mapping, list)
self.original_items: list[tuple[str, str]] = mapping
super().__init__(mapping)
parameter_storage_class = StoredImmutableMultiDict
Then you can access form.original_items
if needed. Even seems preferable to OrderedMultiDict
, which is slower and uses even more memory.
I did see that Werkzeug's documentation for parameter_storage_class
mentions ImmutableOrdereMultiDict
as an option, so that's probably where the advice came from.
FWIW I don't think paypal has any multi-value keys, so probably a normal MultiDict is enough nowadays...
Oh yeah, that level of ordering only matters if there are actually multiple values per key, and they're submitted with keys interleaved. Seems exceedingly unlikely.
OrderedMultiDict
is not used by Werkzeug. It has a complex implementation, and is documented to be slow.dict
maintains key insertion order in all supported versions of Python.MultiDict
already maintains the insertion order of values within a key.OrderedMultiDict
maintains the insertion order of each item, so[(a, 1), (b, 2), (a, 3)]
comes out the same instead of[(a, 1), (a, 3), (b, 2)]
. It's not clear there's a use case for this, searching sourcegraph shows some use ofOrderedMultiDict
but it's not clear if the use is actually required vs usingMultiDict
. Other libraries such as boltons also provide implementations. Mark this as deprecated and see if anything comes up.