informatics-isi-edu / hatrac

Simple object storage for collaborations
Apache License 2.0

Bad url-encoding behaviors #42

Closed karlcz closed 11 months ago

karlcz commented 7 years ago

The current service implementation has flaws in its handling of url-encoded characters:

  1. Some reserved characters such as / and : are allowed in neither namespace names nor object names, even when safely encoded by the client. This seems unfriendly, as clients should be able to expect url-encoding to work consistently to protect any UTF-8 input they wish to embed in client-generated names.
  2. Other reserved characters such as & are allowed but are conflated with their encoded forms. For example, hatrac does not treat the two object names test&name.txt and test%26name.txt as distinct, although the relevant RFCs say they should be.
  3. Some illegal characters are passed through Apache HTTPD and accepted by hatrac when they should be rejected, e.g. < is neither reserved nor unreserved and should never appear unencoded in a valid URL.
  4. Degenerate cases like %00 are rejected but with unhelpful error messages.
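
The four cases above can be illustrated with a minimal Python sketch. This is not hatrac's code: the helper names (encode_name, normalize_segment, split_raw_path) are hypothetical, and the sketch assumes the service can see the raw, still-encoded request path. It splits on literal / before decoding each segment, keeps encoded reserved characters distinct from their literal forms, and rejects illegal characters and %00 with explicit messages.

```python
import re
from urllib.parse import quote, unquote

# Characters RFC 3986 allows in a path segment (pchar): unreserved,
# sub-delims, ":" and "@", plus well-formed percent-escapes.
_SEGMENT_RE = re.compile(r"^(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})*$")
_UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)


def encode_name(name):
    """Client side: percent-encode everything except unreserved characters."""
    return quote(name, safe="")


def normalize_segment(raw):
    """Decode only escapes of unreserved characters; uppercase the rest.

    Escaped reserved characters stay escaped, so "%26" and "&" remain distinct.
    """
    def fix(match):
        ch = chr(int(match.group(1), 16))
        return ch if ch in _UNRESERVED else "%" + match.group(1).upper()

    return re.sub(r"%([0-9A-Fa-f]{2})", fix, raw)


def split_raw_path(raw_path):
    """Split on literal "/" first, then validate and decode each segment."""
    segments = []
    for raw in raw_path.strip("/").split("/"):
        if not _SEGMENT_RE.match(raw):
            raise ValueError("illegal character in path segment: %r" % raw)
        if "%00" in raw:
            raise ValueError("encoded NUL (%00) is not allowed in names")
        # errors="strict" raises on escapes that are not valid UTF-8
        segments.append(unquote(raw, errors="strict"))
    return segments
```

With these assumptions, encode_name("a/b:c") yields a single opaque segment that split_raw_path decodes back to a/b:c, while normalize_segment gives different keys for test&name.txt and test%26name.txt.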

To improve this, it seems we should make several related changes:

And optionally:

This last test would be unnecessary for correct hatrac service function or safety, but might improve client safety in situations where clients use decoded URL elements in contexts that are not well guarded against unusual values.

karlcz commented 7 years ago

@hongsudt @bugacov @robes @ljpearlman @svoinea any comments or concerns?

karlcz commented 11 months ago

The Flask refactor has addressed these concerns.