rr- / szurubooru

Image board engine, Danbooru-style.
GNU General Public License v3.0

Accidentally added %0A ( \n newline) at the end of some tags #669

Open nocrcl opened 4 months ago

nocrcl commented 4 months ago

I made a bash script to mass-upload images using the API and tag them using a string of tags stored in each image's metadata. It's very hacky, but it kind of works.

While working out how to trim whitespace characters, I accidentally created some tags with a newline character at the end, e.g. myszuruip/tag/cheese%0A. When I try to click such a tag in the tags view, it says "Requested path /tag/cheese was not found.", which makes it impossible to rename or delete these tags using the client. I tried using curl to GET the tag (and, on success, PUT an edit or DELETE it), but it converts the %0A into \n:

curl -X GET -H "Authorization: Token blablabla=" \
    -H "Content-Type: application/json" \
    -H "Accept: application/json" \
    myszuruip/api/tag/cheese%0A
{
  "name": "ValidationError",
  "title": "Not Found",
  "description": "Requested path /tag/cheese\n was not found."
}

BTW: Is it considered a bug that invalid characters can be added through the API?
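For the whitespace-trimming step in an upload script like the one described above, a minimal Python sketch could look like the following. It assumes the metadata string holds whitespace-separated tags; `clean_tags` is a hypothetical helper name, not part of szurubooru or of the original bash script.

```python
def clean_tags(raw: str) -> list[str]:
    """Split a metadata string into tag names with no stray whitespace."""
    # str.split() with no arguments splits on any run of whitespace
    # (spaces, tabs, \n, \r) and never yields empty strings, so a
    # trailing newline cannot end up inside a tag name.
    seen = set()
    tags = []
    for tag in raw.split():
        if tag not in seen:  # drop duplicates, keep first-seen order
            seen.add(tag)
            tags.append(tag)
    return tags


print(clean_tags("cheese\n  wine\tbread cheese\r\n"))
# prints ['cheese', 'wine', 'bread']
```

Cleaning the names before they are sent means the client does not rely on the server rejecting them.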

BloodyRain2k commented 3 months ago

I'm not familiar with poking at the API this way, but have you tried encoding the %0A so that the % itself is URI-encoded, i.e. myszuruip/api/tag/cheese%250A instead?
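To make the difference between the two forms concrete, here is a small Python sketch using only the standard library. It shows what each path segment decodes to after a single round of percent-decoding; whether the server decodes the path exactly once is an assumption here.

```python
from urllib.parse import quote, unquote

name_with_newline = "cheese\n"

# Encoding the raw name once turns the newline into %0A.
once = quote(name_with_newline, safe="")
# Encoding that result again turns the % itself into %25.
twice = quote(once, safe="")

print(once)   # cheese%0A
print(twice)  # cheese%250A

# One round of decoding on each form:
print(repr(unquote(once)))   # 'cheese\n'  (a real newline character)
print(repr(unquote(twice)))  # 'cheese%0A' (the literal characters %, 0, A)
```

So cheese%250A would address a tag whose name literally ends in the three characters "%0A", while cheese%0A addresses a name ending in an actual newline.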

And I'd agree, there should be a sanity check for characters before adding tags. From a quick look, it seems like https://github.com/rr-/szurubooru/blob/d102578b544226207fac055d0c6ed1a45a12e471/server/szurubooru/func/tags.py#L182 would be a good start to implement one.

Currently I'm in the process of bending other parts of my instance into shape, but eventually this is something I'd want to have too. Because until now, I too wouldn't have expected illegal characters to go through uncaught.

G1org1owo commented 3 months ago

This is already addressed in https://github.com/rr-/szurubooru/blob/d102578b544226207fac055d0c6ed1a45a12e471/server/szurubooru/func/tags.py#L39-L44 which is called in the process of creating a new tag.

Assuming the tag regex was not modified, I am led to believe that the illegal character in question was in fact not a newline, as a newline would have failed to match the expression ^[^\s%+#/]+$ and thus produced an error.
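One detail worth checking when reasoning about that expression: in Python's `re` module, `$` matches at the very end of the string and also just before a single trailing newline. The sketch below runs the quoted pattern against "cheese\n" in three ways. How the server actually applies the pattern (`re.match`, `re.fullmatch`, or something else) is an assumption here; the linked source is the authority.

```python
import re

TAG_NAME_REGEX = r"^[^\s%+#/]+$"  # the pattern quoted above
name = "cheese\n"

# re.match with "$": "$" is allowed to match just before a trailing
# newline, so the name is accepted even though \s is excluded.
print(bool(re.match(TAG_NAME_REGEX, name)))       # True

# re.fullmatch requires the whole string to be consumed, so the
# trailing newline makes the match fail.
print(bool(re.fullmatch(r"[^\s%+#/]+", name)))    # False

# "\Z" matches only at the true end of the string.
print(bool(re.match(r"^[^\s%+#/]+\Z", name)))     # False
```

If the validation uses `re.match` with this pattern, a single trailing newline would pass, while whitespace anywhere else in the name would still be rejected.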

If you query the API for the tag list, e.g.

curl -X GET \
       -H 'Accept: application/json' \
       -H 'Authorization: Token blablabla=' \
       'myszuruip/api/tags?query=cheese*'

what name does it return?
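A trailing newline is easy to miss in raw terminal output, so it can help to print each returned name with `repr()`. The Python sketch below does that for a saved response body. It assumes the tag list response is a JSON object with a "results" array whose items each carry a "names" list; `show_tag_names` and the sample payload are made up for illustration and are not real server output.

```python
import json


def show_tag_names(response_text: str) -> list[str]:
    """Return repr() of every tag name so hidden characters become visible."""
    payload = json.loads(response_text)
    shown = []
    for tag in payload.get("results", []):
        for name in tag.get("names", []):
            shown.append(repr(name))
    return shown


# Hypothetical response body: one clean tag and one with a trailing newline.
sample = '{"results": [{"names": ["cheese"]}, {"names": ["cheese\\n"]}]}'
for line in show_tag_names(sample):
    print(line)
# prints:
# 'cheese'
# 'cheese\n'
```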