Novik / ruTorrent

Yet another web front-end for rTorrent

Bug with UTF-8 encoding on labels and AutoTools #2520

Open phene opened 1 year ago

phene commented 1 year ago

Please complete the following tasks.

Tell us about your environment

  - Web Browser: Firefox 113.0.2 (64-bit) (also occurs in Chrome)
  - ruTorrent: v4.0-stable
  - PHP: 8.1.7
  - OS: Ubuntu 22.10

Tell us how you installed ruTorrent

Not relevant

Describe the bug

Re-using an existing label with UTF-8 characters causes the string to be re-encoded incorrectly. This combined with AutoTools causes the folder that the data is written into to change.

Steps to reproduce

  1. Configure AutoTools to move files into a folder based on the label name
  2. Add torrent and assign the label 엄마.
  3. Observe that the file is saved into a folder called 엄마
    "엄마".bytes
    => [236, 151, 132, 235, 167, 136]
  4. Add a new torrent and assign the existing 엄마 label to it.
  5. Observe that the new file is saved into a folder called 엄마, with the following bytes.
    "엄마".bytes
    => [225, 132, 139, 225, 133, 165, 225, 134, 183, 225, 132, 134, 225, 133, 161]
  6. Observe that the formatting of the Korean characters is not correct (screenshot: bad-korean).

Note: the oddly encoded string does not survive copy-paste into the browser, but does in and out of a terminal.
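
The two byte sequences in the steps above appear to be the NFC (precomposed) and NFD (decomposed) Unicode normalization forms of the same text, which would also explain why a browser copy-paste silently "repairs" the string. A minimal Ruby sketch to check this, assuming the bytes are exactly as reported; the variable names are illustrative only:

```ruby
# Byte sequences copied from the reproduction steps.
first_bytes  = [236, 151, 132, 235, 167, 136]
second_bytes = [225, 132, 139, 225, 133, 165, 225, 134, 183,
                225, 132, 134, 225, 133, 161]

# Rebuild the strings from raw bytes so no literal can be altered in transit.
first  = first_bytes.pack("C*").force_encoding(Encoding::UTF_8)
second = second_bytes.pack("C*").force_encoding(Encoding::UTF_8)

# Both sequences are well-formed UTF-8; they differ only in normalization form.
both_valid     = first.valid_encoding? && second.valid_encoding?
first_is_nfc   = first.unicode_normalized?(:nfc)
second_is_nfd  = second.unicode_normalized?(:nfd)
same_after_nfd = first.unicode_normalize(:nfd).bytes == second_bytes
same_after_nfc = second.unicode_normalize(:nfc).bytes == first_bytes

puts "both valid UTF-8:        #{both_valid}"
puts "first is NFC:            #{first_is_nfc}"
puts "second is NFD:           #{second_is_nfd}"
puts "NFD(first) == second:    #{same_after_nfd}"
puts "NFC(second) == first:    #{same_after_nfc}"
puts "code points: #{first.codepoints.length} vs #{second.codepoints.length}"
```

If every check prints `true`, the label text is the same in both cases and only its normalization form changed somewhere between the first and second use.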

Expected behavior

I would expect the string for the label to stay the same in order to consistently put the content in the same folder.

Additional context

No response

stickz commented 1 year ago

Is this bug reproducible on version 4.1.6? There were some changes made to labels in version 4.1.

As a side note, the latest v4.1.6 branch release is significantly more stable than v4.0. It's undesirable to still be using v4.0.

phene commented 1 year ago

I've reproduced this in 4.1.6.

stickz commented 1 year ago

It looks like the UTF-8 encoding is being lost. I will implement a new utility function to convert back to UTF-8. Target is v4.2.

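    // Encode a string for use in a URL, converting it to UTF-8 first.
    function url_encode($string){
        return rawurlencode(mb_convert_encoding($string, "UTF-8", "auto"));
    }

    // Decode a URL-encoded string and convert the result to UTF-8.
    function url_decode($string){
        return mb_convert_encoding(rawurldecode($string), "UTF-8", "auto");
    }

stickz commented 1 year ago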

> I've reproduced this in 4.1.6.

Are you able to reproduce this bug in v4.2.2? The autotools plugin converts to UTF-8 now, when encoding and decoding URLs.
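
One thing worth checking when retesting: both byte sequences from the report are already valid UTF-8, so a conversion to UTF-8 on its own may leave them different. The sketch below, in Ruby to match the report's snippets, compares a plain re-encode with an additional NFC normalization step. `canonical_label` is a hypothetical helper for illustration and is not part of ruTorrent or the autotools plugin.

```ruby
# Byte sequences copied from the original report.
composed   = [236, 151, 132, 235, 167, 136]
               .pack("C*").force_encoding(Encoding::UTF_8)
decomposed = [225, 132, 139, 225, 133, 165, 225, 134, 183,
              225, 132, 134, 225, 133, 161]
               .pack("C*").force_encoding(Encoding::UTF_8)

# Re-encoding UTF-8 to UTF-8 does not change the bytes, so the mismatch remains.
reencode_only_matches = composed.encode(Encoding::UTF_8).bytes ==
                        decomposed.encode(Encoding::UTF_8).bytes

# Hypothetical helper: bring a label to one canonical form (NFC) before it is
# compared with existing labels or used as a folder name.
def canonical_label(label)
  label.encode(Encoding::UTF_8).unicode_normalize(:nfc)
end

normalized_matches = canonical_label(composed).bytes ==
                     canonical_label(decomposed).bytes

puts "match after re-encoding only: #{reencode_only_matches}"
puts "match after NFC normalization: #{normalized_matches}"
```

Whether NFC or NFD is the better canonical form depends on the filesystem and on what existing folders already use; the sketch only shows that picking one form consistently makes the two labels compare equal.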

stale[bot] commented 5 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.