erengy / taiga

A lightweight anime tracker for Windows
https://taiga.moe
GNU General Public License v3.0

1.4 Beta builds broke RSS feeds with specific characters #1021

Open Asinin3 opened 3 years ago

Asinin3 commented 3 years ago

In 1.3.1, an RSS feed URL such as https://nyaa.si/?page=rss&c=1_2&f=1&q=%title%+-[SubsPlease] worked just fine for hiding results from the specified group. However, the 1.4 beta builds broke this functionality. Specifically, the +-[SubsPlease] part is what caused the problem. The workaround was to percent-encode that section of the URL, turning it into %2B-%5BSubsPlease%5D, so the full feed URL becomes https://nyaa.si/?page=rss&c=1_2&f=1&q=%title%%2B-%5BSubsPlease%5D

I'm guessing something changed internally and Taiga no longer encodes the RSS URL itself.
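
For anyone who wants to produce the encoded form without doing it by hand, here is a minimal Python sketch of the same workaround. It is only an illustration using the standard library's `urllib.parse.quote`; it is not how Taiga itself handles URLs, and the variable names are made up for this example.

```python
from urllib.parse import quote

# Search suffix from the report above; in the 1.4 beta it has to be encoded by hand.
raw_suffix = "+-[SubsPlease]"

# safe="" tells quote() to escape everything except unreserved characters
# (letters, digits, "-", ".", "_", "~").
encoded_suffix = quote(raw_suffix, safe="")

print(encoded_suffix)  # %2B-%5BSubsPlease%5D
```

The result matches the hand-encoded section quoted above.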

erengy commented 2 years ago

It's because v1.4 includes a new HTTP library that conforms to RFC 3986. [ and ] are reserved characters that are not allowed in the query component of a URL. Previous versions worked only by chance by naively decoding and re-encoding the components while parsing a URL string.
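
To make the rule concrete, here is a rough Python sketch that lists which characters in a URL's query fall outside what RFC 3986 allows there (unreserved characters, sub-delimiters, ":", "@", "/", "?" and percent-escapes). It is an illustration only, not the check Taiga's HTTP library performs. The function name `disallowed_in_query` is hypothetical, the `%title%` placeholder is assumed to have been replaced with a literal title already, and "%" is treated as allowed without verifying that two hex digits follow it.

```python
import string
from urllib.parse import urlsplit

# Character classes from RFC 3986: query = *( pchar / "/" / "?" )
UNRESERVED = string.ascii_letters + string.digits + "-._~"
SUB_DELIMS = "!$&'()*+,;="
# "%" is included as a simplification: it is only valid as part of a %XX escape.
QUERY_ALLOWED = set(UNRESERVED + SUB_DELIMS + ":@/?%")


def disallowed_in_query(url):
    """Return the sorted, distinct query characters that RFC 3986 does not allow."""
    query = urlsplit(url).query
    return sorted({ch for ch in query if ch not in QUERY_ALLOWED})


# Assumed example: the feed URL from the report with %title% replaced by "Title".
print(disallowed_in_query("https://nyaa.si/?page=rss&c=1_2&f=1&q=Title+-[SubsPlease]"))
# ['[', ']']
```

Note that "+" and "-" pass this check; only the square brackets are flagged.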

The subject is actually quite complicated, as there is no single definition of what a URL is. AFAIK most implementations try to conform to RFC 3986, while web browsers are overall more lenient in parsing. WHATWG's URL standard, along with a bunch of other "living standards", describes what some web browsers do. From what I understand, these standards don't apply well to other applications.

Asinin3 commented 2 years ago

Can Taiga automatically encode the URL instead, then?

wopian commented 2 years ago

As per RFC 3986 sections 2.1 and 6.2.2.2, HTTP libraries must percent-encode URI component octets whose character is outside of the allowed character sets.

So to fully conform to RFC 3986, [ and ] must always be encoded by the HTTP library (as browsers do too). Their percent-encoded equivalents are perfectly valid characters and have widespread usage in APIs.

Asinin3 commented 7 months ago

What I mean is: can the field that you type in accept normal plaintext, with Taiga just encoding it in the background? That would make it far simpler for users.
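
As a rough idea of what "encode it in the background" could look like, here is a minimal Python sketch. It is not Taiga's code or a proposed patch: the function name `encode_feed_url` is hypothetical, the `%title%` placeholder is assumed to have been substituted beforehand, and only the query component is touched. Characters that RFC 3986 already allows in a query, plus "%", are left alone so that URLs the user has already encoded by hand are not encoded twice.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched in the query: sub-delimiters, ":", "@", "/", "?",
# and "%" so that existing %XX escapes survive. Unreserved characters
# (letters, digits, "-", ".", "_", "~") are never escaped by quote().
QUERY_SAFE = "!$&'()*+,;=:@/?%"


def encode_feed_url(plain_url):
    """Percent-encode disallowed query characters in a user-typed feed URL."""
    parts = urlsplit(plain_url)
    encoded_query = quote(parts.query, safe=QUERY_SAFE)
    return urlunsplit(parts._replace(query=encoded_query))


typed = "https://nyaa.si/?page=rss&c=1_2&f=1&q=Title+-[SubsPlease]"
print(encode_feed_url(typed))
# https://nyaa.si/?page=rss&c=1_2&f=1&q=Title+-%5BSubsPlease%5D
```

This sketch leaves "+" as typed, because it is a legal query character. Whether the server treats "+" and "%2B" the same way is not something this thread establishes, so a real implementation would need to decide that separately.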