Character encoding & windows fixes

Sp3EdeR commented 2 years ago

Closes #205

Fixed the MPC-HC data encoding issue
Fixed the whitelist not matching paths on Windows
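
The description does not say what the whitelist mismatch actually was. As a rough sketch only, assuming it came from separator or letter-case differences between the configured entries and the paths reported by the player, a comparison through `pathlib.PureWindowsPath` could look like this (`is_whitelisted` is a hypothetical helper, not a function from the project):

```python
from __future__ import annotations

from pathlib import PureWindowsPath


def is_whitelisted(file_path: str, whitelist: list[str]) -> bool:
    """Return True if file_path lies inside any whitelisted directory.

    Hypothetical helper: PureWindowsPath compares case-insensitively and
    treats '/' and '\\' as the same separator, on any host OS.
    """
    path = PureWindowsPath(file_path)
    for entry in whitelist:
        root = PureWindowsPath(entry)
        # Match the entry itself or any of its descendants, component by
        # component, so "C:\\Videos" does not match "C:\\VideosOld".
        if path == root or root in path.parents:
            return True
    return False
```

Comparing path components rather than raw string prefixes avoids both the separator problem and false positives on sibling directories that share a prefix.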

Sp3EdeR commented 2 years ago

Thanks for making the PR! I feel that this approach is a bit too hacky.

Ideally, the HTTP library (requests) should be doing the decoding. We shouldn't have to do it manually. I'll try to look into why this isn't working out of the box.

Parsing complex HTML using regexes is a recipe for disaster. I'd prefer to use something like BeautifulSoup or the lxml package for this.

HTTP knows nothing about HTML, since HTTP can carry any kind of document. It is of course possible to add a full-blown HTML parser library. I did write the regex to be standards-compliant though, so I would expect it to be much more robust than the existing <p> parser at this point. Since the script only needs to deal with one specific program's output, this might still be the change with the smallest impact.
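
To make the two positions above concrete, here is a minimal sketch of decoding a response body whose charset is declared inside the HTML rather than in the HTTP headers, followed by parsing with a real HTML parser. It uses the standard library's `html.parser` instead of the BeautifulSoup or lxml packages mentioned above, purely to stay dependency-free. The function and class names, and the `file` and `state` ids in the usage example, are hypothetical and not taken from the project or from MPC-HC's actual output.

```python
from __future__ import annotations

import re
from html.parser import HTMLParser

# Byte-level scan for a charset declared in a <meta> tag (assumed format).
META_CHARSET = re.compile(rb'<meta[^>]+charset=["\']?([\w-]+)', re.IGNORECASE)


def decode_html(raw: bytes, header_charset: str | None = None) -> str:
    """Decode an HTML body: HTTP header charset first, then <meta>, then UTF-8."""
    charset = header_charset
    if charset is None:
        match = META_CHARSET.search(raw[:1024])
        if match:
            charset = match.group(1).decode("ascii")
    try:
        return raw.decode(charset or "utf-8")
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset label: fall back without raising.
        return raw.decode("utf-8", errors="replace")


class ParagraphCollector(HTMLParser):
    """Collect the text content of every <p id="..."> element."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.values: dict[str, str] = {}
        self._current_id: str | None = None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._current_id = dict(attrs).get("id")
            if self._current_id is not None:
                self.values[self._current_id] = ""

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._current_id is not None:
            self.values[self._current_id] += data

    def handle_endtag(self, tag):
        if tag == "p":
            self._current_id = None


def parse_variables(raw: bytes, header_charset: str | None = None) -> dict[str, str]:
    """Decode the body and return a mapping of <p> id to its text."""
    parser = ParagraphCollector()
    parser.feed(decode_html(raw, header_charset))
    parser.close()
    return parser.values
```

A usage example with a made-up status page:

```python
sample = '<meta charset="windows-1252"><p id="file">Am\u00e9lie &amp; co.mkv</p><p id="state">2</p>'
values = parse_variables(sample.encode("cp1252"))
print(values["file"], values["state"])
```

The parser takes care of entity decoding and attribute quoting, which is the robustness argument for a parser library; the charset lookup still has to happen before parsing, which is the point about HTTP not knowing the document's contents.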

Sp3EdeR commented 2 years ago

@iamkroot, I was thinking that it might be a good strategy to merge this fix first, so that non-English users get a working solution in a short timeframe, and then implement an HTML parser library afterwards for a more robust solution. That way users get a better experience right away, and the maintenance overhead can still be resolved as soon as possible.

iamkroot commented 2 years ago

I'd prefer to keep this open for now, and only merge a proper solution. Considering that you are the first person to bring up this issue in over 4 years, I think it should be okay to not rush things :)

Sp3EdeR commented 10 months ago

I've updated the PR according to the review comments.

iamkroot / trakt-scrobbler

Character encoding & windows fixes #208