aecreations / deliciouspost

Not all special characters are being handled properly #3

Open aecreations opened 10 years ago

aecreations commented 10 years ago

Not all special characters are being properly handled by the double encoding that was implemented in version 1.5.1.

The one example so far is the right single quote ("’" - note that this is the curly quote, NOT the straight quote), as in the web page title at the following URL: http://www.salon.com/2014/06/09/cbs_news_huge_fatal_disaster_why_heads_need_to_roll_at_the_highest_levels/
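A minimal sketch of what double encoding does to the curly quote, assuming "double encoding" means applying the standard `encodeURIComponent` twice. The extension's actual routine is not shown in this thread, so `doubleEncode` and `doubleDecode` are hypothetical names used only for illustration:

```javascript
// Hypothetical helpers; the extension's real implementation is not shown in this thread.
function doubleEncode(text) {
  return encodeURIComponent(encodeURIComponent(text));
}

function doubleDecode(text) {
  return decodeURIComponent(decodeURIComponent(text));
}

const quote = "\u2019"; // right single (curly) quotation mark

// First pass: the character becomes its UTF-8 bytes, percent-encoded.
const once = encodeURIComponent(quote); // "%E2%80%99"

// Second pass: every "%" from the first pass becomes "%25".
const twice = doubleEncode(quote); // "%25E2%2580%2599"

console.log(once);
console.log(twice);
console.log(doubleDecode(twice) === quote); // true when both passes are undone
```

If the receiving side decodes only once, it sees "%E2%80%99" rather than the character itself, so the number of decode passes has to match the number of encode passes.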

aecreations commented 9 years ago

Also need to fix this for the ellipsis character ("…").

Example: http://water.epa.gov/action/advisories/acanthamoeba/index.cfm

aecreations commented 9 years ago

I thought I had fixed the right-pointing angle quote "»", but this character still isn't being encoded properly. Example (page title): http://www.urbanvillagemovement.com/

After some investigation, I found that the double angle brackets (">>") that I'm using to replace it are causing the problem:

var s = decodeURIComponent("The Urban Village Movement %3E%3E Creating Community");
print(s);
The Urban Village Movement >> Creating Community
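For comparison, a small sketch (not the extension's code) of how the original "»" character and the ">>" substitute each pass through the standard `encodeURIComponent` and `decodeURIComponent` functions. The variable names are illustrative, the substitution step is an assumption about how the replacement is done, and `console.log` stands in for the shell's `print`:

```javascript
const original = "The Urban Village Movement \u00BB Creating Community";

// Assumed substitution step: replace the angle quote with two ">" characters.
const substituted = original.replace(/\u00BB/g, ">>");

const encodedOriginal = encodeURIComponent(original);       // contains "%C2%BB"
const encodedSubstituted = encodeURIComponent(substituted); // contains "%3E%3E"

console.log(encodedOriginal);
console.log(encodedSubstituted);

// Decoding the substituted form yields ">>", not the original "»".
console.log(decodeURIComponent(encodedSubstituted));
```

Both forms survive a round trip, but only the unsubstituted one decodes back to the page's actual title.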

aecreations commented 9 years ago

Need to handle the registered trademark symbol ("®"). Might also need to fix the trademark symbol ("™").

Example: https://software.intel.com/en-us/realsense

aecreations commented 9 years ago

The degree sign ("°", 0x00B0) isn't being handled properly, either. It ends up being saved as "%C2%B0".
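The "%C2%B0" form is what standard UTF-8 percent-encoding produces for U+00B0, which suggests one decode pass is being missed somewhere. This is a guess, not a confirmed diagnosis. A minimal sketch listing the characters reported in this thread alongside their `encodeURIComponent` output; the `reportedChars` table is illustrative and not part of the extension:

```javascript
// Characters reported in this issue, keyed by a descriptive label.
const reportedChars = {
  "right single quote": "\u2019", // ’
  "ellipsis": "\u2026",           // …
  "right angle quote": "\u00BB",  // »
  "registered": "\u00AE",         // ®
  "trademark": "\u2122",          // ™
  "degree": "\u00B0",             // °
};

// Map each label to its single-pass percent-encoded form.
const encoded = {};
for (const [label, ch] of Object.entries(reportedChars)) {
  encoded[label] = encodeURIComponent(ch);
  console.log(label + ": " + encoded[label]);
}
```

A table like this could serve as a regression check: each encoded value, decoded the same number of times it was encoded, should give back the original character.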