pelican-plugins / search

Pelican plugin that adds site search capability

remove non printable chars from titles #31 #32

Open lioman opened 1 year ago

lioman commented 1 year ago

My initial idea worked only for the \u00ad character mentioned. I have now stripped the non-printable characters by replacing them with " ".
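For readers following along, the replacement approach could look roughly like the sketch below. It is not the code from this pull request: the function name `strip_nonprintable` is hypothetical, and it assumes Python's `str.isprintable()` is an acceptable definition of "printable" (it reports the soft hyphen \u00ad as non-printable).

```python
def strip_nonprintable(title: str) -> str:
    """Replace every non-printable character in a title with a space.

    Hypothetical helper for illustration only; the plugin's actual
    implementation may differ.
    """
    # str.isprintable() is False for control and format characters,
    # including the soft hyphen (U+00AD).
    return "".join(ch if ch.isprintable() else " " for ch in title)


if __name__ == "__main__":
    print(strip_nonprintable("Soft\u00adhyphen"))
```

Note that `str.isprintable()` also treats whitespace other than the ASCII space (tabs, non-breaking spaces) as non-printable, so those are replaced as well under this assumption.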

I'm not sure whether we need json.dumps, or what it was needed for in the first place. If we do need it, we should add a test case for it.

justinmayer commented 1 year ago

Thanks, Lioman. According to the description in #23, the json.dumps() method:

should handle any arbitrary punctuation marks which may happen to be in the Title - ",',\,*,...etc.

I just tried putting those characters in article titles, and I didn't have any problems with the existing code in main, except that I see a backslash before double-quotation marks in the search result titles. The escaping logic from #15 is adding a backslash where there shouldn't be one.
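A small sketch of how a stray backslash like this can arise. It assumes, purely for illustration, that a title's quotation marks are escaped by hand before being passed to json.dumps(); whether the logic from #15 does exactly this is not confirmed here. The function names are hypothetical.

```python
import json


def round_trip(title: str) -> str:
    """Encode a title with json.dumps() and decode it again."""
    return json.loads(json.dumps(title))


def pre_escape_then_round_trip(title: str) -> str:
    """Manually escape double quotes first, then encode and decode.

    Hypothetical reproduction of double escaping; not the plugin's code.
    """
    pre_escaped = title.replace('"', '\\"')  # adds a literal backslash
    return json.loads(json.dumps(pre_escaped))


if __name__ == "__main__":
    title = 'Quotes " and backslash \\ and star *'
    # json.dumps() alone preserves the title exactly.
    print(round_trip(title) == title)
    # Manual escaping leaves a backslash in front of the quotation mark.
    print(pre_escape_then_round_trip(title))
```

Under this assumption, json.dumps() on its own already handles quotation marks and backslashes, and the extra manual step is what leaves a visible backslash in the decoded title.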

(I moved the rest of this comment to a more relevant issue.)