xwmx / nb

CLI and local web plain text note‑taking, bookmarking, and archiving with linking, tagging, filtering, search, Git versioning & syncing, Pandoc conversion, + more, in a single portable script.
https://xwmx.github.io/nb
GNU Affero General Public License v3.0

Awk multibyte issue with some versions of MacOS awk #248

Closed felixdv closed 1 year ago

felixdv commented 1 year ago

This is not so much a bug report on nb as something I noticed while bookmarking some URLs whose content contains multibyte characters, using the macOS-provided version of awk. Perhaps this can help someone else who runs into the same issue.

Trying to bookmark some urls gave me the following error:

$ nb https://groups.csail.mit.edu/mac/classes/6.805/articles/crypto/cypherpunks/cyphernomicon/CP-FAQ
awk: towc: multibyte conversion failure on: '�re (with Caesar)'

 input record number 3113, file
 source line number 1

This is with the macOS-provided awk, which is currently at the following version (on Ventura 13.3):

$ awk --version
awk version 20200816

However, after installing the brew-provided version of awk and reloading my shell, I get a newer awk and the bookmark imports without a problem:

$ brew install awk
$ source ~/.zshrc
$ awk --version
awk version 20211208

$ nb https://groups.csail.mit.edu/mac/classes/6.805/articles/crypto/cypherpunks/cyphernomicon/CP-FAQ
Added: [527] 🔖 20230429105854.bookmark.md "(groups.csail.mit.edu)"

Tested on nb version 7.5.0

xwmx commented 1 year ago

I updated it to convert the content to UTF-8 before attempting to extract a title and it seems to be working as of version 7.5.1. I tested it with iconv (GNU libiconv 1.11), which appears to come with macOS or the Xcode command line tools. @felixdv If you can, please let me know if it's resolved on your end in nb 7.5.1.
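
For readers curious what "convert to UTF-8, then extract a title" can look like, here is a minimal sketch. It is not nb's actual implementation: the function names `to_utf8` and `extract_title`, the ISO-8859-1 fallback encoding, and the simple single-line lowercase `<title>` match are all assumptions made for illustration.

```shell
# Hypothetical helper: read stdin and write UTF-8 to stdout.
# $1 is the encoding to assume when the input is not valid UTF-8
# (ISO-8859-1 here is a guess, not something nb is known to use).
to_utf8() (
  fallback="${1:-ISO-8859-1}"
  tmp="$(mktemp)" || exit 1
  cat > "${tmp}"
  # If the input already decodes cleanly as UTF-8, pass it through.
  if iconv -f UTF-8 -t UTF-8 "${tmp}" > /dev/null 2>&1
  then
    cat "${tmp}"
  else
    iconv -f "${fallback}" -t UTF-8 "${tmp}"
  fi
  rm -f "${tmp}"
)

# Hypothetical helper: print the text of the first <title> element that
# sits on a single line. Only lowercase tags are matched.
extract_title() {
  to_utf8 | awk '
    match($0, /<title>[^<]*<\/title>/) {
      # Skip the 7-character opening tag and drop the 8-character closing tag.
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  '
}
```

With these definitions, something like `curl -s "$url" | extract_title` hands awk only valid UTF-8, which is the property that should avoid the multibyte conversion failure.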

felixdv commented 1 year ago

@xwmx I have just tested it with nb 7.5.1 and awk version 20200816 and can confirm it is fixed. Thanks!

xwmx commented 1 year ago

@felixdv Awesome. Thanks!