Closed by billdenney 4 years ago
Sure:
> stringi::stri_trans_general("²", "nfkd;nfc;Latin-ASCII")
[1] "2"
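For whole vectors, a minimal sketch of the same call might look like the following. The input values are made up for illustration, and the final check with stri_enc_isascii() is just one way to confirm that nothing non-ASCII slipped through.

```r
library(stringi)

# Illustrative inputs (not from the original report)
x <- c("m\u00b2", "caf\u00e9", "plain")

# Same compound transform ID as above: compatibility-decompose,
# recompose, then transliterate Latin characters to ASCII
out <- stri_trans_general(x, "nfkd;nfc;Latin-ASCII")
out

# TRUE only if every element consists of ASCII code points
all(stri_enc_isascii(out))
```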
Fun fact: the ASCII \032 is the SUBSTITUTE CHARACTER, a kind of NA, but for individual code points.
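To see that replacement directly, here is a small sketch. It assumes the input is the superscript two written as a Unicode escape, and it inspects the code point of the result instead of printing it, since the substitute character is not printable.

```r
library(stringi)

x <- "\u00b2"                # superscript two, written as an escape
out <- stri_enc_toascii(x)   # non-ASCII code points are replaced, not transliterated

utf8ToInt(out)               # code point of the replacement (octal 032 is decimal 26)
stri_enc_isascii(out)        # the result is ASCII, just not printable ASCII
```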
Thanks! I read for a bit about nfd, nfc, nfkd, and nfkc, and I'm not sure that I understand more, but I do understand that these appear to be what is needed for this case.
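In case it helps anyone else reading along, here is a rough sketch of how the four forms differ, using stringi's stri_trans_nfd(), stri_trans_nfc(), stri_trans_nfkd(), and stri_trans_nfkc(). The accented "e" is my own example, not from the report. The short version, as I understand it: the canonical forms (nfd, nfc) only split or join characters that are strictly equivalent, while the compatibility forms (nfkd, nfkc) also fold look-alikes such as the superscript 2 into their plain counterparts.

```r
library(stringi)

sup2   <- "\u00b2"   # superscript two: has only a compatibility decomposition
e_comp <- "\u00e9"   # "e" with acute accent as a single precomposed code point

# Canonical forms: decompose into base letter + combining accent, then recompose
nfd_e <- stri_trans_nfd(e_comp)
nfc_e <- stri_trans_nfc(nfd_e)
stri_length(c(e_comp, nfd_e, nfc_e))   # code point counts before and after

# Canonical decomposition leaves the superscript alone ...
stri_trans_nfd(sup2) == sup2

# ... while the compatibility forms map it to a plain digit
stri_trans_nfkd(sup2)
stri_trans_nfkc(sup2)
```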
This one gives a good overview IMO https://www.unicode.org/reports/tr15/
Thanks for the pointer to the overview of the normalizers and for the info about character \032.
Related to sfirke/janitor#389
When trying to translate extended ASCII to printable ASCII (as expected with the stri_enc_toascii() function), I expected the superscript 2 character to convert to either "2" (or perhaps preferably "^2"), but it was converted to something else.

Is there another function or method in stringi that will translate or transliterate extended ASCII to printable ASCII?
Created on 2020-07-21 by the reprex package (v0.3.0)