DesiQuintans / tsv2label

tsv2label: Label, describe, rename, and recode datasets using a data dictionary

Safer handling of 'illegal' characters in TSV input. #16

Open DesiQuintans opened 9 months ago

DesiQuintans commented 9 months ago

A few times now, I've gotten hard-to-debug errors caused by extended (non-ASCII) characters in column renames and variable descriptions. The most recent one was caused by an extended character in a variable label, which made describe_with_dictionary() fail.

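TSV input should always be converted to ASCII before it is used. Potential column renames should probably be passed through iconv(c(...), to = "ASCII//TRANSLIT"), which substitutes the closest ASCII equivalent for each extended character. Potential variable labels and factor levels should probably be passed through iconv(c(...), to = "ASCII", sub = "c99"). The latter converts extended characters into escape codes such as "\\u00b0" (the degree sign), so no information is lost.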
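
A minimal sketch of the two conversions, to show roughly what each one does. The column names and labels here are made-up examples, not taken from the package. It assumes the input is UTF-8 and that the installed R version supports sub = "c99" in iconv(). The result of "ASCII//TRANSLIT" depends on the platform's iconv implementation, so a character with no ASCII equivalent may come back as "?" or turn the whole element into NA.

```r
# Hypothetical example values (assumptions for illustration only).
new_names  <- c("temp_\u00b0C", "caf\u00e9_visits")
var_labels <- c("Temperature (\u00b0C)", "Visits to the caf\u00e9")

# Column renames: transliterate to the closest ASCII equivalents.
# Output is platform-dependent; check for NA before using the result.
safe_names <- iconv(new_names, from = "UTF-8", to = "ASCII//TRANSLIT")

# Labels and factor levels: replace each non-ASCII character with a
# C99-style \uxxxx escape instead of dropping or approximating it.
safe_labels <- iconv(var_labels, from = "UTF-8", to = "ASCII", sub = "c99")

print(safe_names)
print(safe_labels)
```

Because transliteration can fail silently, it is probably worth checking anyNA(safe_names) and falling back to the original name (or raising a clear error) when it does.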