RFO-BASIC / Basic

The repository for the files of the Basic project, which creates the BASIC! APK for Google Play

UTF-8/16 string conversions #148

Open jMarcS opened 10 years ago

jMarcS commented 10 years ago

Forum user karamell asked here about adding functions to convert byte data into a string and back again. The application is ID3v2 tag files, which are binary files with embedded strings.

Java has conversion functions, of course, but I think there are a lot of end cases and error conditions to figure out.

jMarcS commented 9 years ago

Tabled for a long time, this came back to life when users started asking about URLEncode/Decode again. I've added ENCODE$() and DECODE$() functions. I threw in BASE64, which has been requested a couple of times before, and duplicates of the ENCODE and DECODE commands, which probably should have been functions in the first place.

ENCODE$("ENCRYPT", {password}, string)
ENCODE$("URL", {charset}, string)
ENCODE$("BASE64", {charset}, string)
ENCODE$(charset, string)

Same things for DECODE$(), of course. "DECRYPT" is an alias for "ENCRYPT", in case a user thinks DECODE$("ENCRYPT") feels wrong. The types are not case sensitive: "encrypt" or "Base64" is fine.
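For readers more familiar with Java than BASIC!, here is a rough sketch of what the "URL" and "BASE64" types correspond to in the Java standard library. It is only an analogy and not the BASIC! implementation: the class and method names are made up for illustration, and it returns ordinary Java strings rather than buffer strings.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Base64;

// Hypothetical illustration only; these names do not exist in BASIC!.
final class EncodeTypesSketch {

    // Rough analogue of ENCODE$("URL", charset, string).
    static String urlEncode(String text, String charsetName) {
        try {
            return URLEncoder.encode(text, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Rough analogue of ENCODE$("BASE64", charset, string):
    // the charset decides which bytes get Base64-encoded.
    static String base64Encode(String text, String charsetName) {
        try {
            return Base64.getEncoder().encodeToString(text.getBytes(charsetName));
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(urlEncode("a b&c", "UTF-8")); // prints a+b%26c
        System.out.println(base64Encode("hi", "UTF-8")); // prints aGk=
    }
}
```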

All of the encodes go from a standard BASIC! string to a buffer string. A buffer string is how Paul solved the problem of reading binary files: each character holds one byte of data in its low byte, and the high byte is padded with zero. The charset (UTF-8 by default) defines how the original string is written to a byte array, and the byte array is then written to the buffer string, one byte per character.
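The encode direction can be sketched in plain Java roughly like this. It is a minimal illustration of the buffer-string idea as described here, not the actual BASIC! source; the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the BASIC! implementation.
final class BufferStringEncode {

    // Write the text to a byte array using the charset, then store each
    // byte in its own character with the high byte set to zero.
    static String toBufferString(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        StringBuilder buffer = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            buffer.append((char) (b & 0xFF)); // mask so the byte is not sign-extended
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is two bytes in UTF-8: 0xC3 0xA9.
        String buffer = toBufferString("\u00E9", StandardCharsets.UTF_8);
        System.out.println(buffer.length()); // prints 2
    }
}
```

The loop gives the same result as `new String(bytes, StandardCharsets.ISO_8859_1)`, since ISO-8859-1 maps each byte value to the character with the same code.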

All of the decodes go from a buffer string to a standard BASIC! string. The byte array is built from the low-order bytes of the characters of the buffer string. The charset (UTF-8 by default) describes the encoding of the bytes in the byte array. Java builds a string out of the byte array, decoding the bytes according to the charset, and writing each resulting character into a standard BASIC! string, which is a Java String (UTF-16).
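The decode direction, again as a hedged Java sketch rather than the real BASIC! code (the names are hypothetical): take the low-order byte of each character, then let Java interpret the bytes according to the charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the BASIC! implementation.
final class BufferStringDecode {

    // Build a byte array from the low-order byte of each character,
    // then decode those bytes into a normal (UTF-16) Java String.
    static String fromBufferString(String buffer, Charset charset) {
        byte[] bytes = new byte[buffer.length()];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) buffer.charAt(i); // the cast keeps only the low 8 bits
        }
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is the UTF-8 encoding of U+00E9, so the result is one character.
        String text = fromBufferString("\u00C3\u00A9", StandardCharsets.UTF_8);
        System.out.println(text.length()); // prints 1
    }
}
```

In this sketch, anything in the high byte of a buffer-string character is simply dropped by the cast.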

The manual version of this description goes on for two and a half pages, so this is enough description for this venue.