abuneri / utf8

UTF-8 text encoding and decoding library
BSD 2-Clause "Simplified" License

(A)buneris (U)ni(c)ode: UTF-8

A simple UTF-8 encoded text type, mostly for learning purposes

Features

Implementation References

Installation

Build and install project

mkdir build
cd build

# Without tests
cmake .. -DBUILD_TESTING=OFF
cmake --build . --config Release

# With tests
cmake ..
cmake --build . --config Release
ctest -C Release

cmake --install . --config Release

Include package in your own CMake projects

# Your project's CMakeLists.txt

find_package(auc REQUIRED)

# ... configure <your_target> ...

target_link_libraries(<your_target>
    auc
)

Example

#include <auc/u8text.hpp>

int main() {
  auc::u8text text(u8"Ī咩鉼歺и(尤ۼñ>w");
  if (text.is_valid()) {
    for (const auc::codepoint& cp : text.get_codepoints()) {
      // ... codepoint-wise operations ...
    }

    for (const auc::graphemecluster& gc : text.get_grapheme_clusters()) {
      // ... grapheme cluster-wise operations ...
    }
  }
}
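
For readers new to UTF-8, the following is a minimal standalone sketch of what splitting a byte string into code points involves. It does not use auc and makes no claim about how auc is implemented; `decode_utf8` is a hypothetical helper written only for illustration. It assumes C++17 and rejects malformed input (stray continuation bytes, truncated sequences, overlong encodings, surrogates, and values above U+10FFFF) by returning `std::nullopt`.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical helper, NOT part of auc: decodes UTF-8 bytes into code points.
// Returns std::nullopt if the input is not well-formed UTF-8.
std::optional<std::vector<char32_t>> decode_utf8(std::string_view bytes) {
  // Smallest code point that legitimately needs a sequence of each length;
  // anything smaller is an overlong encoding.
  static constexpr char32_t min_for_len[] = {0, 0, 0x80, 0x800, 0x10000};

  std::vector<char32_t> out;
  std::size_t i = 0;
  while (i < bytes.size()) {
    const auto lead = static_cast<unsigned char>(bytes[i]);
    char32_t cp = 0;
    std::size_t len = 0;

    // The lead byte's high bits give the sequence length.
    if (lead < 0x80) {
      cp = lead;
      len = 1;
    } else if ((lead & 0xE0) == 0xC0) {
      cp = lead & 0x1F;
      len = 2;
    } else if ((lead & 0xF0) == 0xE0) {
      cp = lead & 0x0F;
      len = 3;
    } else if ((lead & 0xF8) == 0xF0) {
      cp = lead & 0x07;
      len = 4;
    } else {
      return std::nullopt;  // stray continuation byte or invalid lead byte
    }

    if (i + len > bytes.size()) return std::nullopt;  // truncated sequence

    // Each continuation byte has the form 10xxxxxx and adds 6 payload bits.
    for (std::size_t k = 1; k < len; ++k) {
      const auto b = static_cast<unsigned char>(bytes[i + k]);
      if ((b & 0xC0) != 0x80) return std::nullopt;
      cp = (cp << 6) | (b & 0x3F);
    }

    const bool overlong = cp < min_for_len[len];
    const bool surrogate = cp >= 0xD800 && cp <= 0xDFFF;
    if (overlong || surrogate || cp > 0x10FFFF) return std::nullopt;

    out.push_back(cp);
    i += len;
  }
  return out;
}
```

Calling `decode_utf8("A\xC4\xAA")` yields the two code points U+0041 and U+012A (the `Ī` from the example string). Grouping code points into grapheme clusters is a separate step defined by Unicode's text segmentation rules and is not covered by this sketch.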