It is written in PHP (PHP 7+) and works without "mbstring", "iconv", or any other extra encoding PHP extension on your server.

The benefit of Portable ASCII is that it is easy to use and easy to bundle.

The project is based on ...

If you prefer a more object-oriented way to edit strings, take a look at voku/Stringy. It is a fork of "danielstjules/Stringy" that uses the "Portable ASCII" class and adds some extra methods.
```php
// Portable ASCII
use voku\helper\ASCII;
ASCII::to_transliterate('déjà σσς iıii'); // 'deja sss iiii'

// voku/Stringy
use Stringy\Stringy as S;
$stringy = S::create('déjà σσς iıii');
$stringy->toTransliterate(); // 'deja sss iiii'
```
```shell
composer require voku/portable-ascii
```
I needed ASCII character handling in several classes and had previously added these functions to "Portable UTF-8". This repository is more modular and portable because it has no dependencies.
Example: ASCII::to_ascii()
```php
echo ASCII::to_ascii('�Düsseldorf�', 'de');
// will output
// Duesseldorf

echo ASCII::to_ascii('�Düsseldorf�', 'en');
// will output
// Dusseldorf
```
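
The two outputs differ because the replacement table depends on the language. The following is a minimal sketch of that idea in plain PHP, assuming two tiny hand-written maps. `DEMO_MAPS` and `demo_to_ascii()` are hypothetical names and not part of Portable ASCII, whose real tables and logic are far more extensive.

```php
<?php

declare(strict_types=1);

// Hypothetical, heavily reduced maps for illustration only.
// They are NOT the replacement tables shipped with Portable ASCII.
const DEMO_MAPS = [
    'de' => ['ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue'],
    'en' => ['ä' => 'a', 'ö' => 'o', 'ü' => 'u'],
];

/**
 * Hypothetical helper: applies a language-specific map, then drops
 * every byte that is still outside the 7-bit ASCII range.
 */
function demo_to_ascii(string $str, string $language = 'en'): string
{
    // Fall back to the 'en' map for languages this demo does not know.
    $map = DEMO_MAPS[$language] ?? DEMO_MAPS['en'];

    // strtr() with an array replaces each key with its value.
    $str = strtr($str, $map);

    // Remove any remaining non-ASCII bytes (byte-wise, no /u modifier).
    return preg_replace('/[^\x00-\x7F]/', '', $str) ?? '';
}

echo demo_to_ascii('Düsseldorf', 'de'), "\n"; // Duesseldorf
echo demo_to_ascii('Düsseldorf', 'en'), "\n"; // Dusseldorf
```

Keeping one map per language is what lets the same input character produce "ue" for German and "u" elsewhere.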
The API of the "ASCII" class consists of small static methods.
#### charsArray(bool $replace_extra_symbols): array

EXAMPLE:
```php
$array = ASCII::charsArray();
var_dump($array['ru']['б']); // 'b'
```
**Parameters:**
- `bool $replace_extra_symbols [optional] Add some more replacements e.g. "£" with " pound ".`

**Return:**
- `array`

--------

#### charsArrayWithMultiLanguageValues(bool $replace_extra_symbols): array

Returns a replacement array for ASCII methods with a mix of multiple languages.

EXAMPLE:
```php
$array = ASCII::charsArrayWithMultiLanguageValues();
var_dump($array['b']); // ['β', 'б', 'ဗ', 'ბ', 'ب']
```
**Parameters:**
- `bool $replace_extra_symbols [optional] Add some more replacements e.g. "£" with " pound ".`

**Return:**
- `array An array of replacements.`

--------

#### charsArrayWithOneLanguage(string $language, bool $replace_extra_symbols, bool $asOrigReplaceArray): array

Returns a replacement array for ASCII methods with one language.

For example, German will map 'ä' to 'ae', while other languages will simply return e.g. 'a'.

EXAMPLE:
```php
$array = ASCII::charsArrayWithOneLanguage('ru');
$tmpKey = \array_search('yo', $array['replace']);
echo $array['orig'][$tmpKey]; // 'ё'
```
**Parameters:**
- `ASCII::* $language [optional] Language of the source string e.g.: en, de_at, or de-ch. (default is 'en') | ASCII::*_LANGUAGE_CODE`
- `bool $replace_extra_symbols [optional] Add some more replacements e.g. "£" with " pound ".`
- `bool $asOrigReplaceArray [optional] TRUE === return {orig: string[], replace: string[]} array`

**Return:**
- `array An array of replacements.`

--------

#### charsArrayWithSingleLanguageValues(bool $replace_extra_symbols, bool $asOrigReplaceArray): array

Returns a replacement array for ASCII methods with multiple languages.

EXAMPLE:
```php
$array = ASCII::charsArrayWithSingleLanguageValues();
$tmpKey = \array_search('hnaik', $array['replace']);
echo $array['orig'][$tmpKey]; // '၌'
```
**Parameters:**
- `bool $replace_extra_symbols [optional] Add some more replacements e.g. "£" with " pound ".`
- `bool $asOrigReplaceArray [optional] TRUE === return {orig: string[], replace: string[]} array`

**Return:**
- `array An array of replacements.`

--------

#### clean(string $str, bool $normalize_whitespace, bool $keep_non_breaking_space, bool $normalize_msword, bool $remove_invisible_characters): string

Accepts a string and removes all non-UTF-8 characters from it, plus extras if needed.

**Parameters:**
- `string $str The string to be sanitized.`
- `bool $normalize_whitespace [optional] Set to true if you need to normalize the whitespace.`
- `bool $keep_non_breaking_space [optional] Set to true to keep non-breaking spaces, in combination with $normalize_whitespace.`
- `bool $normalize_msword [optional] Set to true if you need to normalize MS Word chars e.g.: "…" => "..."`
- `bool $remove_invisible_characters [optional] Set to false if you do not want to remove invisible characters e.g.: "\0"`

**Return:**
- `string A clean UTF-8 string.`

--------

#### getAllLanguages(): string[]

Get all languages from the constants "ASCII::.*LANGUAGE_CODE".

**Parameters:**
__nothing__

**Return:**
- `string[]`

--------

#### is_ascii(string $str): bool

Checks if a string is 7-bit ASCII.

EXAMPLE:
```php
ASCII::is_ascii('白'); // false
```
**Parameters:**
- `string $str The string to check.`

**Return:**
- `bool true if it is ASCII, false otherwise`
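
As a rough illustration of what a 7-bit check involves, the sketch below tests a string with a byte-level regular expression in plain PHP. The helper name `is_seven_bit_ascii()` is hypothetical, and this is not necessarily how `ASCII::is_ascii()` is implemented. It only shows the idea that every byte must fall in the range 0x00 to 0x7F.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of Portable ASCII): returns true when
 * every byte of $str is in the 7-bit range 0x00-0x7F.
 */
function is_seven_bit_ascii(string $str): bool
{
    // An empty string contains no non-ASCII bytes.
    if ($str === '') {
        return true;
    }

    // Without the /u modifier the pattern is matched byte by byte,
    // so any byte above 0x7F makes preg_match() return 1.
    return preg_match('/[^\x00-\x7F]/', $str) === 0;
}

var_dump(is_seven_bit_ascii('Duesseldorf')); // bool(true)
var_dump(is_seven_bit_ascii('白'));          // bool(false)
```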
--------

#### normalize_msword(string $str): string

EXAMPLE:
```php
ASCII::normalize_msword('„Abcdef…”'); // '"Abcdef..."'
```
**Parameters:**
- `string $str The string to be normalized.`

**Return:**
- `string A string with normalized characters for commonly used chars in Word documents.`

--------

#### normalize_whitespace(string $str, bool $keepNonBreakingSpace, bool $keepBidiUnicodeControls, bool $normalize_control_characters): string

Normalize the whitespace.

EXAMPLE:
```php
ASCII::normalize_whitespace("abc-\xc2\xa0-öäü-\xe2\x80\xaf-\xE2\x80\xAC", true); // "abc-\xc2\xa0-öäü- -"
```
**Parameters:**
- `string $str The string to be normalized.`
- `bool $keepNonBreakingSpace [optional] Set to true to keep non-breaking spaces.`
- `bool $keepBidiUnicodeControls [optional] Set to true to keep non-printable (for the web) bidirectional text chars.`
- `bool $normalize_control_characters [optional] Set to true to convert e.g. LINE SEPARATOR and PARAGRAPH SEPARATOR to "\n" and LINE TABULATION to "\t".`

**Return:**
- `string A string with normalized whitespace.`

--------

#### remove_invisible_characters(string $str, bool $url_encoded, string $replacement, bool $keep_basic_control_characters): string

Remove invisible characters from a string.

For example, this prevents sandwiching null characters between ASCII characters, like Java\0script.

Copied from https://github.com/bcit-ci/CodeIgniter/blob/develop/system/core/Common.php

**Parameters:**
- `string $str`
- `bool $url_encoded`
- `string $replacement`
- `bool $keep_basic_control_characters`

**Return:**
- `string`

--------

#### to_ascii(string $str, string $language, bool $remove_unsupported_chars, bool $replace_extra_symbols, bool $use_transliterate, bool|null $replace_single_chars_only): string

Returns an ASCII version of the string. A set of non-ASCII characters are replaced with their closest ASCII counterparts, and the rest are removed by default. The language or locale of the source string can be supplied for language-specific transliteration in any of the following formats: en, en_GB, or en-GB. For example, passing "de" results in "äöü" mapping to "aeoeue" rather than "aou" as in other languages.

EXAMPLE:
```php
ASCII::to_ascii('�Düsseldorf�', 'en'); // Dusseldorf
```
**Parameters:**
- `string $str The input string.`
- `ASCII::* $language [optional] Language of the source string. (default is 'en') | ASCII::*_LANGUAGE_CODE`
- `bool $remove_unsupported_chars [optional] Whether or not to remove the unsupported characters.`
- `bool $replace_extra_symbols [optional] Add some more replacements e.g. "£" with " pound ".`
- `bool $use_transliterate [optional] Use ASCII::to_transliterate() for unknown chars.`
- `bool|null $replace_single_chars_only [optional] Single char replacement is better for performance, but some languages need to replace more than one char at the same time. | NULL === auto-setting, depending on the language`

**Return:**
- `string A string that contains only ASCII characters.`

--------

#### to_ascii_remap(string $str1, string $str2): string[]

WARNING: This method will return broken characters and is only for special cases.

Converts two UTF-8 encoded strings to single-byte strings suitable for functions that need the same string length after the conversion.

The function simply uses (and updates) a tailored dynamic encoding (in/out map parameter) where non-ASCII characters are remapped to the range [128-255] in order of appearance.

**Parameters:**
- `string $str1`
- `string $str2`

**Return:**
- `string[]`

--------

#### to_filename(string $str, bool $use_transliterate, string $fallback_char): string

Convert given string to safe filename (and keep string case).

EXAMPLE:
```php
ASCII::to_filename('שדגשדג.png', true); // 'shdgshdg.png'
```
**Parameters:**
- `string $str`
- `bool $use_transliterate ASCII::to_transliterate() is used by default; otherwise unsafe characters are simply replaced with a hyphen.`
- `string $fallback_char`

**Return:**
- `string A string that contains only safe characters for a filename.`

--------

#### to_slugify(string $str, string $separator, string $language, string[] $replacements, bool $replace_extra_symbols, bool $use_str_to_lower, bool $use_transliterate): string

Converts the string into a URL slug. This includes replacing non-ASCII characters with their closest ASCII equivalents, removing remaining non-ASCII and non-alphanumeric characters, and replacing whitespace with $separator. The separator defaults to a single dash, and the string is also converted to lowercase. The language of the source string can also be supplied for language-specific transliteration.

**Parameters:**
- `string $str`
- `string $separator [optional] The string used to replace whitespace.`
- `ASCII::* $language [optional] Language of the source string. (default is 'en') | ASCII::*_LANGUAGE_CODE`
- `string[] $replacements A map of replaceable strings.`
- `bool $replace_extra_symbols [optional] Add some more replacements e.g. "£" with " pound ".`
- `bool $use_str_to_lower [optional] Use "string to lower" for the input.`
- `bool $use_transliterate [optional] Use ASCII::to_transliterate() for unknown chars.`

**Return:**
- `string A string that has been converted to a URL slug.`

--------

#### to_transliterate(string $str, string|null $unknown, bool $strict): string

Returns an ASCII version of the string. A set of non-ASCII characters are replaced with their closest ASCII counterparts, and the rest are removed unless instructed otherwise.

EXAMPLE:
```php
ASCII::to_transliterate('déjà σσς iıii'); // 'deja sss iiii'
```
**Parameters:**
- `string $str The input string.`
- `string|null $unknown [optional] Character to use if a character is unknown. (default is '?') You can also use NULL to keep the unknown chars.`
- `bool $strict [optional] Use "transliterator_transliterate()" from PHP-Intl`

**Return:**
- `string A string that contains only ASCII characters.`

--------

## Unit Test

1) [Composer](https://getcomposer.org) is a prerequisite for running the tests.

```
composer install
```

2) The tests can be executed by running this command from the root directory:

```bash
./vendor/bin/phpunit
```

### Support

For support and donations please visit [Github](https://github.com/voku/portable-ascii/) | [Issues](https://github.com/voku/portable-ascii/issues) | [PayPal](https://paypal.me/moelleken) | [Patreon](https://www.patreon.com/voku).

For status updates and release announcements please visit [Releases](https://github.com/voku/portable-ascii/releases) | [Twitter](https://twitter.com/suckup_de) | [Patreon](https://www.patreon.com/voku/posts).

For professional support please contact [me](https://about.me/voku).

### Thanks

- Thanks to [GitHub](https://github.com) (Microsoft) for hosting the code and a good infrastructure including issue management, etc.
- Thanks to [IntelliJ](https://www.jetbrains.com) as they make the best IDEs for PHP and they gave me an open source license for PhpStorm!
- Thanks to [Travis CI](https://travis-ci.com/) for being the most awesome, easiest continuous integration tool out there!
- Thanks to [StyleCI](https://styleci.io/) for the simple but powerful code style check.
- Thanks to [PHPStan](https://github.com/phpstan/phpstan) && [Psalm](https://github.com/vimeo/psalm) for really great static analysis tools and for discovering bugs in the code!

### License and Copyright

Released under the MIT License - see `LICENSE.txt` for details.