sruupl / batflat

Lightweight, fast and easy CMS for free. Bootstrap ready. https://batflat.org
MIT License

Not working category (tags) in Russian #34

Closed KonstantinFromRussia closed 4 years ago

KonstantinFromRussia commented 5 years ago

Hello! As I understand it, your system uses tags as categories. If the tags are written in English, this works; if they are written in Russian, it does not. You will agree that this is not ideal: it defeats the purpose of localization.

sim2github commented 5 years ago

With the hardcoded pl_PL locale in the createSlug function, iconv doesn't return transliterated characters. https://github.com/sruupl/batflat/blob/bd0387d210545a2acffec08fc5279f03cc9c1c6a/inc/core/lib/functions.php#L56

My proposal:

function createSlug($text)
{
  $dict = [
    //Russian translit
    'а' => 'a',
    'б' => 'b',
    'в' => 'v',
    'г' => 'g',
    'д' => 'd',
    'ж' => 'zh',
    'з' => 'z',
    'и' => 'i',
    'й' => 'j',
    'е' => 'e',
    'ё' => 'yo',
    'о' => 'o',
    'п' => 'p',
    'р' => 'r',
    'к' => 'k',
    'л' => 'l',
    'м' => 'm',
    'н' => 'n',
    'т' => 't',
    'ч' => 'ch',
    'с' => 's',
    'ц' => 'c',
    'у' => 'u',
    'ф' => 'f',
    'х' => 'h',
    'ш' => 'sh',
    'щ' => 'shch',
    'э' => 'eh',
    'ю' => 'yu',
    'я' => 'ya',
    'ы' => 'y',
    'ь' => '',
    //Ukrainian additional symbols
    'і' => 'i',
    'є' => 'ye',
    'ї' => 'yi',
    //Some of the special characters
    ':' => '-',
    ';' => '-',
    '.' => '-',
    ',' => '-',
    ' ' => '-',
    '\'' => ''
  ];

  $text = strtr(strtolower(str_limit(trim($text), 255, '')), $dict);
  $text = iconv('utf-8', 'ascii//translit', $text);
  return preg_replace('#[^a-z0-9\-]#si', '', $text);
}

Test: Śęłąāčēģšž: .,;ΑαΒβΓγΔδΕεΖζ Ηабвгдеёжзийклмнн -> /Selaacegsz------abvgdeyozhzijklmn

Greek transliteration also has problems...

KonstantinFromRussia commented 5 years ago

Thank you. But the solution is not complete. There is now a problem with Russian tags written with a capital letter. For example:

tags - тег1 тег2 тег3 - saved normally when editing a post
tags - Тег1 Тег2 Тег3 - not saved

I write a post and add one tag, Пост1, but I can't add a Пост2 tag: when the post is saved, only the Пост1 tag remains. There can be only one tag written with a capital letter!

KonstantinFromRussia commented 5 years ago

Sorry, the problem with capitalized Russian tags was already present in the original version.

There is also a problem with slug transliteration. With English text the slug is set automatically; with Russian text it must be set manually.

KonstantinFromRussia commented 5 years ago

Understood! It should be like this:

function createSlug($text)
{
  $dict = [
    //Russian translit
    'а' => 'a',
    'б' => 'b',
    'в' => 'v',
    'г' => 'g',
    'д' => 'd',
    'ж' => 'zh',
    'з' => 'z',
    'и' => 'i',
    'й' => 'j',
    'е' => 'e',
    'ё' => 'yo',
    'о' => 'o',
    'п' => 'p',
    'р' => 'r',
    'к' => 'k',
    'л' => 'l',
    'м' => 'm',
    'н' => 'n',
    'т' => 't',
    'ч' => 'ch',
    'с' => 's',
    'ц' => 'c',
    'у' => 'u',
    'ф' => 'f',
    'х' => 'h',
    'ш' => 'sh',
    'щ' => 'shch',
    'э' => 'eh',
    'ю' => 'yu',
    'я' => 'ya',
    'ы' => 'y',
    'ь' => '',
    'А' => 'A',
    'Б' => 'B',
    'В' => 'V',
    'Г' => 'G',
    'Д' => 'D',
    'Ж' => 'Zh',
    'З' => 'Z',
    'И' => 'I',
    'Й' => 'J',
    'Е' => 'E',
    'Ё' => 'Yo',
    'О' => 'O',
    'П' => 'P',
    'Р' => 'R',
    'К' => 'K',
    'Л' => 'L',
    'М' => 'M',
    'Н' => 'N',
    'Т' => 'T',
    'Ч' => 'Ch',
    'С' => 'S',
    'Ц' => 'C',
    'У' => 'U',
    'Ф' => 'F',
    'Х' => 'H',
    'Ш' => 'Sh',
    'Щ' => 'Shch',
    'Э' => 'Eh',
    'Ю' => 'Yu',
    'Я' => 'Ya',
    'Ы' => 'Y',
    'Ь' => '',
    //Ukrainian additional symbols
    'і' => 'i',
    'є' => 'ye',
    'ї' => 'yi',
    //Some of the special characters
    ':' => '-',
    ';' => '-',
    '.' => '-',
    ',' => '-',
    ' ' => '-',
    '\'' => ''
  ];

  $text = strtr(strtolower(str_limit(trim($text), 255, '')), $dict);
  $text = iconv('utf-8', 'ascii//translit', $text);
  return preg_replace('#[^a-z0-9\-]#si', '', $text);
}

In this version, both tags and slugs work!

sim2github commented 5 years ago

It seems strtolower does not work with UTF-8, so the final preg_replace strips the capitalized letters. mb_strtolower works:

function createSlug($text)
{
  $dict = [
    //Russian translit
    'а' => 'a',
    'б' => 'b',
    'в' => 'v',
    'г' => 'g',
    'д' => 'd',
    'ж' => 'zh',
    'з' => 'z',
    'и' => 'i',
    'й' => 'j',
    'е' => 'e',
    'ё' => 'yo',
    'о' => 'o',
    'п' => 'p',
    'р' => 'r',
    'к' => 'k',
    'л' => 'l',
    'м' => 'm',
    'н' => 'n',
    'т' => 't',
    'ч' => 'ch',
    'с' => 's',
    'ц' => 'c',
    'у' => 'u',
    'ф' => 'f',
    'х' => 'h',
    'ш' => 'sh',
    'щ' => 'shch',
    'э' => 'eh',
    'ю' => 'yu',
    'я' => 'ya',
    'ы' => 'y',
    'ь' => '',
    //Ukrainian additional symbols
    'і' => 'i',
    'є' => 'ye',
    'ї' => 'yi',
    //Some of the special characters
    ':' => '-',
    ';' => '-',
    '.' => '-',
    ',' => '-',
    ' ' => '-',
    '\'' => ''
  ];

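  // Trim, cap the length at 255 and lowercase in a multibyte-safe way, then apply the dictionary.
  // str_limit is not a PHP built-in; it is presumably a Batflat helper.
  $text = strtr(mb_strtolower(str_limit(trim($text), 255, ''), 'UTF-8'), $dict);
  // Let iconv transliterate whatever the dictionary did not cover (e.g. accented Latin letters)
  $text = iconv('utf-8', 'ascii//translit', $text);
  // Drop every character that is not a Latin letter, a digit or a hyphen
  return preg_replace('#[^a-z0-9\-]#si', '', $text);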
}
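
The difference between the two lowercasing functions can be checked in isolation. The following is a minimal sketch, not Batflat code: sketchSlug and its three-letter dictionary are hypothetical, the iconv step is left out, and the mbstring extension is assumed to be available.

```php
<?php
// Hypothetical helper: a cut-down createSlug that takes the lowercasing
// function as a parameter so both variants can be compared side by side.
function sketchSlug(string $text, callable $lower): string
{
    // Only lowercase keys, as in the first proposal in this thread.
    $dict = ['т' => 't', 'е' => 'e', 'г' => 'g'];

    // Lowercase first, then transliterate with the dictionary.
    $text = strtr($lower($text), $dict);

    // Same final rule as createSlug: anything outside a-z, 0-9 and "-" is dropped.
    return (string) preg_replace('#[^a-z0-9\-]#si', '', $text);
}

// Multibyte-aware lowercasing (requires the mbstring extension).
$mbLower = fn (string $s): string => mb_strtolower($s, 'UTF-8');

echo sketchSlug('Тег1', 'strtolower'), "\n"; // eg1 on PHP 8.2+
echo sketchSlug('Тег1', $mbLower), "\n";     // teg1
```

On PHP 8.2 or later, where strtolower only touches ASCII letters, the capital Т never matches a dictionary key and is then removed by the final preg_replace. With mb_strtolower it becomes т before the dictionary is applied, so it survives.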

KonstantinFromRussia commented 5 years ago

sim2github - Here, now everything works fine! Thank you!

sim2github commented 5 years ago

@klocus, according to the documentation, setlocale must be set according to the selected language (like ["en_english" => "en_US", "pl_polski" => "pl_PL", ...]). Languages whose characters take 2 bytes in UTF-8 (like Russian, maybe Turkish, Indonesian) need transliteration by dictionary. So createSlug needs a second parameter for setting LC_ALL, and the dictionary has to be extended for the supported languages.
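
A rough sketch of that idea, under several assumptions: createSlugForLanguage and the 'ru_russian' key are hypothetical names, mb_substr stands in for Batflat's str_limit, and only LC_CTYPE is switched (rather than LC_ALL) on the assumption that this is the category that matters for transliteration. It has not been checked against Batflat itself or against Polish input.

```php
<?php
// Hypothetical sketch, not Batflat code.
function createSlugForLanguage(string $text, string $language = 'en_english'): string
{
    // Language name => locale, following the mapping suggested above.
    $locales = [
        'en_english' => 'en_US',
        'pl_polski'  => 'pl_PL',
        'ru_russian' => 'ru_RU', // assumed key, not taken from Batflat
    ];

    // Per-language dictionaries; the Russian table is copied from this thread.
    $dictionaries = [
        'ru_russian' => [
            'а' => 'a', 'б' => 'b', 'в' => 'v', 'г' => 'g', 'д' => 'd',
            'ж' => 'zh', 'з' => 'z', 'и' => 'i', 'й' => 'j', 'е' => 'e',
            'ё' => 'yo', 'о' => 'o', 'п' => 'p', 'р' => 'r', 'к' => 'k',
            'л' => 'l', 'м' => 'm', 'н' => 'n', 'т' => 't', 'ч' => 'ch',
            'с' => 's', 'ц' => 'c', 'у' => 'u', 'ф' => 'f', 'х' => 'h',
            'ш' => 'sh', 'щ' => 'shch', 'э' => 'eh', 'ю' => 'yu',
            'я' => 'ya', 'ы' => 'y', 'ь' => '',
        ],
    ];

    // Separators, as in the proposals earlier in the thread.
    $separators = [':' => '-', ';' => '-', '.' => '-', ',' => '-', ' ' => '-', '\'' => ''];

    // Trim, cap at 255 characters, lowercase in a multibyte-safe way.
    $text = mb_strtolower(mb_substr(trim($text), 0, 255, 'UTF-8'), 'UTF-8');
    $text = strtr($text, ($dictionaries[$language] ?? []) + $separators);

    // Switch the locale only around the iconv call, then restore it.
    $previous = setlocale(LC_CTYPE, '0');
    setlocale(LC_CTYPE, $locales[$language] ?? 'C');
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $text);
    if ($previous !== false) {
        setlocale(LC_CTYPE, $previous);
    }

    // If iconv cannot convert the string, keep the untransliterated text.
    if ($ascii === false) {
        $ascii = $text;
    }

    // Drop everything that is not a lowercase Latin letter, digit or hyphen.
    return (string) preg_replace('#[^a-z0-9\-]#', '', $ascii);
}

echo createSlugForLanguage('Привет мир', 'ru_russian'), "\n"; // privet-mir
```

Languages without a dictionary fall through to iconv alone, so its result still depends on the platform's iconv implementation and on the locale actually being installed.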


KonstantinFromRussia commented 5 years ago

sim2github - How do I apply this in practice? What should the correct code look like?

sim2github commented 5 years ago

My changes break compatibility for the Polish language at least. So we need to cooperate with contributors to fix this issue for the supported languages, and maybe check with native speakers that everything works fine.

If you don't plan to use untested languages, patch the function manually until the problem is fixed. I'm not a core developer, so I will wait until @klocus says what to do.

klocus commented 5 years ago

I will fix it soon.

michu2k commented 4 years ago

It won't be fixed in this Bf version, but we will look into this problem in future updates.