avian2 / unidecode

ASCII transliterations of Unicode text - GitHub mirror
https://pypi.python.org/pypi/Unidecode
GNU General Public License v2.0
516 stars, 62 forks

using unidecode with text file #78

Closed lazezo2 closed 2 years ago

lazezo2 commented 2 years ago

I'm using unidecode to remove accents from French words. It works perfectly if the word is declared as a string, like accented_string = "Málaga", but it doesn't work as expected if I read the word from a text file!

This code works well:

    from unidecode import unidecode

    accented_string = 'Málaga'
    unaccented_string = unidecode(accented_string)
    print(unaccented_string)

The output is "Malaga". Now I want to do the same, but by reading a text file:

    from unidecode import unidecode
    import fileinput

    fr_txt = "fr.txt"
    for name in fileinput.input([fr_txt]):
        clean = name.replace("\n", "")
        unaccented_string = unidecode(clean)
        print(unaccented_string)

The output is "M',laga"! So what's wrong? I also tried this code:

    import io
    from unidecode import unidecode

    fr_txt = "txt.txt"
    f = io.open(fr_txt, mode="r", encoding="utf-8")
    f.read()
    for name in f:
        clean = name.replace("\n", "")
        line = unidecode(clean)
        print(line)

The output is "M',laga"! Any idea?

    import io
    from unidecode import unidecode

    fr_txt = "1.txt"
    f = io.open(fr_txt, mode="r", encoding="utf8")
    f.read()
    for name in f:
        clean = name.replace("\n", "")
        line = unidecode(clean)
        print(line)

avian2 commented 2 years ago

See Frequently Asked Question number 6: "Unidecode produces completely wrong results" in the README file.
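
As far as I can tell, that FAQ entry is about strings that were decoded with the wrong encoding before they reached unidecode. Here is a minimal sketch of how to check for that, assuming Python 3 and a file that really is saved as UTF-8. The helper names `read_names` and `transliterate_file` are made up for illustration, and the latin-1 read is only there to reproduce a wrong decode on purpose; it is not a diagnosis of the file in this issue. Note also that calling `f.read()` before looping over `f`, as in the snippets above, consumes the file, so the loop then has nothing left to iterate over.

```python
import os
import tempfile


def read_names(path, encoding="utf-8"):
    """Return the stripped, non-empty lines of a text file decoded with `encoding`."""
    with open(path, mode="r", encoding=encoding) as f:
        # Iterate over the file object directly; do not call f.read() first.
        return [line.strip() for line in f if line.strip()]


def transliterate_file(path, encoding="utf-8"):
    """Transliterate every name in the file (requires the unidecode package)."""
    # Imported here so the demo below also runs where unidecode is not installed.
    from unidecode import unidecode
    return [unidecode(name) for name in read_names(path, encoding)]


# Demo: write a small UTF-8 file, then read it back with two different codecs.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fr.txt")
    with open(path, mode="w", encoding="utf-8") as f:
        f.write("M\u00e1laga\n")  # "Málaga", written with an escape for clarity

    good = read_names(path, encoding="utf-8")
    bad = read_names(path, encoding="latin-1")  # deliberately the wrong codec

# ascii() shows the code points regardless of how the terminal is configured.
print(ascii(good))  # the accented letter is one character
print(ascii(bad))   # the same bytes have turned into two unrelated characters
```

If the strings read from your own file look like the second case when inspected with `ascii()` or `repr()`, the problem is in how the file is decoded (or in how it was saved), not in unidecode. Once the decoded strings look right, `transliterate_file("fr.txt")` should give the expected ASCII names.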