gaspardpetit / phonetic_fr-py

A Python port of EdouardBERGE/phonetic
MIT License

phonetic-fr

A Soundex-Like Phonetic Algorithm in Python for the French Language

For multilanguage phonetic comparison of words, see https://github.com/gaspardpetit/phonetic_distance-py

Purpose

phonetic-fr implements a Soundex-like phonetic algorithm for comparing words by how they sound when pronounced in French. It is particularly useful for matching similar-sounding words whose spellings differ.

How to install

pip install phonetic-fr

Usage in shell

echo "Le ver vert glisse vers le verre" | phonetic_fr

Prints:

L VER VER GLIS VER L VER
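
The output above suggests that the command encodes each word of the input separately. A rough Python equivalent is sketched below, assuming words are split on whitespace (the actual tokenization used by the command is not documented here). `encode_sentence` and `placeholder_encode` are hypothetical names introduced for illustration; in practice the `phonetic` function from `phonetic_fr` would be passed as the encoder.

```python
from typing import Callable


def encode_sentence(sentence: str, encode: Callable[[str], str]) -> str:
    """Encode each whitespace-separated word and rejoin with single spaces."""
    return " ".join(encode(word) for word in sentence.split())


def placeholder_encode(word: str) -> str:
    """Hypothetical stand-in, NOT a phonetic algorithm: it only uppercases."""
    return word.upper()


# With the real library, pass phonetic_fr.phonetic instead of the stand-in.
print(encode_sentence("Le ver vert", placeholder_encode))
```

Taking the encoder as a parameter keeps the sketch runnable without the package installed and makes it easy to swap in another encoder.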

Usage in Python

from phonetic_fr import phonetic

# Obtain phonetic representation of a word
example = "python"
result = phonetic(example)
print(f"{example} -> {result}")

Prints

python -> PITON

Phonetic results can be used to compare similar-sounding words:

from phonetic_fr import phonetic

# Compare two names that sound alike
are_alike = phonetic("Gilles") == phonetic("Jill")
print(f"Gilles sounds like Jill: {are_alike}")

Prints

Gilles sounds like Jill: True
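
The same equality test extends to whole word lists: grouping words by their code puts similar-sounding spellings in the same bucket. The sketch below shows one possible way to do this. `group_by_sound` and `placeholder_encode` are hypothetical names, and the stand-in encoder is not a phonetic algorithm; it only keeps the example self-contained. In practice `phonetic` from `phonetic_fr` would be passed as `encode`.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def group_by_sound(words: Iterable[str], encode: Callable[[str], str]) -> Dict[str, List[str]]:
    """Group words under the code returned by `encode`, preserving input order."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for word in words:
        groups[encode(word)].append(word)
    return dict(groups)


def placeholder_encode(word: str) -> str:
    """Hypothetical stand-in, NOT a phonetic algorithm: first three letters, uppercased."""
    return word[:3].upper()


groups = group_by_sound(["ver", "vert", "glisse", "vers", "verre"], placeholder_encode)
print(groups)
# {'VER': ['ver', 'vert', 'vers', 'verre'], 'GLI': ['glisse']}
```

Looking up a query then amounts to encoding it and reading the matching bucket, which avoids comparing the query against every word.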
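
Phonetic results can also be combined with an edit distance such as Levenshtein (this example requires the third-party Levenshtein package):

from Levenshtein import distance
from phonetic_fr import phonetic

# Compare the Levenshtein distance of the raw spellings with that of the phonetic codes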
word_a = "drapeau"
word_b = "crapaud"
raw_distance = distance(word_a, word_b)
print(f"Levenshtein distance of '{word_a}' and '{word_b}': {raw_distance}")
phonetic_distance = distance(phonetic(word_a), phonetic(word_b))
print(f"Phonetic Levenshtein distance of '{word_a}' and '{word_b}': {phonetic_distance}")

Prints

Levenshtein distance of 'drapeau' and 'crapaud': 3
Phonetic Levenshtein distance of 'drapeau' and 'crapaud': 1
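
Building on this idea, a query can be matched to the candidate whose code is closest in edit distance. The sketch below is one way to do that without extra dependencies: `levenshtein` is a plain dynamic-programming implementation written for this example (not the `Levenshtein` package), and `closest_match` is a hypothetical helper that takes the encoder as a parameter, so `phonetic` from `phonetic_fr` can be passed in.

```python
from typing import Callable, Iterable


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the classic two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def closest_match(query: str, candidates: Iterable[str], encode: Callable[[str], str]) -> str:
    """Return the candidate whose encoded form is nearest to the encoded query."""
    encoded_query = encode(query)
    return min(candidates, key=lambda c: levenshtein(encoded_query, encode(c)))


print(levenshtein("drapeau", "crapaud"))  # 3, matching the raw distance above
```

Ties are resolved in favor of the first candidate, since `min` returns the first minimal element it encounters.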

Description

phonetic-fr is a phonetic algorithm for the French language, similar to the Soundex algorithm used for English.

License

phonetic-fr is released under the MIT license. Feel free to use, modify, and distribute it according to the terms of the license.

Credits

phonetic-fr is a Python port of EdouardBERGE/phonetic.

Changelog

Changes since the original port are tracked in the Changelog.