jbesomi / texthero

Text preprocessing, representation and visualization from zero to hero.
https://texthero.org
MIT License

`remove_punctuation()` is not removing "\" #221

Open batmanscode opened 2 years ago

batmanscode commented 2 years ago
>>> import texthero as hero
>>> import pandas as pd
>>> import string
>>>
>>> s = pd.Series(rf"{string.punctuation}")
>>> hero.remove_punctuation(s)
0     \ 
dtype: object
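
One plausible explanation for the output above, offered as an assumption since this thread does not show the texthero source: if the function builds a regex character class by interpolating `string.punctuation` unescaped, the sequence `\]` inside the class is read as an escaped `]`, so the backslash itself never becomes a member of the class. The sketch below reproduces that behavior with the standard `re` module only; the pattern is illustrative and not taken from the library.

```python
import re
import string

text = string.punctuation

# Assumed pattern shape (not confirmed by this thread): punctuation dropped
# straight into a character class. Inside the class, "\]" means a literal "]",
# so "\" is not part of the set.
unescaped = re.compile(rf"([{string.punctuation}])+")

# Same pattern with every character escaped first, so "\" is in the set.
escaped = re.compile(rf"([{re.escape(string.punctuation)}])+")

left_unescaped = unescaped.sub(" ", text)
left_escaped = escaped.sub(" ", text)

print(repr(left_unescaped))  # the backslash survives here
print(repr(left_escaped))    # only the replacement character remains
```

If this is indeed the cause, wrapping the punctuation in `re.escape` would be a small fix that keeps the regex-based approach.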

richecr commented 1 year ago

Hi, I could work on this issue. Do you have any suggestions, or can I start from scratch?

Here is a possible solution that could also improve the method's performance:

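import string

# Raw string so the backslash is kept literally (avoids the invalid "\@" escape).
s = r"dakmdalk\@....dada"

# Translation table that deletes every character in string.punctuation.
table = str.maketrans('', '', string.punctuation)
a = s.translate(table)

print(a)
# dakmdalkdada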
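
To show how the translation-table idea might extend beyond a single string, here is a minimal sketch that applies it to several inputs, including the full `string.punctuation` case from the original report. The helper name `strip_punctuation` is hypothetical and not part of texthero; it works on a plain list of strings rather than a pandas Series.

```python
import string

# Build the table once; it maps every punctuation character to None (deletion).
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def strip_punctuation(texts):
    """Hypothetical helper: delete ASCII punctuation from each string."""
    return [t.translate(_PUNCT_TABLE) for t in texts]


samples = [r"dakmdalk\@....dada", string.punctuation, "no punctuation here"]
cleaned = strip_punctuation(samples)
print(cleaned)
```

Note that this deletes punctuation outright instead of substituting a replacement character, so `a,b` becomes `ab`. Mapping each punctuation character to a space in the table would change that behavior if a separator is wanted.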

jbesomi commented 1 year ago

Hey, happy to receive your PR @richecr !