Closed ptynecki closed 7 years ago
Hi @Katharsis,
First of all, thank you very much for taking the time to report a possible bug. I really appreciate it. This is the way to improve cucco.
I've been checking the behavior you describe, and in this case the output is the expected one. In the first execution, the one with default normalizations, normalizations are applied this way:
Here you can see the execution:
$ cucco normalize 'Protein Recommendations for Bodybuilders: In This Case, More May Indeed Be Better.'
Protein Recommendations Bodybuilders Case
The normalization you propose would look like this:
$ cat config.yaml
normalizations:
- remove_extra_whitespaces
- remove_accent_marks
- remove_stop_words
- replace_hyphens
- replace_punctuation
- replace_symbols
$ cucco -c config.yaml normalize 'Protein Recommendations for Bodybuilders: In This Case, More May Indeed Be Better.'
Protein Recommendations Bodybuilders Case Better
Note that in the last case I'm omitting the values for replacement
as the value you set is actually the default value.
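One possible explanation for the extra "Better" in the second run, which this thread does not confirm, is that the order of the normalizations matters: if stop words are removed before punctuation is replaced, a token such as "Better." still carries its period and does not match the stop word. The following is a toy sketch of that effect in plain Python. It is not cucco's implementation; the stop-word list, the function bodies and the empty-string replacement are all assumptions made for illustration.

```python
import re
import string

# Hypothetical stop-word list for illustration only; cucco ships its own lists.
STOP_WORDS = {"for", "in", "this", "more", "may", "indeed", "be", "better"}


def remove_stop_words(text):
    # Drop space-separated tokens that match a stop word exactly (case-insensitive).
    return " ".join(w for w in text.split(" ") if w.lower() not in STOP_WORDS)


def replace_punctuation(text, replacement=""):
    # Replace every ASCII punctuation character (assumed replacement: empty string).
    return "".join(replacement if c in string.punctuation else c for c in text)


def remove_extra_whitespaces(text):
    # Collapse runs of whitespace and trim both ends.
    return re.sub(r"\s+", " ", text).strip()


def normalize(text, steps):
    # Apply each normalization in the order given.
    for step in steps:
        text = step(text)
    return text


TEXT = "Protein Recommendations for Bodybuilders: In This Case, More May Indeed Be Better."

punctuation_first = normalize(
    TEXT, [replace_punctuation, remove_stop_words, remove_extra_whitespaces]
)
stop_words_first = normalize(
    TEXT, [remove_stop_words, replace_punctuation, remove_extra_whitespaces]
)

print(punctuation_first)  # Protein Recommendations Bodybuilders Case
print(stop_words_first)   # Protein Recommendations Bodybuilders Case Better
```

In this sketch the only difference between the two calls is the position of the stop-word step, which is enough to keep or drop the final word.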
So, I'm closing this issue as I don't think it is a real bug. If I misunderstood you, or if you think the behavior should be different, please feel free to comment and reopen it.
Happy normalization ;)
Hi guys,
Let's say that I want to normalize this string:
Without extra Cucco setup (normalizations) I received:
With extra Cucco setup:
I received:
My question is: where is the rest of the string?