Closed jeffersonlicet closed 7 years ago
@jeffersonlicet You're very much welcome. Thanks for the regex input. I will need to create some instrumented tests to make sure that pattern works in every case.
In the meantime, do you have more test cases? I'm sorry I'm not familiar with Spanish naming customs.
We use accent marks everywhere.
For better internationalization, I allowed accent marks at the start of the word:
([##]+)([0-9A-Z_À-ÖØ-öø-ÿ]*[A-Z_]+[a-z0-9_üÀ-ÖØ-öø-ÿ]*)
I think all Spanish words will work, for example:
You can play with it here: Regexr
I've added RegexTest, which for some reason passes with #(\\w+), while ([##]+)([0-9A-Z_À-ÖØ-öø-ÿ]*[A-Z_]+[a-z0-9_üÀ-ÖØ-öø-ÿ]*) only captures Creación out of CreaciónDivina.
I honestly don't know why the result from Regexr differs from the JUnit test. Would you kindly run the test on your computer and see if it passes?
Sorry, I used the i flag on Regexr, which makes the match case-insensitive. Change it to:
static final Pattern PATTERN_HASHTAG = Pattern.compile("(?i)([##]+)([0-9A-Z_À-ÖØ-öø-ÿ]*[A-Z_]+[a-z0-9_üÀ-ÖØ-öø-ÿ]*)");
Here (?i) turns on case-insensitive matching.
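To see why the flag matters, here is a minimal sketch that runs the pattern from this thread with and without (?i) on #CreaciónDivina. The class name FlagDemo and the helper firstTag are made up for illustration and are not part of the library. The accented ranges are written as \u escapes (\u00C0-\u00D6 for À-Ö, \u00D8-\u00F6 for Ø-ö, \u00F8-\u00FF for ø-ÿ, \u00FC for ü) only so the source compiles regardless of file encoding.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the library.
class FlagDemo {
    // Pattern body from this thread, with the accented ranges spelled as escapes.
    static final String BODY =
        "([##]+)([0-9A-Z_\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF]*[A-Z_]+"
        + "[a-z0-9_\u00FC\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF]*)";

    static final Pattern CASE_SENSITIVE = Pattern.compile(BODY);
    static final Pattern CASE_INSENSITIVE = Pattern.compile("(?i)" + BODY);

    // Returns the text captured after the hash (group 2), or null if nothing matches.
    static String firstTag(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String input = "#Creaci\u00F3nDivina";
        System.out.println(firstTag(CASE_SENSITIVE, input));
        System.out.println(firstTag(CASE_INSENSITIVE, input));
    }
}
```

Without the flag, the match stops at Creación because the uppercase D is not in the trailing character class. With (?i), the whole of CreaciónDivina is captured.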
Okay, that works, but I'm thinking of using (?i)[##]([0-9A-Z_À-ÖØ-öø-ÿ]*[A-Z_]+[a-z0-9_üÀ-ÖØ-öø-ÿ]*) to prevent multiple hashtags. What do you think?
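As a rough illustration of the difference, the sketch below compares the one-or-more hash prefix with the single-hash prefix on input that starts with two hashes. SingleHashDemo and matchedText are hypothetical names, and the accented ranges are again written as \u escapes for the same characters used in the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the library.
class SingleHashDemo {
    // Tag body shared by both variants (accented ranges spelled as escapes).
    static final String BODY =
        "([0-9A-Z_\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF]*[A-Z_]+"
        + "[a-z0-9_\u00FC\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF]*)";

    // Earlier variant: one or more hashes, captured as group 1.
    static final Pattern ONE_OR_MORE = Pattern.compile("(?i)([##]+)" + BODY);
    // Variant proposed above: exactly one hash, not captured.
    static final Pattern EXACTLY_ONE = Pattern.compile("(?i)[##]" + BODY);

    // Returns the full text of the first match, or null if nothing matches.
    static String matchedText(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(matchedText(ONE_OR_MORE, "##Tag"));
        System.out.println(matchedText(EXACTLY_ONE, "##Tag"));
    }
}
```

The single-hash variant still finds a tag inside ##Tag; it just leaves the first hash out of the match. Because the parentheses around the hash are gone, the tag text moves from group 2 to group 1.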
Ohh. Yes. That's perfect.
Sorry for the late response. Version 0.12.0 has been published with that regex pattern. The patterns are also customizable now with static methods.
Thank you for your help and let me know if you have any other improvement or fix! :)
Thank you very much for this project.
I was taking a look at your implementation and I think you could update the hashtag regular expression to accept non-English words.
Example:
#(\w+)
fails with #Mañana
Solution:
([##]+)([0-9A-Z_]*[A-Z_]+[a-z0-9_üÀ-ÖØ-öø-ÿ]*)
Hope to help you work on it soon. Jeff.
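For anyone who wants to reproduce the report, here is a minimal sketch assuming a standard JVM, where \w covers only ASCII letters, digits and underscore unless UNICODE_CHARACTER_CLASS is set. AccentDemo and capture are hypothetical names, and the accented ranges in the proposed pattern are written as \u escapes for the same characters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the library.
class AccentDemo {
    // The pattern reported as failing on accented words.
    static final Pattern WORD_ONLY = Pattern.compile("#(\\w+)");

    // The pattern proposed in the report (accented ranges spelled as escapes).
    static final Pattern PROPOSED = Pattern.compile(
        "([##]+)([0-9A-Z_]*[A-Z_]+"
        + "[a-z0-9_\u00FC\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF]*)");

    // Returns the requested capture group of the first match, or null if nothing matches.
    static String capture(Pattern pattern, int group, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(group) : null;
    }

    public static void main(String[] args) {
        String input = "#Ma\u00F1ana";
        System.out.println(capture(WORD_ONLY, 1, input));
        System.out.println(capture(PROPOSED, 2, input));
    }
}
```

Under that assumption, the \w pattern stops at Ma, just before the ñ, while the proposed pattern captures the whole word Mañana.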