crossroadlabs / Regex

Regular expressions for swift
Apache License 2.0

Strings containing emojis produce shifted results #51

Open david-gorski opened 4 years ago

david-gorski commented 4 years ago

It seems that when emojis are present in the input, the resulting groups and matched strings are shifted.

For example, I was using:

let textRegex = "\\[([^]]*)\\]".r!

let input = """
Mexican culture has lots of rich history and great food! 🌯🌯🌯🌯

Avacado's are already incredibly popular and for good reason: They taste good, work in tons or recipes, and are [good for you]<url>{https://www.healthline.com/nutrition/12-proven-benefits-of-avocado}. So maybe you already have [avocado]<trend>{avocado-intake} every day or maybe just every once in a while, but maybe there's even more reasons to love these green fatty fruits! [Avacado's have been shown to improve sleep]<url>{https://www.cbsnews.com/pictures/foods-that-will-help-you-sleep-better/9/}.
"""

for text in textRegex.findAll(in: input).makeIterator(){
    print(text.matched)
}

This produces:

d for you]<url
cado]<tre
cado's have been shown to improve sleep]<url

Instead of the expected:

good for you
avocado
Avacado's have been shown to improve sleep

The shifting is caused by the presence of the emojis. Each emoji shifts the result's index by 1, so here it's shifted by 4.
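
A possible explanation, not verified against this library's source: NSRegularExpression reports match ranges in UTF-16 code units, while Swift String indices advance by Character. Each 🌯 is one Character but two UTF-16 code units, so using an NSRange location as a Character offset would land one position too far per emoji. The sketch below reproduces that kind of shift with Foundation alone; the shortened input and the names `shifted` and `correct` are only for illustration, and the pattern escapes the inner `]` to be safe.

```swift
import Foundation

// Shortened, hypothetical input: four emojis ahead of one bracketed span.
let input = "🌯🌯🌯🌯 are [good for you]<url>"

// One Character per emoji, but two UTF-16 code units each.
print(input.count)        // 28
print(input.utf16.count)  // 32

let regex = try! NSRegularExpression(pattern: "\\[([^\\]]*)\\]")
let fullRange = NSRange(input.startIndex..<input.endIndex, in: input)

var shifted: [String] = []
var correct: [String] = []

for match in regex.matches(in: input, range: fullRange) {
    let nsRange = match.range

    // Suspected bug: treating UTF-16 offsets as Character offsets.
    let start = input.index(input.startIndex, offsetBy: nsRange.location)
    let end = input.index(start, offsetBy: nsRange.length)
    shifted.append(String(input[start..<end]))

    // Conversion that respects UTF-16 offsets.
    if let range = Range(nsRange, in: input) {
        correct.append(String(input[range]))
    }
}

print(shifted)  // ["d for you]<url"]
print(correct)  // ["[good for you]"]
```

The shifted result matches the first line of the output reported above, which is consistent with an offset conversion problem. If that is the cause, converting ranges with `Range(_:in:)` (or indexing through `input.utf16`) should avoid the shift.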