gregjacobs / Autolinker.js

Utility to Automatically Link URLs, Email Addresses, Phone Numbers, Twitter handles, and Hashtags in a given block of text/HTML
MIT License
1.48k stars 238 forks

[Tiktok mentions] - parsing does not match tiktok mention behaviour #406

Open RossCurry opened 10 months ago

RossCurry commented 10 months ago

Users on the TikTok platform can include hyphens ('-'), and possibly other characters, in their nicknames. As an example: a user with the nickname '---------'.

Autolinker follows the available TikTok documentation for usernames (which are eventually used in a URL).

But TikTok's documentation for nicknames doesn't mention any such restrictions.

I think it's pretty logical to assume the same restrictions apply; however, this is not the case.

The problem with all of this is that TikTok uses a user's nickname for mentions. So mentions that include hyphens (and possibly more characters; it's hard to test, since I can only change my nickname once a week) get cut short, and so the matchedText isn't correct.

Some more testing will have to be done, but I think eventually the regexp here for TikTok should be changed.

I'll try and test a bit more as soon as I can to determine exactly what characters are accepted for a nickname.

Update: I've since opened another 2 accounts with the nicknames 'r@$$…€()44¥' and 'social hub spaces'. I would say there are almost no restrictions on what a nickname can be. Parsing the text 'Keep trying @social hub spaces', I got matchedText: '@social'.

Possibly the white space is an edge case that would be impossible to cater for 🤷🏻
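
To illustrate the truncation described above, here is a minimal sketch. The pattern `usernameStyleMention` is a hypothetical stand-in for a username-style rule (letters, digits, underscores, periods); it is an assumption for illustration and not Autolinker's actual TikTok regexp. The helper `firstMention` is likewise made up for this example.

```typescript
// Hypothetical username-style pattern: '@' followed by letters, digits,
// underscores, or periods. An assumption for illustration only, not the
// regexp Autolinker actually uses.
const usernameStyleMention = /@[\w.]+/;

// Returns the first mention matched by `pattern`, or null if there is none.
function firstMention(text: string, pattern: RegExp): string | null {
    const match = pattern.exec(text);
    return match ? match[0] : null;
}

// The match stops at the first character outside the allowed set.
console.log(firstMention('Keep trying @social hub spaces', usernameStyleMention)); // '@social'
console.log(firstMention('hi @ross-curry', usernameStyleMention));                 // '@ross'
console.log(firstMention('hi @---------', usernameStyleMention));                  // null
```

With a pattern like this, a nickname made only of hyphens is not matched at all, and a nickname containing a hyphen or a space is cut at that character.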

gregjacobs commented 10 months ago

Hey Ross. Yeah, I don’t know how we could possibly support spaces for usernames. We couldn’t possibly know where to end the username in that case. For instance, if we had the string “check out @someone on TikTok”, we wouldn’t be able to know to make the username “someone”, “someone on”, or “someone on TikTok”.
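
A small sketch of that ambiguity: once spaces are allowed, every prefix of the words after the '@' is an equally plausible nickname. The function `candidateMentions` is a hypothetical helper written for this example and is not part of Autolinker.

```typescript
// Lists every nickname the text after the first '@' could stand for if
// nicknames were allowed to contain spaces. Hypothetical helper, for
// illustration only.
function candidateMentions(text: string): string[] {
    const at = text.indexOf('@');
    if (at === -1) {
        return [];
    }
    // Split everything after the '@' into whitespace-separated words.
    const words = text.slice(at + 1).split(/\s+/).filter(w => w.length > 0);
    const candidates: string[] = [];
    for (let i = 1; i <= words.length; i++) {
        candidates.push(words.slice(0, i).join(' '));
    }
    return candidates;
}

console.log(candidateMentions('check out @someone on TikTok'));
// [ 'someone', 'someone on', 'someone on TikTok' ]
```

Nothing in the text itself says which candidate is the right one, so a parser would need outside knowledge (for example, a list of known nicknames) to pick.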

As far as adding hyphens goes, I think that’s a safe addition. I guess in theory we could also add every other non-whitespace character if TikTok is really that liberal in their username allowance!
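
The two options could look roughly like this. Both patterns are assumptions sketched for discussion (the names `mentionWithHyphens`, `mentionAnyNonWhitespace`, and `matchMention` are made up here), not the regexp currently in Autolinker.

```typescript
// Option 1 (hypothetical): username-style characters plus the hyphen.
const mentionWithHyphens = /@[\w.-]+/;

// Option 2 (hypothetical): any run of non-whitespace characters after '@'.
const mentionAnyNonWhitespace = /@\S+/;

// Returns the first mention matched by `pattern`, or null if there is none.
function matchMention(text: string, pattern: RegExp): string | null {
    const match = pattern.exec(text);
    return match ? match[0] : null;
}

console.log(matchMention('hi @---------', mentionWithHyphens));               // '@---------'
console.log(matchMention('Say hi to @ross-curry!', mentionWithHyphens));      // '@ross-curry'
console.log(matchMention('Say hi to @ross-curry!', mentionAnyNonWhitespace)); // '@ross-curry!'
```

The last line shows the trade-off of the fully liberal option: trailing punctuation such as '!' gets pulled into the match, so it would likely need extra trimming rules.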

gregjacobs commented 10 months ago

Feel free to submit a PR btw. Make sure to include tests.