joshswan / react-native-autolink

Automatic linking of URLs, phone numbers, emails, handles, and even custom patterns in text for React Native
MIT License

Matching `@username` formatted usernames encounters collisions with generated token strings #78

Open mhuggins opened 1 month ago

mhuggins commented 1 month ago

Hello, thank you for sharing this package!

I am trying to match several patterns of strings within a body of text, namely:

  1. Hash tags (e.g.: #programming),
  2. User names (e.g.: @mhuggins), and
  3. URLs.

I've defined three custom matchers as part of a wrapper component:

import { router } from 'expo-router';
import { Linking, StyleProp, TextStyle } from 'react-native';
import Autolink, { CustomMatcher } from 'react-native-autolink';

const linkStyle: StyleProp<TextStyle> = {
  color: '#0a7ea4',
};

const HashTagMatcher: CustomMatcher = {
  pattern: /#([a-z0-9_\-]+)/g,
  style: linkStyle,
  getLinkText: (replacerArgs) => `#${replacerArgs[1]}`,
  onPress: (match) => {
    const tag = match.getReplacerArgs()[1];
    router.navigate(`/tags/${encodeURIComponent(tag)}`);
  },
};

const UserMatcher: CustomMatcher = {
  pattern: /@([a-z0-9_\.]+)/g,
  style: linkStyle,
  getLinkText: (replacerArgs) => `@${replacerArgs[1]}`,
  onPress: (match) => {
    const userId = match.getReplacerArgs()[1];
    router.navigate(`/users/${encodeURIComponent(userId)}`);
  },
};

const UrlMatcher: CustomMatcher = {
  pattern: /https?:\/\/([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])/g,
  style: linkStyle,
  getLinkText: (replacerArgs) => replacerArgs[0],
  onPress: (match) => {
    const url = match.getMatchedText();
    Linking.openURL(url);
  },
};

const matchers: CustomMatcher[] = [HashTagMatcher, UserMatcher, UrlMatcher];

export const PostBody = ({ text }: { text: string }) => (
  <Autolink text={text} matchers={matchers} />
);

I then try to use this on a body of text, e.g.:

I am linking to https://google.com because I'm a #corporateshill. All hail @google!

This ends up replacing some matched elements with tokens in the format `@__ELEMENT-${uid}-\\d+__@`, at which point my UserMatcher matches fragments of these tokens, such as `@__` and `@.`.
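To illustrate the collision in isolation, here is a minimal sketch that runs the user pattern from above against a string that already contains one injected token. The uid value `1a2b3c` and the variable names are made up for the example; the real uid is generated by the library at runtime.

```typescript
// Hypothetical token; "1a2b3c" stands in for the library-generated uid.
const sampleToken = '@__ELEMENT-1a2b3c-0__@';

// The text as it might look after the hashtag has been swapped for a token.
const intermediateText = `I'm a ${sampleToken}. All hail @google!`;

// Same pattern as UserMatcher above.
const userPattern = /@([a-z0-9_\.]+)/g;

// Collect every substring the user pattern picks up.
const collisions: string[] = intermediateText.match(userPattern) || [];

console.log(collisions); // [ '@__', '@.', '@google' ]
```

Only `@google` is an intended match; `@__` comes from the start of the token and `@.` from its trailing `@` followed by the sentence's period.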

Is there an existing way to avoid this, or is this a bug that needs to be reconciled?

joshswan commented 1 month ago

TLDR: If you turn off the built-in URL matching and put your UserMatcher first in the array of custom matchers, you can get around the issue for now: <Autolink url={false} matchers={[UserMatcher, HashTagMatcher, UrlMatcher]} text="I am linking to https://google.com because I'm a #corporateshill. All hail @google!" /> works. Alternatively, you can tweak the regex of the UserMatcher.

Interestingly, the pattern you're using for the user matcher is capturing the internal "replacement token" that's used to mark matches in the text before rendering. Either that token is going to need to be updated to something super obscure, or the logic needs to change internally. Currently it's @__ELEMENT-${uid}-${counter++}__@, which will cause issues with mention-related regexes for some.
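One possible shape for the regex tweak, as a rough sketch rather than a recommended fix: require the first character after `@` to be a letter or digit, so neither `@__` nor `@.` can match. The uid `1a2b3c` and the names below are assumptions for illustration only.

```typescript
// Hypothetical token; "1a2b3c" stands in for the library-generated uid.
const exampleToken = '@__ELEMENT-1a2b3c-0__@';
const textWithToken = `I'm a ${exampleToken}. All hail @google!`;

// Stricter variant of the user pattern: the username must start with
// a lowercase letter or digit, then may continue with "_" or ".".
const stricterUserPattern = /@([a-z0-9][a-z0-9_.]*)/g;

// Collect every substring the stricter pattern picks up.
const stricterMatches: string[] = textWithToken.match(stricterUserPattern) || [];

console.log(stricterMatches); // [ '@google' ]
```

This has trade-offs: usernames that begin with an underscore or a period no longer match, and a token followed directly by a letter or digit (with no space or punctuation in between) would still collide.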

Separately, the built-in `url` matching was meant to be disabled by default, but it isn't. Changing that now will require a major version bump.

mhuggins commented 1 month ago

I think it might be beneficial to not utilize tokens that get injected into the string, and to instead compose an array of parts that get joined together at the end. Tokenization will always result in the possibility of conflicts from user-provided regex patterns.
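A rough sketch of what that could look like, independent of the library's actual internals: keep the text as an array of parts and let each matcher run only over the parts that are still plain text, so nothing is ever injected into the string. The `Part` and `SimpleMatcher` types, the `splitIntoParts` function, and the simplified URL pattern are all hypothetical.

```typescript
// A part is either untouched text or a matched link (names are hypothetical).
type Part =
  | { kind: 'text'; value: string }
  | { kind: 'link'; value: string; matcher: string };

interface SimpleMatcher {
  name: string;
  pattern: RegExp;
}

function splitIntoParts(text: string, matchers: SimpleMatcher[]): Part[] {
  let parts: Part[] = [{ kind: 'text', value: text }];

  for (const { name, pattern } of matchers) {
    const next: Part[] = [];
    for (const part of parts) {
      // Links found by earlier matchers are never re-scanned.
      if (part.kind !== 'text') {
        next.push(part);
        continue;
      }
      // Fresh global copy so lastIndex state is not shared between calls.
      const flags = pattern.flags.indexOf('g') >= 0 ? pattern.flags : pattern.flags + 'g';
      const re = new RegExp(pattern.source, flags);
      let last = 0;
      let m: RegExpExecArray | null;
      while ((m = re.exec(part.value)) !== null) {
        if (m[0].length === 0) {
          re.lastIndex++; // guard against zero-length matches looping forever
          continue;
        }
        if (m.index > last) {
          next.push({ kind: 'text', value: part.value.slice(last, m.index) });
        }
        next.push({ kind: 'link', value: m[0], matcher: name });
        last = m.index + m[0].length;
      }
      if (last < part.value.length) {
        next.push({ kind: 'text', value: part.value.slice(last) });
      }
    }
    parts = next;
  }
  return parts;
}

const sampleParts = splitIntoParts(
  "I am linking to https://google.com because I'm a #corporateshill. All hail @google!",
  [
    { name: 'hashtag', pattern: /#([a-z0-9_\-]+)/g },
    { name: 'user', pattern: /@([a-z0-9_\.]+)/g },
    { name: 'url', pattern: /https?:\/\/[^\s]+/g }, // simplified for the sketch
  ],
);

console.log(sampleParts.filter((p) => p.kind === 'link').map((p) => p.value));
```

Because later matchers only ever see the remaining plain-text parts, a user-provided pattern has no placeholder to collide with, and joining the `value` fields reproduces the original string.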

joshswan commented 1 month ago

Agreed. It was an easy way to get the original library working, and didn't cause any issues since I knew the regexes for the built-in functionality. But these custom matchers pose a problem for any potential token pattern.

Unfortunately that means a bit of work to be done haha. Will add to my todo list.