tc39 / proposal-regex-escaping

Proposal for investigating RegExp escaping for the ECMAScript standard
http://tc39.es/proposal-regex-escaping/
Creative Commons Zero v1.0 Universal
363 stars 32 forks source link

ChatGPT says this exists in JavaScript (aka hurry up) #57

Closed agm1984 closed 1 year ago

agm1984 commented 1 year ago

ChatGPT just told me to use RegExp.escape() but it was throwing error in my console. ChatGPT said it was added in ECMAScript 2019 which I found no evidence for, so I asked ChatGPT and it apologized and said it was actually added in ECMAScript 2021.

Upon Googling, I found this repo, so is it stagnant?

This functionality appears to stem from Ruby or probably different languages too, and it would be most efficient solution for me to escape while doing new RegExp(searchTerms, 'i').test(value) when searchTerms is | or \.

ChatGPT gave me a custom solution that looks reasonable:

const escapedTerms = searchTerms.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
const regex = new RegExp(escapedTerms, 'i');
const isMatch = regex.test(value);

In my current application, I found this third party dependency (Oruga UI) that is using a custom solution that is the same but oddly different:

/**
 * Escape regex characters
 * http://stackoverflow.com/a/6969486
 */
export function escapeRegExpChars(value) {
    if (!value) return value;
    return value.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, '\\$&'); // eslint-disable-line no-useless-escape
}

After probing ChatGPT further, an issue there seems to stem from its knowledge cutoff of 2021, so my final takeaway is that this proposal looks simple but possibly stagnant and should be kicked.

Given that it involves a final code as simple as ''.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, '\\$&'), I would say hurry up and deploy it, and if we're too scared, maybe start with an options object that features include and exclude to opt-in characters or opt-out characters beyond the reasonable default.

I want to have the following code:

new RegExp(RegExp.escape(searchTerms), 'i').test(value)
ljharb commented 1 year ago

ChatGPT lies, and you shouldn't use it for anything you're not already an expert in.

While this is amusing, I'm going to close it because the proposal repo is not the best venue for shitposting.

agm1984 commented 1 year ago

My key observation at the moment is that in order to solve this bug, I need to import an escape utility function into 50 code locations, or make a more drastic change to the logic to avoid duplication. If RegExp.escape worked, It would have been one find&replace operation. I guess alternately I could find&replace searchTerms to searchTerms.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'), but then it's more imperative and I don't even know if I trust that yet.

After more careful analysis, it is comical to escape regex while creating regex. It strikes me as an odd way to perform case insensitive search to boolean. Perhaps it should be instead changed to value.toLowerCase().includes(searchTerms.toLowerCase()). But I can't remember what benefits or tradeoffs are implicated with that.

After reading #43 I don't understand why adding to RegExp is dangerous. Are people counting and expecting the number of keys on the prototype in legacy code? Otherwise why are people constantly re-inventing the wheel with regex-char escape utility functions. That should be a solved problem. RegExp.escape() is a very subjectively-natural solution to stop a datum from contributing to a regex expression.

It's fine; everything will be ok. I just want to return a boolean for whether or not some case insensitive value is in a string. It makes me curious if é would match against e.

In closing, in the past I also use Fuse.js for fuzzy matching searches. I thought it was overkill for a quick string match.

At any rate, I think its fine to close this issue. It wasn't intended to be a shitpost, and its interesting to note that my arguments are centric to searching a string more so than escaping regex.

ljharb commented 1 year ago

The proposal is not stagnant, but RegExp.escape has never existed, so currently you should be importing a package for it.

Many arguments exist for this feature, which is why I'm championing it, but "a language model is extremely confidently wrong, again" is never going to be one :-)

agm1984 commented 1 year ago

To be fair I didn't ask ChatGPT how to search the string. I asked it how to fix the regex problem in new RegExp(searchTerms, 'i').test(value), and it brilliantly told me to just escape the searchTerms. It's kinda obvious but it was nice to save the brain power.

I don't remember where I originally got this regex snippet from--some ThemeForest template after which I enjoyed the solution because it somehow-sidesteps issues with toLowerCase exploding, or maybe it doesn't. I just appreciate the fact that I need to be serious about why I would use it, when talking to the TC39 entity.

Maybe it's a weird use case, but you can add this to the list of scenarios (in the context of Vue JS):

const escapeRegExpChars = (value) => {
    if (typeof value !== 'string') return value; // avoid explosions
    return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
};

// note: you can search multiple fields
const filteredSandwiches = computed(() => {
    if (!searchTerms.value) return kitchen.sandwiches;

    return kitchen.sandwiches.filter((sandwich) => (
        new RegExp(escapeRegExpChars(searchTerms.value), 'i').test(sandwich.bread) ||
        new RegExp(escapeRegExpChars(searchTerms.value), 'i').test(sandwich.meat) ||
        new RegExp(escapeRegExpChars(searchTerms.value), 'i').test(sandwich.condiments.join(','))
    ));
});

What I'm not certain about, because I saw someone mention it in one of the regex-escape utility StackOverflow posts is that instantiating RegExp while iterating may be costly, so this at least implies warning to memory allocation abuse.

Sorry for continuing to generate issue notifications.