
[BUG-233078] LSL (Near) Future Proofed Inexact Matching Plan #10230

Open sl-service-account opened 1 year ago

sl-service-account commented 1 year ago

How would you like the feature to work?

In new LSL functions that do any kind of searching or matching, could we have a "match type" integer parameter added, to solve the problem of a growing number of variants for every conceivable way of matching a needle against a haystack (at least for the foreseeable future)?

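This match type would have the following options (numeric values):
* MATCH_EXACT: the current LSL stock-standard full exact match.
* MATCH_PARTIAL: finds the needle string anywhere within the target.
* MATCH_PREFIX: finds the needle specifically at the start of the target.
* MATCH_SUFFIX: finds the needle specifically at the end of the target.
* MATCH_REGEX: deploys the magical wonders of regex in the match.

And then supplement those with some bit flags (up at the top end):
* MATCH_NOCASE: case-insensitive matching.
* MATCH_REVERSED: swaps the needle and haystack strings.

(A few of the usual regex flags may also be suitable to add here; there should still be plenty of bits available for a very long time.)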
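To make the proposed parameter concrete, here is a minimal sketch in Python (not LSL) of how a single match type integer might be decoded. The constant names come from the request; the numeric values, the flag bit positions, the helper name `string_match`, and the choice of "search anywhere" semantics for MATCH_REGEX are all assumptions made for illustration.

```python
import re

# Hypothetical values: base match types occupy the low bits...
MATCH_EXACT = 0
MATCH_PARTIAL = 1
MATCH_PREFIX = 2
MATCH_SUFFIX = 3
MATCH_REGEX = 4

# ...and flags sit near the top of a signed 32-bit integer (positions assumed).
MATCH_NOCASE = 1 << 30
MATCH_REVERSED = 1 << 29

FLAG_MASK = MATCH_NOCASE | MATCH_REVERSED


def string_match(needle: str, haystack: str, match_type: int) -> bool:
    """Return True if needle matches haystack under the given match type."""
    base = match_type & ~FLAG_MASK  # strip the flags to get the base type
    if match_type & MATCH_REVERSED:
        needle, haystack = haystack, needle  # swap the two strings
    if base == MATCH_REGEX:
        flags = re.IGNORECASE if match_type & MATCH_NOCASE else 0
        return re.search(needle, haystack, flags) is not None
    if match_type & MATCH_NOCASE:
        needle, haystack = needle.casefold(), haystack.casefold()
    if base == MATCH_EXACT:
        return needle == haystack
    if base == MATCH_PARTIAL:
        return needle in haystack
    if base == MATCH_PREFIX:
        return haystack.startswith(needle)
    if base == MATCH_SUFFIX:
        return haystack.endswith(needle)
    raise ValueError(f"unknown match type: {base}")
```

With these assumed values, `MATCH_SUFFIX | MATCH_NOCASE` asks for a case-insensitive suffix match in one integer argument, which is the point of the proposal: one extra parameter instead of a separate function per combination.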


Original Jira Fields

| Field | Value |
| ------------- | ------------- |
| Issue | BUG-233078 |
| Summary | LSL (Near) Future Proofed Inexact Matching Plan |
| Type | New Feature Request |
| Priority | Unset |
| Status | Accepted |
| Resolution | Accepted |
| Created at | 2022-12-14T08:22:54Z |
| Updated at | 2023-05-29T16:10:03Z |

```
{
  'Build Id': 'unset',
  'Business Unit': ['Platform'],
  'Date of First Response': '2022-12-14T13:11:40.526-0600',
  'ReOpened Count': 0.0,
  'Severity': 'Unset',
  'Target Viewer Version': 'viewer-development',
}
```

How would you like the feature to work?

In new LSL functions that do any kind of searching or matching, could we have a "match type" integer parameter added, to solve the problem of a growing number of variants for every conceivable way of matching a needle against a haystack (at least for the foreseeable future)?

This match type would have the following options (numeric values):
* MATCH_EXACT: the current LSL stock-standard full exact match.
* MATCH_PARTIAL: finds the needle string anywhere within the target.
* MATCH_PREFIX: finds the needle specifically at the start of the target.
* MATCH_SUFFIX: finds the needle specifically at the end of the target.
* MATCH_REGEX: deploys the magical wonders of regex in the match.

And then supplement those with some bit flags (up at the top end):
* MATCH_NOCASE: case-insensitive matching.
* MATCH_REVERSED: swaps the needle and haystack strings.

(A few of the usual regex flags may also be suitable to add here; there should still be plenty of bits available for a very long time.)

Why is this feature important to you? How would it benefit the community?

As above, people keep wanting various matching options, especially in the hope of regex permeating the whole of LSL searching and matching. There are already cases with more than one function to do the same search in different ways, and even cases where regex is deployed as the only option, a single "do everything" solution. But I imagine regex is a little heavyweight when all you want is a prefix match. There are also some more esoteric matches that are not particularly complex to do as a built-in, and that utterly lose the benefits of the built-in search functions if you try to implement them in LSL, yet really aren't worth adding a whole raft of variant functions just to support them everywhere. That is before the multiplicative effect of the flags: with five search-type functions, just the three most common match types, and just the nocase flag (which pretty much EVERYONE wants, btw!!!), that's already 5 × 3 × 2 = 30 functions to provide those few options on a consistent basis, and if you add one in one place, you're going to get calls to add it everywhere else, too. That is an understandable reason for _huge_ push-back against adding any new match options, especially the more esoteric ones.

To possibly better explain my thinking, I imagine this being implemented by taking the match type parameter and the "needle" string, and passing them to a function that returns a suitable matcher delegate, which is then used for comparisons against whatever the function is searching within (list items, LSD keys, eKVP keys, etc.). I also imagine the MATCH_REVERSED flag would generally be handled by removing that flag, calling the match type resolver function recursively, and then wrapping the returned delegate in another delegate that simply flips the arguments. Reversed matching might be problematic for regex (though I hope not, because I have a use for it right now), in which case I'd recommend throwing a debug error and doing nothing, to allow it to be implemented later if a means is found.

Applying this pattern going forward (and maybe even retroactively to some of the still bleeding-fresh additions, where developers should still be active enough to make the necessary updates) will allow for the addition of less common matches like a suffix match, and leave regex for the complex cases where it's actually needed, because you _know_ people are going to use the regex "^.*suffix$". (This method would also allow the existing specific match type functions to be retroactively labelled "convenience shortcuts".)

Then, ASAP, a simple llStringMatch function taking two strings and this match integer parameter would also be much preferred to having to use llListFindListMatch (or maybe llListFindListEx(tended)), with its necessity of wrapping the two strings into lists. Both of these functions would also take a start parameter (with a negative start to match backwards), and the list version a stride parameter (with the more sane stride ("slice") semantics).

### If this goes ahead, some additional "wishlist" options:

MATCH_GLOB would be another useful match type, offering a kind of simplified regex (it could be trivially converted into regex):
* `.` matches a normal full stop character.
* `*` is `/.*/`.
* `?` matches `/.?/`.
* `[]` is the same as in regex, with `[^]` being functionally equivalent to the regex dot ("not nothing").
* _Maybe_ `|`, `^`, and `$` as in regex as well. Without grouping, `^` and `$` take precedence, with an implied group around everything in between.

Those last three could be offered behind a MATCH_EXTENDED "extended syntax" switch (which may come in useful down the road with the regex matching also), as could the `[]` match pairing up with an immediately subsequent `*` or `?`, also as in regex.

A MATCH_NUMERIC flag would augment the string matches by treating a sequence of one or more digits as a number: any time the comparison goes to compare two digits, it would consider the sequence of consecutive digits from that point on in each string as though it were a single "character" of that ordinal value.

And yet another (unlikely) option I would personally _love_ to see is a MATCH_CUSTOM match option that allows a custom match predicate to be used; the string would be the name of an appropriate function (or an additional ID parameter passed to an event), which is passed the needle and the haystack strings, and the user function (or event) then returns 0 or !0 as appropriate. I can see a _whole_ bunch of issues with this one, though, and it would be utterly obviated by _reliable_ iteration or batching over the KVPs (as per another of my issues, a start and count parameter is not sufficient to give this in the presence of concurrent addition or removal of items). I'd also mention that using an event for this would be quite horrible, requiring either multiple states, or for it to be structured as a switch statement, causing a whole bunch of extra string comparisons before it even gets to the actual test.
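The matcher-delegate idea described in the "Why is this feature important to you?" field can be sketched roughly as follows, in Python rather than LSL. Everything beyond the constant names is an assumption for illustration: the numeric values, the names `resolve_matcher` and `find_in_list`, the two-argument delegate shape, and raising an exception where the request suggests a debug error for a reversed regex.

```python
import re
from typing import Callable, Sequence

# A delegate answers: does `needle` match `haystack`?
Matcher = Callable[[str, str], bool]

# Hypothetical numeric values and flag positions.
MATCH_EXACT, MATCH_PARTIAL, MATCH_PREFIX, MATCH_SUFFIX, MATCH_REGEX = range(5)
MATCH_NOCASE = 1 << 30
MATCH_REVERSED = 1 << 29
FLAG_MASK = MATCH_NOCASE | MATCH_REVERSED


def resolve_matcher(match_type: int, needle: str) -> Matcher:
    """Turn a match type (plus the needle, for regex compilation) into a delegate."""
    base = match_type & ~FLAG_MASK
    if match_type & MATCH_REVERSED:
        if base == MATCH_REGEX:
            # The pattern is compiled from the needle up front, so flipping
            # the arguments has no obvious meaning; refuse for now.
            raise NotImplementedError("MATCH_REVERSED with MATCH_REGEX")
        # Remove the flag, resolve recursively, then flip the arguments.
        inner = resolve_matcher(match_type & ~MATCH_REVERSED, needle)
        return lambda n, h: inner(h, n)
    if base == MATCH_REGEX:
        flags = re.IGNORECASE if match_type & MATCH_NOCASE else 0
        pattern = re.compile(needle, flags)
        return lambda n, h: pattern.search(h) is not None
    if match_type & MATCH_NOCASE:
        inner = resolve_matcher(base, needle)
        return lambda n, h: inner(n.casefold(), h.casefold())
    simple = {
        MATCH_EXACT: lambda n, h: n == h,
        MATCH_PARTIAL: lambda n, h: n in h,
        MATCH_PREFIX: lambda n, h: h.startswith(n),
        MATCH_SUFFIX: lambda n, h: h.endswith(n),
    }
    if base not in simple:
        raise ValueError(f"unknown match type: {base}")
    return simple[base]


def find_in_list(items: Sequence[str], needle: str, match_type: int) -> int:
    """Return the index of the first matching item, or -1 if none matches."""
    matches = resolve_matcher(match_type, needle)  # resolved once per search
    for index, item in enumerate(items):
        if matches(needle, item):
            return index
    return -1
```

The search loop never inspects the match type itself, so under this design a new match type would only touch the resolver, not each search function.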
sl-service-account commented 1 year ago

JIRAUSER341305 commented at 2022-12-14T08:42:10Z

I will also add that, as an interim measure, if the regex implementation is deemed good enough to be considered a viable "do everything" match (i.e. if you'd be at least almost willing to replace all the current exact string matches with regex calls), then instead of a match type parameter it might be possible to have a "match formatter" function that takes a match string and a match type and outputs a suitable regex string.

This DOES NOT cover the MATCH_REVERSED flag, however, which I still believe to be exceedingly useful.

However, such a function is already needed in LSL purely for the purpose of regex-escaping; I am afraid a good many people are going to be passing user-supplied text more or less directly into the regex parser, and just hoping said user hasn't put anything "regexy" into it. So a function like llMakeRegex(userSuppliedText, MATCH_PREFIX) would go a long way toward averting much impending doom.

I would like to stress, however, that this is NOT a complete substitute for adding a match type parameter, but merely a workable interim patch with its own additional utility.
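As a rough illustration of the "match formatter" idea in this comment, here is a Python sketch of what an llMakeRegex-style helper might do: escape the user-supplied text, then anchor it according to the match type. The function name `make_regex`, the numeric constant values, and the use of `^`/`$` anchors are assumptions; the regex dialect an LSL version would target is not stated in the request.

```python
import re

# Hypothetical numeric values for the match types named in the request.
MATCH_EXACT, MATCH_PARTIAL, MATCH_PREFIX, MATCH_SUFFIX = range(4)


def make_regex(text: str, match_type: int) -> str:
    """Build a regex that matches `text` literally under the given match type."""
    escaped = re.escape(text)  # neutralise anything "regexy" in the input
    if match_type == MATCH_EXACT:
        return "^" + escaped + "$"
    if match_type == MATCH_PARTIAL:
        return escaped
    if match_type == MATCH_PREFIX:
        return "^" + escaped
    if match_type == MATCH_SUFFIX:
        # Note: in Python, `$` also matches just before a trailing newline.
        return escaped + "$"
    raise ValueError(f"unsupported match type: {match_type}")


# User-supplied text full of regex metacharacters is matched literally.
pattern = make_regex("1+1=2 (really?)", MATCH_PREFIX)
is_prefix = re.search(pattern, "1+1=2 (really?) yes") is not None
```

As the comment notes, a formatter like this cannot express MATCH_REVERSED, since swapping needle and haystack is not something a pattern string alone can encode.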

sl-service-account commented 1 year ago

Spidey Linden commented at 2022-12-14T19:11:41Z

Issue accepted. We have no estimate when it may be implemented. Please see future release notes for this fix.