Alexey-T / CudaText

Cross-platform text editor, written in Free Pascal
Mozilla Public License 2.0
2.39k stars 166 forks source link

Wrapped regex back search doesn't work #5538

Closed pintassilgo closed 3 weeks ago

pintassilgo commented 1 month ago
test
test

Enable .* and O. Search for test. Press F3 many times: it works. Press Shift+F3 many times: it no longer works after you reach the match in first line because find previous + regex + wrapped doesn't work.

It works on any other editor such as VSCode, Sublime and Kate, it's clearly a bug.

Alexey-T commented 1 month ago

My text:

wwwww
ee
www
ee
www
ee

What text do you search for?

pintassilgo commented 1 month ago

What text do you search for?

Initial comment has steps to reproduce.

pintassilgo commented 1 month ago

wrapped back search for 'w{3,}' or 'www' cannot wrap

This is the same bug. It should wrap.

pintassilgo commented 1 month ago

Your example is interesting.

Searching for www and pressing Shift+F3 when current match is the third line, it first matches "wwWWW", then "wWWWw", then it fails.

Other editors find "WWWww" even in backward search.

Firefox finds "wwWWW" just like Cuda, but next Shift+F3 already wraps, it doesn't match an intersection of other match.

Alexey-T commented 1 month ago

@User4martin Hello. Did you test backward search in TRegExpr? Somehow it is not working for simple strings: w{3,}, www.

pintassilgo commented 1 month ago

Cuda only matches intersection of other matches when regex is enabled. With regex off, it has the same results as Firefox.

But it's weird because find highlights aren't updated. Find highlights are always in normal direction, so in first line the highlight is always "WWWww".

pintassilgo commented 1 month ago

wrapped back search for 'w+' does wrapping Ok

It's with bug too. Look carefully. It's unable to match the first "w" of the first line. Cumulative Shift+F3 matches up to "wWWWW". Next Shift+F3 should produce "WWWWW", but instead it wraps and matches the last W from the last line.

pintassilgo commented 1 month ago

By the way, this result doesn't sound right to me. w+ should always match all consecutive w.

If its wwwwwwwwww, "w+" should match all the W's from the start, not adding one by one each time you press Shift+F3. It works as expected for F3, so why for Shift+F3 it only matches a single W instead of the entire W-word?

Alexey-T commented 1 month ago

Yes, this is the issue too.

Alexey-T commented 1 month ago

Solved part of the issue. now back search wraps OK. Tofix: back search for 'w+' cannot find longest text in 'wwwwwww'.

@user4martin, don't bother, it is my task.

pintassilgo commented 1 month ago

After thinking a bit, I believe Firefox and Cuda are in the wrong side, other editors and Chromium-based browsers are right.

Having different results for normal and backward search is confusing, as the wrong higlights prove. Backward find should mean just navigate results index backward, as the name says: "find previous" i.e. navigate to the previous index among all the occurrences in a normal search.

So if we have this:

wwwww
www

And we search for "www", the match in first line should be always WWWww, never wwWWW, even when using "find previous".

So if current match is in line 2, "find previous" should jump to first match, not to run find again in opposite direction (from right to left).

The only scenario to match wwWWW when you search for WWW is if caret is right after the second w (i.e. ww|www), because search always starts after the caret..

Alexey-T commented 1 month ago

Hm, this new idea is not simple to implement. I didn't do it yet. Beta updated, let's test.

pintassilgo commented 1 month ago

I don't know what changed in the beta, it didn't fix the main issue. Repeat steps from initial comment, the issue is present the same way, it's still stuck to first match. When current match is at line 2 and you press Shift+F3, current match is switched to line 1, but then Shift+F3 no longer works because it doesn't wrap to line 2.

pintassilgo commented 1 month ago

I see beta fixed the w+ thing, But not the main issue.

Alexey-T commented 1 month ago

Indeed. On this text this is reproduced more weird:

test
test
test
test
test
Alexey-T commented 1 month ago

Fixed, beta updated.

pintassilgo commented 1 month ago

Thanks, I confirm initial issue is fixed.

The last thing is what I suggested here. In other words:

"Find previous" should be the same as "Find next", with the only difference being the index of current match. While find next gets i+1, find previous retrieves i-1.

So if you text is:

wwwww
ee
www

And you search for www, the match in line 1 must be the "WWWww" even when pressing Shift+F3.

pintassilgo commented 1 month ago

I believe this would make code even simpler. Because it will no longer require two different searches for find next and find previous. They will be the same, the only difference is in the index of current match (i+1 for next, i-1 for previous).

Alexey-T commented 1 month ago

I do not have easy idea how to implement it (without first getting all matches, and jump between them - this is bad idea).

Alexey-T commented 1 month ago

Getting all matches first - on big files - is very slow and very memory consuming.

pintassilgo commented 1 month ago

You could reuse Hi code. If Hi was able to detect previous match, just select it on Shift+F3. If there is no previous match on Hi range, then keep current way.

This would make Shift+F3 consistent with Hi marks, which is currently an issue when you press Shift+F3 as in the wwwww example when you search for www then press Shift+F3.

Alexey-T commented 1 month ago

Easy to say it, but it is not the thing which I can do: considering Hi marks. finder is indenendent thing, from Hi code. I decline this last idea. I don't know how to do it w/o getting all matches first.

pintassilgo commented 1 month ago

Well, I can only think of, if not reusing Hi code, on cloning part of it to finder (or at least its logic). So finder would get all matches only within some limited range before the caret to avoid degraded performance. If there is previous match within this range, select it (when user pressed Shift+F3). It there is not, run current Cuda find previous.

Alexey-T commented 1 month ago

all matches only within some limited range before the caret

Not easy to detect this range length. Find-back can jump very far, e.g. to the file begin.

pintassilgo commented 1 month ago

Find-back can jump very far, e.g. to the file begin.

The range can be whatever you find reasonable to not degrade performance, for instance 100k chars or 1k lines (whichever is reached first). If find-back can find a match within this range, stop and jump to it. If it can't, do what Cuda already does today when user presses Shift+F3.

So for a giant file where caret is at the end and wwwww is at the beginning, find-prev for "www" would be as it is today (which I think it's wrong), but that's a rare exception. For most cases, this would fix the expected behavior. So it's not a perfect solution, but it's the best I can imagine without hurting performance, and I believe the result would be good enough.

Alexey-T commented 1 month ago

Let this issue be opened for some time (month?). Seems I don't want to implement it now. code will be very dirty/heavy, bad.