tallforasmurf / PPQT

A post-processing tool for PGDP written in Python, PyQt4, and Qt
GNU General Public License v3.0

search/replace moves the page marker #115

Closed bibimbop closed 11 years ago

bibimbop commented 11 years ago

In the same vein as bug #113, a search/replace that spans one or more page breaks will move them.

For instance in

blabla1 --page break-- blabla2

replacing the tags (with something like (.*) => \1) will move the page break to the end of the replaced text:

blabla1 blabla2 --page break--

tallforasmurf commented 11 years ago

Yeah, sigh. It would, wouldn't it.

This is not susceptible to the fix I used in reflow. Reflow has the nice property that it only cares about whitespace and non-space. So I could stick in a ZWNJ (a nonspace) and reflow still works as long as it adds zero to the logical token length.
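To illustrate why the marker trick is harmless in reflow, here is a minimal sketch of a greedy word wrap that measures tokens by their logical length. It is not the actual PPQT reflow code; `logical_len` and `reflow` are hypothetical names, and the only assumption carried over from above is that the ZWNJ counts as non-space but adds zero to a token's length.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER, standing in for the page-break marker


def logical_len(token):
    """Length of a token as displayed, ignoring any ZWNJ markers in it."""
    return len(token) - token.count(ZWNJ)


def reflow(text, width):
    """Greedy word wrap (hypothetical) that uses logical token lengths."""
    lines, current, current_len = [], [], 0
    # str.split() does not treat ZWNJ as whitespace, so the marker
    # stays attached to its neighbouring word.
    for token in text.split():
        token_len = logical_len(token)
        # One extra column for the joining space if the line is not empty.
        needed = token_len + (1 if current else 0)
        if current and current_len + needed > width:
            lines.append(" ".join(current))
            current, current_len = [token], token_len
        else:
            current.append(token)
            current_len += needed
    if current:
        lines.append(" ".join(current))
    return lines


plain = "aaa bbb ccc ddd"
marked = "aaa bbb " + ZWNJ + "ccc ddd"  # marker sits where the page break was
```

Under these assumptions the marked and unmarked text break at the same places, and the marker is still present afterwards to show where the page break belongs.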

That would not be the case in regex replace. (n.b. I can't imagine this being a problem with non-regex replace because it is very hard to imagine a non-regex Find string that would span a page break. Well, ok, after you remove the ---pagebreak// lines and reflow, I suppose then you could search for "word1 word2" and find it spanning a page break. But how often does anyone Find a two-word phrase and replace it?)

But in a regex replace, I have the text that matched the regex pattern as a QString. And I invoke the regex replace on it, and put it back replacing the original found text. But if I first inserted ZWNJ, a Unicode character, the replace might replace the wrong characters, because of the inserted character pushing things sideways. Also the replace might actually replace the ZWNJ so it would be gone when I look for it to reposition the page break. So that don't work.
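A small sketch of the two failure modes, using Python's `re` module as a stand-in for the regex engine the tool actually uses. The sample text and patterns are made up for illustration.

```python
import re

ZWNJ = "\u200c"  # marker inserted where the page break falls

# Hypothetical found text with the marker between the two words.
marked = "word1 " + ZWNJ + "word2"

# Failure 1: a pattern with literal text around the marker position
# no longer matches, because the marker sits between the space and "word2".
literal_match = re.search(r"word1 word2", marked)

# Failure 2: a wildcard swallows the marker, so it is gone from the
# result and cannot be used to reposition the page break.
swallowed = re.sub(r"word1.*word2", "phrase", marked)
marker_survived = ZWNJ in swallowed
```

Here `literal_match` is `None` and `marker_survived` is `False`, which is the "pushed sideways" and "replaced away" behaviour described above.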

Say the found string is N chars long and starts at offset P. I note a page break J characters into it, so at P+J. The replacement string is M chars long, M could be 0. After the replace, the page break is now at the end of the replacement string, P+M. I could just arbitrarily move it back to a proportional location in the new string, P + int((J/N)*M). Not sure if that is worth doing.
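The proportional idea can be sketched as a small function. This is only an illustration of the arithmetic above, not code from PPQT; the function name is hypothetical, and it uses integer division instead of `int((J/N)*M)` to avoid float rounding, which is a choice made for the sketch.

```python
def reposition_page_break(p, n, j, m):
    """Return a proportional offset for a page break after a replace.

    p: document offset where the found text starts
    n: length of the found text (must be > 0)
    j: offset of the page break within the found text, 0 <= j <= n
    m: length of the replacement text (may be 0)
    """
    if n <= 0:
        raise ValueError("found text must not be empty")
    if not 0 <= j <= n:
        raise ValueError("page break must lie within the found text")
    # Same intent as P + int((J/N)*M), computed in integers.
    return p + (j * m) // n


# Page break halfway through a 20-char match, replaced by 10 chars.
halfway = reposition_page_break(100, 20, 10, 10)
# Replacement is empty, so the break collapses onto the start offset.
deleted = reposition_page_break(100, 20, 10, 0)
```

With an empty replacement (M = 0) the result is just P, and a break at the very end of the match (J = N) lands at P+M, which is where the break ends up today anyway.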

tallforasmurf commented 11 years ago

For the time being I am not going to make the change described above. The bug can be reopened if "page break displaced by regex replace" turns out to be a common problem (or if I think of a better way to fix it).

bibimbop commented 11 years ago

This bug should be reopened. Page markers should not move around silently.

tallforasmurf commented 11 years ago

There is a basic underlying problem that causes this, #113, and #147. Fix that and you fix all. Closing this; refer to #147 for more discussion.