Closed wxtrac closed 2 years ago
regex_upstream.zip
(85.1 KiB)Upstream sources for src/regex from Tcl
Mike -
Unfortunately, I don't think wxRE_DEFAULT = wxRE_BEST is acceptable because the docs note that there are some incompatabilities between EXTENDED REs and ADVANCED REs.
Quick note to the lucky person who handles this patch - Be sure to modify/replace src/regex/README and src/regex/COPYRIGHT also...
Quick note to the lucky person who handles this patch - Be sure to modify/replace src/regex/README and src/regex/COPYRIGHT also...
I've redone regex_lib.diff to include updates to the README and COPYRIGHT. The README contains the details of what exactly is changed from the upstream sources.
Also, there is a marked section of regcustom.h that is supposed to be copied into regex.h by the build process. So I've added a script and called it from regex.bkl to do this (on unix at least). You don't have to have it, of course, if you don't think it useful.
And lastly, I noticed today that the bakefile hasn't been changed to give the unicode and non-unicode versions of the regex lib different names. So changed that too in regex.bkl.
We have tests for this BTW. Ryan located Henry Spencer's test suite (reg.test) and included it in the CVS, which was a great idea. We have a test program that can run these tests using wxRegEx. I was about to add a patch for it at the weekend, but I held it back when I saw talk of a testing framework.
The following files are from the previous regex version and can be cvs removed: engine.ih, mkh, regcomp.ih, regerror.ih, WHATSNEW. I don't think regex.3 is needed either.
I don't think regex.3 is needed either. Nope :).
Thanks Mike! Ryan
I've attached rxunittest.diff containing a test program for the regex.
Here's a quick-start checklist, starting in the wxWindows directory with clean sources:
$ unzip -d src/regex regex_upstream.zip $ patch -p0 < regex_lib.diff $ patch -p0 < reg.diff $ patch -p0 < rxunittest.diff $ cd build/bakefiles/ $ bakefile_gen -b wx.bkl $ bakefile_gen -b ../../samples/rxunittest/rxunittest.bkl $ cd ../.. $ autoconf
then cd BUILD_DIR, configure and make cd samples/rxunittest, make and run
With a unicode build 446 tests should succeed, 56 skipped, none should fail. For a non-unicode build around 64 should be skipped, depending on the locale, the rest should succeed.
Would be nice to apply this before 2.5.1 but I don't (and won't) have time to do it... I do think wxRE_BEST idea is not very good if it leads, as I understand it, to interpreting the same RE differently depending on build options. But, again, other than this, it would be good to have this in 2.5.1.
Thanks!
I do think wxRE_BEST idea is not very good if it leads, as I understand it, to interpreting the same RE differently depending on build options.
Ok, gone.
Mike
reg.diff
(41.9 KiB)updates wxRegEx and docs
Vadim - keep in mind that this changes the whole regex library and is bound to cause a couple of headaches.
If all you want to do is to add an wxRE_ADVANCED then you can do that with the current library with the exact same effect with only a couple lines of code - mayby this is the best for 2.5.1, then do this for 2.5.2?
Anyway, if you do choose to apply this patch, please work closely with Mike about the bugs - hopefully it will be less of a problem then last time :).
Mike - thanks for all your work with this - mayby this will let people forget about the first one :).
Ryan
Mike - thanks for all your work with this - mayby this will let people forget about the first one :).
It's all part of the same effort, and it's not going to be forgotten that you did much the work on it.
Teething troubles do get forgotten; it's how things end up that leaves a lasting impression. Hopefully it will end well.
Mike
configure.in is hard coded to just build the console sample when building wxBase. So in rxunittest.diff I've changed it to use the usual list of samples, then remove any that don't mention the console template 'wx_sample_console' in their bakefile. Does that sound reasonable?
I decided I didn't like having splice in the makefiles, to I've taken that out. I've left the script though, to be run manually if you want it.
Also there are some redundant makefiles in src/regex. I'm guessing these can go:
Makefile, makefile.sc, makefile.vc, makefile.wat, regex.dsp, regex.dsw
and there are needed:
regexM8.xml, regexCE.vcp
So the list of files to remove would be:
engine.ih Makefile makefile.sc makefile.vc makefile.wat mkh regcomp.ih regerror.ih regex.3 regex.dsp regex.dsw WHATSNEW
rxunittest.diff
(49.7 KiB)Tests
regex_lib.diff
(29.2 KiB)Updates to src/regex/
I've fixed up wxRegEx so that it can use the system library in Unicode mode (on Unix).
I don't think it's good idea to use wxConvCurrent, it means that the matching will sometimes fail in Unicode mode, so I prefer to play it safe and always use builtin Unicode version of libregex.
regex_upstream.zip contains the upstream sources (already imported)
Does this mean that no "cvs import" is needed/possible?
regex_upstream.zip contains the upstream sources (already imported)
Does this mean that no "cvs import" is needed/possible? I already did it a while back :).
Ryan
I've fixed up wxRegEx so that it can use the system library in Unicode mode (on Unix).
I don't think it's good idea to use wxConvCurrent, it means that the matching will sometimes fail in Unicode mode, so I prefer to play it safe and always use builtin Unicode version of libregex.
Ok, I will redo the patch tomorrow without this.
It seems a shame though. Did you notice that I added an error message so that it's clear when a conversion failure has happened, in 2.4.x you'd just get a segfault (or, presumably, intermittent incorrect behaviour on Windows).
And I also added an explanation of the limitations in the wxRegEx documentation page.
regex_upstream.zip contains the upstream sources (already imported)
Does this mean that no "cvs import" is needed/possible?
Yes, Ryan already imported them like so: cvs import -m "Import regex from tcl 8.4.5" wxWindows/src/regex RXSPENCER TCL_8_4_5
He didn't do a checkout -j since we didn't want to merge the current version with the upstream sources, but rather replace the current version with them.
The cvs manual does have a section titled 'Reverting to the latest vendor release' which might be useful, but it involves the 'cvs admin' command.
So perhaps it is sufficient to just unzip the upstream sources over the current files.
Thanks!
Ok, I will redo the patch tomorrow without this.
Don't bother, that's simple enough I can do it while applying...
It seems a shame though. Did you notice that I added an error message so that it's clear when a conversion failure has happened, in 2.4.x you'd just get a segfault (or, presumably, intermittent incorrect behaviour on Windows).
...but I admit I had only a quick look at it yesterday, so no, I didn't notice it. I think you're right, so let's apply it as-is and raise the issue at wx-dev, it can always be changed after 2.5.1.
So perhaps it is sufficient to just unzip the upstream sources over the current files.
I guess so, thanks!
Ok, I will redo the patch tomorrow without this.
Don't bother, that's simple enough I can do it while applying...
I re-did it already, so I can make the alternative patch available if it's needed, it will save you a few minutes anyway.
Also note that I did rxunittest.diff before the new unit testing framework was unveiled, so it puts a program in the samples directory. I've redone it now using cppunit, I'll send in a patch for that sometime soon.
But you can use the one there to verify the patch, and then 'patch -R' it before committing.
Mike
Applied (with a tiny change: I added wxHAS_REGEX_ADVANCED so that the check can be done in same way as for other features.h things). I'm leaving this open because of the testsuite.
Where can I find wxreg.test and maketestdata mentioned in the test suite (assuming reg.test is the one from src/regex)? I think it should be included with the test... Thanks!
Using env to find perl is better, thanks.
But on Linux the shell tries to invoke 'perl -w' as one name:
$ ./regex.pl /usr/bin/env: perl -w: No such file or directory
which is pretty strange. Looking on the net it seems the exact behaviour you get depends on the OS.
Anyway there's a new pragma 'use warnings' which I can use instead of -w so I've attached a patch to do that, and also change some comments which say that reg.test is in src/regex.
regex.pl puts UTF-8 into regex.inc by default. This is used in Unicode build only? If yes, then I think UTF-8 is fine, if not, then I'm not sure I understand how is it used...
In a non-Unicode build the UTF-8 is converted into the default 8-bit encoding, if that can't be done then it skips the test.
Having thought a bit more, I don't think '\u' escapes are an improvement, there are probably compilers that can compile cppunit that won't accept '\u'.
So I think I'd like to leave it as it is, unless you've got objections.
Mike
env.diff
(1.6 KiB)Also some of the files in src/regex aren't needed any more:
engine.ih regcomp.ih regerror.ih regex.3 mkh WHATSNEW
And can some of the makefiles go? Aren't they replaced by the bakefiles?
Makefile makefile.sc makefile.vc makefile.wat regexCE.vcp regex.dsp regex.dsw regexM8.xml
Thanks again, Mike
Thanks, I applied the patch and removed makefile.. regex.ds? is needed for compatibility project file in src/wxWindows.ds?, .{vcp,xml} for some other not-yet-baked makefiles. Wouldn't removing the files that are present in upstream package cause problems if we ever do cvs import again? If not, then I'll remove them, if yes, then I'd obviously prefer to leave them there, but I'm not fluent enough in CVS to know this... Thanks!
I applied the patch and removed makefile.*.
Thanks.
Wouldn't removing the files that are present in upstream package cause problems if we ever do cvs import again? If not, then I'll remove them, if yes, then I'd obviously prefer to leave them there,
That's a good point: I've done some investigation, and I think it's ok to do it.
To investigate this is what I did:
set up our current situation locally (i.e. with an original version imported as RXSPENCER ALPHA_3_8, then the new upstream sources imported as RXSPENCER TCL_8_4_5 but no checkout -j, then just overwrote and commited what we've got now in head),
then I did 'cvs remove' for Makefile, WHATSNEW, *.ih, mkh and regex.3, and commited that.
then I tried making a change to the upstream sources and imported (as RXSPENCER TCL_FUTURE) and this time did a checkout -j as normal.
I thought one possible problem might be that removed files might reappear from the vendor branch, but that didn't happend.
So everything worked and I ended up with what we have now with the upstream changes merged in.
Does that seem reasonable, what do you think? Mike
Thanks, removed from CVS.
Issue migrated from trac ticket # 6480
priority: high
2004-02-04 19:57:08: chiclero created the issue
reg.diff
flags
Adds a wxRE_ADVANCED flags to wxRegEx.
This flags is only defined when building with the builtin library.
I've done that the simplest way possible: I've added a define WX_NO_REGEX_ADVANCED which is set by configure when using the system library (rather than the other way around i.e. having a HAVE_REGEX_ADVANCED define). Let me know if it's not adequate.
Also adds a flag wxRE_BEST which is defined to be the best flavour available (extended or advanced).
I've also made: wxRE_DEFAULT = wxRE_BEST. I have a feeling everyone will disagree, if so that can be changed (a pity).
docs
I've updated the wxRegEx documentation, and also translated Henry Spencer's man page into a wxWindows 'Topic Overview'.
Translating a man page on regexs was difficult since every squigle is crucial, but hopefully it has worked out well. It looks clear here at least. If it doesn't work out so well on other people's computers/monitors then let me know and maybe I can try adjusting the bold/italics or something.
other stuff
I've fixed up wxRegEx so that it can use the system library in Unicode mode (on Unix). I can't see a reason to not allow it, though I've left the default as system for non-Unicode and builtin for Unicode.
I've updated the regex.bkl to disable the right set of warnings for BCC (having tried to verify it's ok to do so).
regex_lib.diff and regex_upstream.zip
Testing using BCC 5.5 has turned up some issues, and since my previous patch never got applied, I've made this patch relative to the current HEAD, i.e. this patch obsoletes the previous one.
regex_upstream.zip contains the upstream sources (already imported). They should be unzipped into src/regex, then the patch regex_lib.diff applied to them.