Closed ampli closed 9 years ago
@ampli did you check that this patch makes the code size smaller? If so, on which architecture?
Well, it makes it larger. By fair amount. Due to EMIT() macro. But given its body size, I don't think that anything will help with - e.g. replacing with a function call won't give much gain because of need to pass extra arg to it (well, mere passing args is already a size-killer).
Then it's just common sense that doing sizecode vs compilecode check once takes less space than doing it:
$ grep EMIT compilecode.c | wc -l
32
times.
What can be done there more? Well, _compilecode could be turned into a macro and instantiated twice for size* and compile* part. That will make it severely unreadable and hard to modify, which is already against the original goal of making it less error-prone to modify parser. And then it's only next step to see that size* counterpart needs to do much less work than actual compile, so worth be written in adhoc manner. And that's how we arrive at the current implement.
@ampli : So, I'm afraid maintaining separate sizecode and compilecode is part of the game of staying the minimal possible regexp implementation.
The patch produces a slightly bigger code (i686), checked with -Os.
$ size compilecode.o
Before:
text data bss dec hex filename
1624 0 0 1624 658 compilecode.o
After:
text data bss dec hex filename
1756 0 0 1756 6dc compilecode.o
I.e. it actaully adds 132 bytes.
But I think this addition is a must nevertheless, because the current code has a major flaw: Even if the regexp contains an error, the current re1_5_sizecode never fails and returns a positive size. In such a case, there is then no guarantee that _compilecode() will always (including future additions) behave exactly the same and not overflow the allocated buffer. (And besides this critical problem, there are the benefits indicated in the commit message.)
In addition, there are features that are to be added to the VM code generation: {n,m}, Unicode support, non-greedy "?" (only non-greedy "*" and "+" has been added and ?? has been neglected), etc. This has the potential to compensate for the these added 132 bytes because there will not be a need to add any parallel code (or any code at all) to re1_5_sizecode(), while using the current scheme there will be a need to add code to both.
BTW, if saving such a code size (~130 bytes) is significant, I can try to reduce the generated code of the new _compilecode(). Changing insert_code() to a macro (like EMIT) saves 44 bytes. Defining pc
to (prog->bytelen)
saves another 24 bytes. Changing the EMIT and insert_code macros to check !code
instead of sizecode
reduces yet another 19 bytes. These changes also make the code faster. Total reduction by these changes: 87 bytes (not yet 132...). I validated that it still passes the tests after these changes.
Ah, size before and after (x86_64):
text data bss dec hex filename
12405 720 72 13197 338d re
text data bss dec hex filename
12533 720 72 13325 340d re
BTW, if saving such a code size (~130 bytes) is significant, I can try to reduce the generated code of the new _compilecode().
Yes, that's big amount for us. 16 bytes we can neglect, 32 put up with, but adding 100+ bytes here and there will make micropython (and re1.5 itself) not competitive in their niches.
So, if you can apply optimizations like you describe to compensate size increase, that would be very nice compromise. Using i686 for size comparisons is good things too, due to it passing args on stack.
Ok, I will send a pull request with the new compilecode.c that includes these code size optimizations. BTW, now that I am aware of how much a saving of 100 bytes is appreciated in this project, I will try to look at the rest of the code for similar savings.
Is it better that I will resend an altogether new pull request, or instead just add a pull request for these 3 optimizations?
If you can save around 100 bytes with optimisations then I'd rather keep those 100 bytes than using them up straight away to combine calc size into compilecode :)
BTW, now that I am aware of how much a saving of 100 bytes is appreciated in this project, I will try to look at the rest of the code for similar savings.
Well, we don't want to burden you with size optimizations on their own, but if making refactors which increase size, looking how to offset them back to stay close to the original is appreciated.
Another thing to keep in mind is stack usage. Why my question about static'izing "ByteProg dummyprog". It adds 16 bytes of stack usage, and again, in embedded environment, it's wealth of bytes (consider 1K is not being a small stack). For compile() case, it's not important - we can assume that people should pre-compile their stuff if they care about efficiency, but for match it would be concern, especially if can be called recursively.
than using them up straight away to combine calc size into compilecode :)
Well, @ampli gives good reasons why combining is beneficial. We're going to spend more size anyway for new features, and if code duplication blocks him from implementing/contributing these new features, what can be said, especially if he points at DoS/attack vectors with current code. The only thing to ask is that if new features will be #ifdef'ed away, that ~ old size was maintained ;-).
This has the potential to compensate for the these added 132 bytes because there will not be a need to add any parallel code (or any code at all) to re1_5_sizecode(), while using the current scheme there will be a need to add code to both.
This is not necessarily true: the current scheme (2 separate functions) is smaller than the combined scheme. Therefore, if you add new features then each feature might take more space in the combined scheme than in the separated scheme.
Is it better that I will resend an altogether new pull request, or instead just add a pull request for these 3 optimizations?
So, you see, you commit your changes to master, and this pull request already contains unrelated changes. Ideally, each new feature should be added to a separate branch, and reactions to issues raised during review, should come as separate commits on that branch, to ease further review. Once everything is done it usually makes sense to squash stuff together (depends, you can leave that to me).
The above of course scale worse when working on many features in parallel, so I'm from my side is eager to have short review cycles ;-).
This is not necessarily true: the current scheme (2 separate functions) is smaller than the combined scheme. Therefore, if you add new features then each feature might take more space in the combined scheme than in the separated scheme.
I guess that absolute difference will stay about the same, but relative percentage will diminish, while effort to maintain it grow with "higher velocity".
I have just sent a new pull request (#14) that includes the previous code + space optimization. I am closing this discussion and we can continue there if needed.
See the commit message for the benefits.
Now re1_5_sizecode() returns -1 on regexp error. main() somehow already checks for that(!), even though the original re1_5_sizecode() didn't check for errors and could not return -1 at all. However, MicroPython's mod_re_compile() needs a fix.