pfalcon / re1.5

re1, the unbloated regexp engine by Russ Cox, elaborated to be useful for real-world applications
BSD 3-Clause "New" or "Revised" License
42 stars 4 forks source link

compilecode.c: Use _compilecode() to calculate the code size. #12

Closed ampli closed 9 years ago

ampli commented 9 years ago

See the commit message for the benefits.

Now re1_5_sizecode() returns -1 on regexp error. main() somehow already checks for that(!), even though the original re1_5_sizecode() didn't check for errors and could not return -1 at all. However, MicroPython's mod_re_compile() needs a fix.

dpgeorge commented 9 years ago

@ampli did you check that this patch makes the code size smaller? If so, on which architecture?

pfalcon commented 9 years ago

Well, it makes it larger. By fair amount. Due to EMIT() macro. But given its body size, I don't think that anything will help with - e.g. replacing with a function call won't give much gain because of need to pass extra arg to it (well, mere passing args is already a size-killer).

Then it's just common sense that doing sizecode vs compilecode check once takes less space than doing it:

$ grep EMIT compilecode.c | wc -l
32

times.

What can be done there more? Well, _compilecode could be turned into a macro and instantiated twice for size* and compile* part. That will make it severely unreadable and hard to modify, which is already against the original goal of making it less error-prone to modify parser. And then it's only next step to see that size* counterpart needs to do much less work than actual compile, so worth be written in adhoc manner. And that's how we arrive at the current implement.

@ampli : So, I'm afraid maintaining separate sizecode and compilecode is part of the game of staying the minimal possible regexp implementation.

ampli commented 9 years ago

The patch produces a slightly bigger code (i686), checked with -Os. $ size compilecode.o Before:

   text    data     bss     dec     hex filename
   1624       0       0    1624     658 compilecode.o

After:

   text    data     bss     dec     hex filename
   1756       0       0    1756     6dc compilecode.o

I.e. it actaully adds 132 bytes.

But I think this addition is a must nevertheless, because the current code has a major flaw: Even if the regexp contains an error, the current re1_5_sizecode never fails and returns a positive size. In such a case, there is then no guarantee that _compilecode() will always (including future additions) behave exactly the same and not overflow the allocated buffer. (And besides this critical problem, there are the benefits indicated in the commit message.)

In addition, there are features that are to be added to the VM code generation: {n,m}, Unicode support, non-greedy "?" (only non-greedy "*" and "+" has been added and ?? has been neglected), etc. This has the potential to compensate for the these added 132 bytes because there will not be a need to add any parallel code (or any code at all) to re1_5_sizecode(), while using the current scheme there will be a need to add code to both.

BTW, if saving such a code size (~130 bytes) is significant, I can try to reduce the generated code of the new _compilecode(). Changing insert_code() to a macro (like EMIT) saves 44 bytes. Defining pc to (prog->bytelen) saves another 24 bytes. Changing the EMIT and insert_code macros to check !code instead of sizecode reduces yet another 19 bytes. These changes also make the code faster. Total reduction by these changes: 87 bytes (not yet 132...). I validated that it still passes the tests after these changes.

pfalcon commented 9 years ago

Ah, size before and after (x86_64):

   text    data     bss     dec     hex filename
  12405     720      72   13197    338d re
   text    data     bss     dec     hex filename
  12533     720      72   13325    340d re
pfalcon commented 9 years ago

BTW, if saving such a code size (~130 bytes) is significant, I can try to reduce the generated code of the new _compilecode().

Yes, that's big amount for us. 16 bytes we can neglect, 32 put up with, but adding 100+ bytes here and there will make micropython (and re1.5 itself) not competitive in their niches.

So, if you can apply optimizations like you describe to compensate size increase, that would be very nice compromise. Using i686 for size comparisons is good things too, due to it passing args on stack.

ampli commented 9 years ago

Ok, I will send a pull request with the new compilecode.c that includes these code size optimizations. BTW, now that I am aware of how much a saving of 100 bytes is appreciated in this project, I will try to look at the rest of the code for similar savings.

ampli commented 9 years ago

Is it better that I will resend an altogether new pull request, or instead just add a pull request for these 3 optimizations?

dpgeorge commented 9 years ago

If you can save around 100 bytes with optimisations then I'd rather keep those 100 bytes than using them up straight away to combine calc size into compilecode :)

pfalcon commented 9 years ago

BTW, now that I am aware of how much a saving of 100 bytes is appreciated in this project, I will try to look at the rest of the code for similar savings.

Well, we don't want to burden you with size optimizations on their own, but if making refactors which increase size, looking how to offset them back to stay close to the original is appreciated.

Another thing to keep in mind is stack usage. Why my question about static'izing "ByteProg dummyprog". It adds 16 bytes of stack usage, and again, in embedded environment, it's wealth of bytes (consider 1K is not being a small stack). For compile() case, it's not important - we can assume that people should pre-compile their stuff if they care about efficiency, but for match it would be concern, especially if can be called recursively.

pfalcon commented 9 years ago

than using them up straight away to combine calc size into compilecode :)

Well, @ampli gives good reasons why combining is beneficial. We're going to spend more size anyway for new features, and if code duplication blocks him from implementing/contributing these new features, what can be said, especially if he points at DoS/attack vectors with current code. The only thing to ask is that if new features will be #ifdef'ed away, that ~ old size was maintained ;-).

dpgeorge commented 9 years ago

This has the potential to compensate for the these added 132 bytes because there will not be a need to add any parallel code (or any code at all) to re1_5_sizecode(), while using the current scheme there will be a need to add code to both.

This is not necessarily true: the current scheme (2 separate functions) is smaller than the combined scheme. Therefore, if you add new features then each feature might take more space in the combined scheme than in the separated scheme.

pfalcon commented 9 years ago

Is it better that I will resend an altogether new pull request, or instead just add a pull request for these 3 optimizations?

So, you see, you commit your changes to master, and this pull request already contains unrelated changes. Ideally, each new feature should be added to a separate branch, and reactions to issues raised during review, should come as separate commits on that branch, to ease further review. Once everything is done it usually makes sense to squash stuff together (depends, you can leave that to me).

The above of course scale worse when working on many features in parallel, so I'm from my side is eager to have short review cycles ;-).

pfalcon commented 9 years ago

This is not necessarily true: the current scheme (2 separate functions) is smaller than the combined scheme. Therefore, if you add new features then each feature might take more space in the combined scheme than in the separated scheme.

I guess that absolute difference will stay about the same, but relative percentage will diminish, while effort to maintain it grow with "higher velocity".

ampli commented 9 years ago

I have just sent a new pull request (#14) that includes the previous code + space optimization. I am closing this discussion and we can continue there if needed.