eddic / fastcgipp

fastcgi++: A C++ FastCGI and Web development platform:
https://fastcgipp.isatec.ca
GNU Lesser General Public License v3.0
310 stars 94 forks source link

Fix record.size() #48

Closed Erroneous1 closed 6 years ago

Erroneous1 commented 6 years ago

Use max chunk of 0xfff8 (0xffff rounded down to nearest chunk size) Use same formula for calculating record_size everywhere

It looks like where every other function was doing sizeof(Header) + (data size rounded down to nearest chunk size), dump(std::istream&) was doing (data size + sizeof(Header)) rounded down to nearest chunk size.

eddic commented 6 years ago

I like the idea of having single function do that math but after mulling over it for a few days I'm not convinced the math is correct.

The maximum content length is in fact 0xffff; not 0xfff8. It is the total record size that should be a multiple of chunkSize giving a max record size of 0xffff(content)+0x1(padding)+0x8(header)=0x10008. So I would argue that all the 0xfff8s should in fact stay as 0xffffs. Now this is certainly not to say the code was correct before but I think we should keep the 0xfff8s as 0xffff.

As far as I can tell, changing the 0xfff8s to 0xffffs only really breaks dump(const char* data, size_t size) right? This one could potentially end up with a content length of 0x10000. Can this be solved by changing

        header.contentLength = std::min(
                size,
                record.size()-sizeof(Protocol::Header));

to

        header.contentLength = std::min(size, 0xffff)

?

Argh. This is getting complicated. Anyway. Apologies I'm just going to usurp this issue. Try out commit aabf0a0a and see if it sorts everything. I believe this should put this one to rest now.

Erroneous1 commented 6 years ago

Ok so I looked at some specs and yeah, the content length can be up to 0xffff. You then take the length of the entire record without 0 padding and pad it out to your chunk size.

I'll work on this tomorrow with my constexpr branch. I think I'll also return the padding size. I can use C++14 constexpr with loops, temporary variables, etc, right?

eddic commented 6 years ago

Yeah a I'd considered returning the padding as well. The math seems quite consistently record.size()-header.contentLength-sizeof(header). Did you try aabf0a0? Does it help?

Erroneous1 commented 6 years ago

I'd like to make a test for this. Would you recommend looking at tests/transceiver.cpp? I figure it'd basically make a string that's more than 65535 bytes long with incrementing value, "send it", and verify on the other end that we got the right number of characters of the right value.

Erroneous1 commented 6 years ago

I finally tried the latest master branch (sorry for the delay). It does appear to work. When I implement this as constexpr, do you want to keep it accessible from the header or just put it in the implementation file?

eddic commented 6 years ago

Yeah I don't feel there'd be any performance gains to be made from inline this particular function would there? Might as well just leave it as a subroutine.

Erroneous1 commented 6 years ago

Performance is hard to comment about without having concrete numbers and benchmarks, and it will change over time as compilers get better. I played with it in compiler explorer with a simple function that makes a std::unique_ptr<char[]> when given a contentLength, and the code it generates when it knows the implementation (same file) is identical to constexpr (of course). When it is a subroutine it does a total of 8 extra instructions (GCC 7.2) or 4 extra instructions (Clang 6.0.0) once you add the content of getRecordSize to the code.

Excluding labels and the instructions for getRecordSize(), instruction count was 28, 21 (GCC,Clang) for subroutine, and 26,23 for constexpr for a single use. The compilers seem to think it'd be faster.

All in all I'd guess code size would stay about the same and performance would probably be imperceptibly faster using constexpr instead of keeping as subroutine. Personally I like to make constexpr anything that I can even if it will only be called at runtime. If the compiler deems it faster to inline then it can, but if it deems it better to not inline (maybe -Os) then it can also do that.

eddic commented 6 years ago

Well I don't doubt that inlining/constexpring getRecordSize() would ultimately lead to a lower instruction count but I question the value of said lower instruction count given the context. I just can't imagine it contributing in any way shape or form to overall performance gains. Lately I've been on a kick of trying not to over inline. I found I'd made the 2.x fastcgi++ over inlined and the compile process really annoyed me.

Anyway, my vote would be on not inlining/constexpring this specific function but if it's something you really feel strongly about then you are welcome to move it into the header file should you so desire.

eddic commented 6 years ago

I'd like to make a test for this. Would you recommend looking at tests/transceiver.cpp? I figure it'd basically make a string that's more than 65535 bytes long with incrementing value, "send it", and verify on the other end that we got the right number of characters of the right value.

I agree that a test is in order. Perhaps in the protocol test?