akwei / memcached

Automatically exported from code.google.com/p/memcached
0 stars 0 forks source link

Alignment check on in configure.ac is broken and results in addition memcpy()s #360

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Build memcached
2. NEED_ALIGN is always set to 1
3. The memcached code does an unneeded memmove

What is the expected output? What do you see instead?
NEED_ALIGN should be set to 0 on x86 and all modern ARM systems (ARMv6+)

What version of the product are you using? On what operating system?
head of git repo on linux 

Please provide any additional information below.

The line of the configure script is incorrect. It looks like the author wanted 
the numbers to be sequentially increasing, however 2 is repeated twice. 
 *buf =  1; *(buf +1) = 2; *(buf + 2) = 2; *(buf + 3) = 3; *(buf + 4) = 4;

The test for the buffer is incorrect, i should be dereferenced (*i).
 return (84148994 == i) ? 0 : 1;

Thanks,
Ali

Original issue reported on code.google.com by asa...@gmail.com on 16 Apr 2014 at 3:55

GoogleCodeExporter commented 9 years ago
Correct, it should be something like:

*buf =  1; *(buf +1) = 2; *(buf + 2) = 3; *(buf + 3) = 4; *(buf + 4) = 5;

Original comment by erichase...@gmail.com on 16 Apr 2014 at 6:00

GoogleCodeExporter commented 9 years ago
the pointer needs to be dereferenced for the compare as well: (84148994 == *i)

Original comment by asa...@gmail.com on 16 Apr 2014 at 7:50

GoogleCodeExporter commented 9 years ago
I was looking at this PR during my last round of merges:

https://github.com/memcached/memcached/pull/3

.. but didn't have any bigendian hardware to verify it on.

Original comment by dorma...@rydia.net on 16 Apr 2014 at 7:53

GoogleCodeExporter commented 9 years ago
It looks like that to pull request fixes the issue for little endian systems at 
least. The bug effects all systems big or little endian that don't need aligned 
accesses today. I don't have a big endian system either, but the fix seems to 
address the problem for both ARM and x86.

Original comment by asa...@gmail.com on 16 Apr 2014 at 11:12

GoogleCodeExporter commented 9 years ago
I'd like to test it first, if anyone's listening and has a machine available.

We had/have a sun box but it's been dead for a long time. Was going to set up a 
PPC VM but haven't done that yet.

Original comment by dorma...@rydia.net on 16 Apr 2014 at 11:16

GoogleCodeExporter commented 9 years ago
That would be nice, however the broken code, as it currently exists in the 
repository now, is 10-20% slower (depending on the particular machine I've 
tried) than the fixed code, so it's a non-trivial performance improvement to 
merge this, even if BE machines still have the issue. 

Original comment by asa...@gmail.com on 22 Apr 2014 at 9:01

GoogleCodeExporter commented 9 years ago
I'd like a regression test too (and a pony, and a million dollars).

Since I goofed something in .18 I think we'll be doing another release shortly. 
I'd really like this to go in but I really really want someone to just try it 
on a big endian machine. performance is pointless if it won't build, and many 
people won't report a problem.

but if I can rouse anyone anywhere with a BE machine maybe the best course is 
to stop caring. :/

Original comment by dorma...@rydia.net on 22 Apr 2014 at 9:10

GoogleCodeExporter commented 9 years ago
I've cleaned up dago's pull request into a clean, foolproof, AC_LANG_PROGAM. 

https://github.com/emcconville/memcached/commit/e55d4adfb7fd159c8b672bc5f428e811
1714e8b8

I have compiled & executed on powerpc, but haven't created a pull request as 
testapp always seems to fail @ "memcached.c:4252: drive_machine: assertion `0' 
failed".

How is the current issue 10-20% slower? Even if a false-nagitive exists, the 
condition "((long)(c->rcurr)) % 8 != 0" should prevent memmove from ever being 
called (unless it's needed).

https://github.com/memcached/memcached/blob/master/memcached.c#L3667

Original comment by erichase...@gmail.com on 23 Apr 2014 at 2:27

GoogleCodeExporter commented 9 years ago
Do the tests pass on powerpc under vanilla 1.4.18?

I think under binary protocol benchmarks the memmove calls end up happening 
quite a lot. At least, I remember seeing it happen but never went back to fix 
the issue :/ binprot is the only place where the NEED_ALIGN marker exists I 
think?

If you do something like a multiget or multiset it'll tend to call that memmove 
inbetween each command.

Original comment by dorma...@rydia.net on 23 Apr 2014 at 4:37

GoogleCodeExporter commented 9 years ago
Yes, I believe scope of change would be isolated to the binary protocol. 
Anyway, same results with vanilla 1.4.18. However, binary(-get).t seems to 
complete in both instances. I'll run & attach `prove' results this evening. Is 
there a test case for multiget & multiset? In my opinion NEED_ALIGN is 
completely safe, so I'm little hesitant of introducing a false-positive.

Original comment by erichase...@gmail.com on 23 Apr 2014 at 1:40

GoogleCodeExporter commented 9 years ago
I was using mc-crusher to test binary mgets and msets. There're multiget tests 
in t/binary.t as well.

I have no idea when that assert started under testapp for powerpc :/ I think we 
need to track that down first.

Original comment by dorma...@rydia.net on 24 Apr 2014 at 1:36

GoogleCodeExporter commented 9 years ago
Attached are the results of running `prove' (prove-vanillia.tgz is clean 
1.14.18, and prove-withfix.tgz includes the patch e55d4a). I haven't dug deep 
into the results, but the t/binary.t seems to be good. I'll execute 
mc-crusher's conf/binconf next, and continue to investigate the assertion issue.

Original comment by erichase...@gmail.com on 24 Apr 2014 at 3:55

Attachments:

GoogleCodeExporter commented 9 years ago
I don't think I see the testapp results in here? it's just the perl bits. half 
of the tests are in that C blob called testapp, which is why we have to run 
make test.

The rest seems good. I think we need to find out why the testapp broke, even if 
it's unrelated to this.. since otherwise we won't have full test coverage for 
this problem.

Original comment by dorma...@rydia.net on 24 Apr 2014 at 8:05

GoogleCodeExporter commented 9 years ago
Complete "make test" attached. I really couldn't pinpoint why testapp broke, 
but I was able to complete the test suite by adding an additional "usleep(1)" 
after "safe_send" call in "test_binary_quite_impl" function. I'm thinking the 
previously observed issues are more related to old hardware being old hardware. 

Anyway, running a diff between testapp-memcached-1.4.18.log (vanilla), and 
test-app-memcached-360fix.log (patch) didn't reveal major changes in behavior / 
performance. 

Original comment by erichase...@gmail.com on 25 Apr 2014 at 4:01

Attachments:

GoogleCodeExporter commented 9 years ago
Looks fine. can you submit the patches and I'll merge? I'd like to do a .19 
release in the next day or two.

Can you clear up a bit what the assertion was for testapp? You said it was 
asserting in memcached.c, not testapp? I'm not sure what git revision that was 
from since :4252 isn't an assert for me.

Original comment by dorma...@rydia.net on 25 Apr 2014 at 7:17

GoogleCodeExporter commented 9 years ago
Pull request created. 

https://github.com/memcached/memcached/pull/68

I'll continue working on the assertions, and open a new issue with more 
meaningful finds + gdb data. My initial thinking is that the testapp performs a 
`read' on socket while the daemon is still driving the previous `safe_send'. 
For :4252, the assert seems to be working as expected.

Original comment by erichase...@gmail.com on 26 Apr 2014 at 2:17

GoogleCodeExporter commented 9 years ago
Merged into master, good for the forthcoming 1.4.19 release.

Should still probably patch testapp but can happen whenever/separately. Pretty 
big performance boost on binprot from it.

Thanks, all! I really appreciate the help.

Original comment by dorma...@rydia.net on 28 Apr 2014 at 6:48