Closed nealef closed 7 years ago
Thanks for the patch! Currently traveling, but will review soon.
Changes pushed, comments inline -
@sbahra requested changes on this pull request.
Thanks for the patch! I left a few notes, I'll do another scan once addressed.
In include/gcc/s390x/ck_pr.hhttps://github.com/concurrencykit/ck/pull/98#discussion_r119541521:
include "ck_f_pr.h"
*
*
Is there a point to full serialization for ck_pr_stall? Either a no-op or if you really want to serialize, a load fence is probably sufficient here.
This operation turns into a one instruction 07e0 – a no-op that causes serialization.
In include/gcc/s390x/ck_pr.hhttps://github.com/concurrencykit/ck/pull/98#discussion_r119541639:
柳_pr_stall(void) { __sync_synchronize(); return; }
CK_CC_INLINE static void \
ck_pr_fence_strict_##T(void) \
{ \
__sync_synchronize(); \
}
*
Let's remove these comments not relevant to this target.
In build/ck.build.s390xhttps://github.com/concurrencykit/ck/pull/98#discussion_r119542219:
@@ -0,0 ,2 @@ 䬀撾=-m64 -O0 -g
These don't match production build flags elsewhere. How come -O0 is being used? If you're running into regression failures with optimization on, it probably indicates something is wrong in the ck_pr port below.
D’Oh. Left it in when I was examining the code being generated.
In include/gcc/s390x/ck_pr.hhttps://github.com/concurrencykit/ck/pull/98#discussion_r119542278:
柳_pr_faa_ptr(void *target, uintptr_t delta) { uintptr_t previous;
previous = __sync_fetch_and_add((uintptr_t *) target, delta);
return (void *)(previous);
}
CK_CC_INLINE static T \
ck_pr_faa_##S(T *target, T delta) \
{ \
T previous; \
\
previous = __sync_fetch_and_add(target, delta); \
Would there be an advantage to using the actual assembly here? Is __sync_fetch_and_add overly strict on S390X? I'd love insights into the generated assembly.
On the latest level of s390x hardware this is a one instruction: load-and-add:
"The second operand is added to the third operand, and the sum is placed at the second-operand location. Subsequently, the original contents of the sec- ond operand (prior to the addition) are placed unchanged at the first-operand location.
"All accesses to the second-operand location appear to be a block-concurrent interlocked-update refer- ence as observed by other CPUs and the I/O subsystem. A specific-operand-serialization operation is performed.”
On earlier s390x hardware this becomes the typical compare-and-swap sequence. Leaving it as a builtin means we don’t have to provide both asm versions.
In include/gcc/s390x/ck_pr.hhttps://github.com/concurrencykit/ck/pull/98#discussion_r119542351:
@@ -0,0 ,372 @@ *
- Copyright 2009-2015 Samy Al Bahra.
You should add yourself to the Copyright here as well.
Added.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/concurrencykit/ck/pull/98#pullrequestreview-41429113, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AAS2-mHth5rHAp99QwK3j453zXOgSvdaks5r_mTzgaJpZM4NbqUd.
Looks good. @nealef - Are you willing to be the s390x maintainer for Concurrency Kit? At that point, it makes sense to merge this down as it's a platform we can have ongoing support for. It should be a low maintenance task.
Sure. I do so for others.
Thanks
-------- Original message -------- From: Samy Al Bahra notifications@github.com Date: 6/10/17 15:10 (GMT-05:00) To: concurrencykit/ck ck@noreply.github.com Cc: Neale Ferguson neale@sinenomine.net, Mention mention@noreply.github.com Subject: Re: [concurrencykit/ck] Add s390x support (#98)
Looks good. @nealefhttps://github.com/nealef - Are you willing to be the s390x maintainer for Concurrency Kit? At that point, it makes sense to merge this down as it's a platform we can have ongoing support for. It should be a low maintenance task.
- You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/concurrencykit/ck/pull/98#issuecomment-307584611, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AAS2-oXJYCbEztyiLsyXJIq1rTlBiqneks5sCuorgaJpZM4NbqUd.
This PR adds s390x (aka "The Mainframe") support to ck. s390x is a big endian system so ck_ht_hash.h has logic added to perform byte swapping. With these changes all the regression tests pass on a CentOS 7.3.1611 system. I think I inadvertently added the generated file ck_f_pr.h to the push. I can remove and repush if that is necessary.