ianhinder / Kranc

A Mathematica package for generating code for solving time dependent partial differential equations
http://kranccode.org
GNU General Public License v2.0
28 stars 10 forks source link

Sign does not work with vectorised code #64

Closed ianhinder closed 11 years ago

ianhinder commented 12 years ago

I believe that the current implementation of Sign in Kranc does not work with vectorised code. I worked around this problem by implementing the required conditional manually in the thorn script and avoiding use of Sign. Sign is currently implemented directly by the "sgn" function (MathematicaCompat.h) which is defined in GenericFD.h as

static inline CCTK_REAL sgn(CCTK_REAL x)
{
 return x==(CCTK_REAL)0.0 ? (CCTK_REAL)0.0 : copysign((CCTK_REAL)1.0, x);
}

Does an expert in vectorisation see why this would not work?

eschnett commented 12 years ago

This does not work because it accepts scalar input, and there is no functionality that converts this to a vector function, or which maps a scalar function onto vector elements.

The solution is to define a function ksgn in thorn Vectors, and to replace Sign with ksgn in Kranc when vectorizing.

ianhinder commented 12 years ago

Ah yes - that is fairly obvious :$ Now my question is why the compilation was successful? Any ideas?

eschnett commented 12 years ago

Grid functions are still stored as scalar arrays, not as arrays of vectors. The load/store operations then access multiple scalars at once.

If you apply sgn directly to an indexed grid function, sgn will return a scalar result. This will be replicated into a vector by default, as if subsequent array elements led to the same sign. Thus the code works.

If so, there is also a serious error, because grid functions must not be accessed directly, there must always be a vector load operation.

ianhinder commented 12 years ago

Now that I think about it, Sign was being applied to the "normal" array, which is not a grid function. So I'm not sure why it wasn't working. I will try to come up with a test case which shows that it fails, and then report back here. (some time)

eschnett commented 12 years ago

"normal" must be an array of vectors, not an array of scalars. Since we don't support int vectors, it should be a vector of reals.

rhaas80 commented 12 years ago

Currently my McLachlan BSSN fails to compile (this is on 4707e0c3f03cc52d82f88735e602c52cf74f0cbe):

/mnt/data/rhaas/postdoc/gr/Zelmani/arrangements/McLachlan/ML_BSSN/src/ML_BSSN_convertFromADMBaseGamma.cc:486:44: error: cannot convert ‘m128d {aka vector(2) double}’ to ‘ptrdiff_t {aka long int}’ in initialization

and the line in question is (the UNUSED is mine though it clearly does not matter):

CCTK_ATTRIBUTE_UNUSED ptrdiff_t dir1 = ksgn(beta1L);

Since using dir1 is an abuse of Sign to begin with, the proper fix seems to be to (a) make dir1 vector valued and (b) define a vector (of ptrdiff_t) typed Sign function that returns -1,0,1. Right now I don't know whether dir1 is even correctly computed for vectorised code or if this line is a bug since dir1 is computed only for (say) the first vector component and I agree with Erik that we would seem to have a bug, but one that is not so easily fixed since dir1,dir2 etc is a difference in index space but depends on the grid function value.

ianhinder commented 12 years ago

According to the automated tests (http://damiana2.aei.mpg.de/~ianhin/testreports/EinsteinToolkitTests/results.xml), the responsible commit was http://git.barrywardell.net/arrangements/LSUThorns/Vectors.git/commit/4b4418d0efb5b06784e89f617701c7b7cd5f6059:

author  eschnett <eschnett@105869f7-3296-0410-a4ea-f4349344b45a>    
Sat, 11 Aug 2012 22:55:54 +0200 (20:55 +0000)

Add ksgn function (vectorised version of Kranc's Sign)

All architectures: Add copysign and sgn functions.
Remove pos function (which does nothing).

Add support for Blue Gene/Q (QPX instructions).
Correct errors in AVX instructions.

The dirN variables are only used if splitUpwindDerivs = False, which is never the case in McLachlan_BSSN.m unless the source script is modified. The logic that usually removes unused shorthands is probably skipping shorthands which might be used in derivatives, as it is not clever enough to look up these usages.

Why is the use of Sign an abuse? Because we are converting between integers and reals?

ianhinder commented 12 years ago

I was wrong: the dirN variables ARE used. I have regenerated the code with the latest version of Kranc, but I still get a compilation failure:

/home/ianhin/Cactus/EinsteinToolkit/arrangements/McLachlan/ML_BSSN/src/ML_BSSN_Advect.cc(1079): error: a value of type "__m128d" cannot be used to initialize an entity of type "ptrdiff_t={long}" ptrdiff_t dir1 = ksgn(beta1L); ^

I don't know how to fix this.

Since using dir1 is an abuse of Sign to begin with, the proper fix seems to be to (a) make dir1 vector valued and (b) define a vector (of ptrdiff_t) typed Sign function that returns -1,0,1. Right now I don't know whether dir1 is even correctly computed for vectorised code or if this line is a bug since dir1 is computed only for (say) the first vector component and I agree with Erik that we would seem to have a bug, but one that is not so easily fixed since dir1,dir2 etc is a difference in index space but depends on the grid function value.

dir is not being used here as a difference in index space, unless splitUpwindDerivs = False. We compute the derivative as a linear combination of the symmetric and antisymmetric parts, and dir is used in the weights of these. As such, it could probably be real-valued. There is a horrible hack in Kranc which looks for the dirN variable and makes it ptrdiff_t instead of DataType[]. I'm experimenting with disabling that now, so that it ends up as CCTK_REAL_VEC. This will probably break splitUpwindDerivs = False, and hence might break the Georgia Tech code.

ianhinder commented 12 years ago

The proposed change to undo the ptrdiff hack is in the branch ianhinder/undoptrdiffhack (811178aed575c8ef0edd8de820c7766250490e06). I have regenerated McLachlan with this change and committed the result so that the ET builds again. What should we do in Kranc? Shall we undo the ptrdiff hack? This would mean that dirN becomes a CCTK_REAL[_VEC] again, and might break other code.

eschnett commented 12 years ago

Yes, this seems to be the correct solution. I tried it, and did not encounter any problems with the thorns I have.

Unvectorised code should have no trouble converting between integers and reals, and vectorised code can handle only reals.

rhaas80 commented 12 years ago

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

Hello Erik, Ian,

Yes, this seems to be the correct solution. I tried it, and did not encounter any problems with the thorns I have. I agree with this and the "buse" I luded to in my original comment is the hard coded hack in Kranc to make any dirN variable a ptrdiff_t :-).

Unvectorised code should have no trouble converting between integers and reals, and vectorised code can handle only reals. I believe this is not true if the real valued quantity appear in an index computation of an array where C only allows integers.

So gxxL = hxx[idx+dir1*cdi] needs dir1 to be an integer type.

Yours, Roland


My email is as private as my paper mail. I therefore support encrypting and signing email messages. Get my PGP key from http://keys.gnupg.net. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAlAn7coACgkQTiFSTN7SboVkEQCgx9dGlfddhlnHEb+/5ON+rUPR 538AoI9lZ1E+A4Nef4nlaqcEeasToL2D =j58V -----END PGP SIGNATURE-----

ianhinder commented 12 years ago

I have memories of this being an issue before, but I can't remember the details. The hack was introduced (3232e12d5fc96e9d84c9e1e6931f698342df377b) by Erik, presumably as a result of this previous issue. I think some compiler complained about the array index being real-valued. Maybe a better solution would be to notice that a real-valued variable was being used in a finite difference operator and explicitly cast it to an integer.

rhaas80 commented 12 years ago

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 08/12/2012 12:43 PM, ianhinder wrote:

I have memories of this being an issue before, but I can't remember the details. The hack was introduced (3232e12d5fc96e9d84c9e1e6931f698342df377b) by Erik, presumably as a result of this previous issue. I think some compiler complained about the array index being real-valued. Maybe a better solution would be to notice that a real-valued variable was being used in a finite difference operator and explicitly cast it to an integer. The problem only surfaced when a macro was used for CCTK_GFINDEX3D(i,j,k) in which case the real-ness of dirN could propagate into the actual [] operator. Usually CCTK_GFIndex3D was a inlined function (but Cactus messed up detecting inline functions on Kraken). Right now there seem to be KRANC_XXX macros that construct a [] index using dirN directly.

I used to have a hack for Kranc where it would respect Mathematica's Assumptions method ie:

$Assumptions=Element[dir, Integers]

but I assume that this will not work with vectorized code.

Yours, Roland


My email is as private as my paper mail. I therefore support encrypting and signing email messages. Get my PGP key from http://keys.gnupg.net. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAlAoCi0ACgkQTiFSTN7SboWsNgCfXwQghtTHiG0ciaFxC2fcuy4w Ys8An1SzcmKyIZGVcD9j5/iCLSBwIgwu =Bxw/ -----END PGP SIGNATURE-----

eschnett commented 12 years ago

This will not work with vectorisation because there is currently no API for a gather operation, i.e. for loading values from memory where every vector element potentially comes from a different address (because dir[] may be different for every vector element).

We may have to introduce two versions of dir, an integer idir (to be used for index calculations) and a real-valued rdir (to be used for multiplying stencil coefficients).

tbode commented 12 years ago

The Georgia Tech code does indeed break with the latest Kranc patch. Another small point regarding Sign, though ... with revision 2c2bde2690d7e422e85d4441dbed0000de9140f8 the Kranc-generated code replaces Sign by CactusGenCode_Private_sgn. As I see it, sgn needs to be added to the first bracket in Tools/CodeGen/Kranc.m.

eschnett commented 12 years ago

How does the code break -- what is the error message?

tbode commented 12 years ago

As described above: We use dir as an index which is not acceptable with the current development Kranc and vectorization. Kranc generates code fine, but compilation attempts abort with: invalid types 'const char*[double]' for array subscript. for the lopsided stencils. Seems I need to modify how I set up the differencing operators.

eschnett commented 12 years ago

With vectorisation, you cannot use dir as an index. This has never worked, and if it compiled, the results were wrong.

However, if you see the error above (using double as index) without vectorisation, then this may be a new problem that we can correct.

tbode commented 12 years ago

I see the error without vectorization as well.

ianhinder commented 12 years ago

I think we should detect the fact that a shorthand is being used in an FD operator, and cast it explicitly to the real type.

eschnett commented 12 years ago

I would like to correct this problem as soon as possible (being guilty of breaking things). There seem to be three solutions:

  1. undo my recent commits
  2. add cast to (ptrdiff_t) in index operations
  3. redefine or add a second version of the sgn() function so that it returns an int, not a real Ian, what do you suggest? I would prefer 3.
ianhinder commented 12 years ago

3 implies also reinstating the dirN hack into Kranc, right? I'm not sure what the best long-term solution is; I would have to think about it. In the short term, we could reinstate the hack, and use a modified version of sgn that returns an int. Can you make this work both with vectorisation and without?

2 is bad because it causes a cast on every fd operation, whereas previously there was only one cast per loop iteration. Long term we might want to improve the intelligence of Kranc so it can use integers where necessary in temporary variables for index calculations. I'm happy with doing something a bit ugly in the short term to get all the codes working properly again.

eschnett commented 12 years ago

On Wed, Aug 15, 2012 at 5:45 PM, ianhinder notifications@github.com wrote:

3 implies also reinstating the dirN hack into Kranc, right? I'm not sure what the best long-term solution is; I would have to think about it. In the short term, we could reinstate the hack, and use a modified version of sgn that returns an int. Can you make this work both with vectorisation and without?

I can make the Sign function work both with and without vectorisation. However, using it in index calculations won't work with vectorisation, because there is no efficient hardware support for this operation on current CPUs.

-erik

2 is bad because it causes a cast on every fd operation, whereas previously there was only one cast per loop iteration. Long term we might want to improve the intelligence of Kranc so it can use integers where necessary in temporary variables for index calculations. I'm happy with doing something a bit ugly in the short term to get all the codes working properly again.

— Reply to this email directly or view it on GitHubhttps://github.com/ianhinder/Kranc/issues/64#issuecomment-7770167.

Erik Schnetter schnetter@gmail.com http://www.perimeterinstitute.ca/personal/eschnetter/

eschnett commented 12 years ago

The attached patch, together with a trivial implementation of kisgn that aborts with an error in LSUThorns/Vectors, solves the problem for me. Please have a look.

-erik

On Thu, Aug 16, 2012 at 6:59 AM, Erik Schnetter schnetter@gmail.com wrote:

On Wed, Aug 15, 2012 at 5:45 PM, ianhinder notifications@github.comwrote:

3 implies also reinstating the dirN hack into Kranc, right? I'm not sure what the best long-term solution is; I would have to think about it. In the short term, we could reinstate the hack, and use a modified version of sgn that returns an int. Can you make this work both with vectorisation and without?

I can make the Sign function work both with and without vectorisation. However, using it in index calculations won't work with vectorisation, because there is no efficient hardware support for this operation on current CPUs.

-erik

2 is bad because it causes a cast on every fd operation, whereas previously there was only one cast per loop iteration. Long term we might want to improve the intelligence of Kranc so it can use integers where necessary in temporary variables for index calculations. I'm happy with doing something a bit ugly in the short term to get all the codes working properly again.

— Reply to this email directly or view it on GitHubhttps://github.com/ianhinder/Kranc/issues/64#issuecomment-7770167.

Erik Schnetter schnetter@gmail.com http://www.perimeterinstitute.ca/personal/eschnetter/

Erik Schnetter schnetter@gmail.com http://www.perimeterinstitute.ca/personal/eschnetter/

eschnett commented 12 years ago

I committed a correction. Things should work again.

-erik

On Thu, Aug 16, 2012 at 12:38 PM, Erik Schnetter schnetter@gmail.comwrote:

The attached patch, together with a trivial implementation of kisgn that aborts with an error in LSUThorns/Vectors, solves the problem for me. Please have a look.

-erik

On Thu, Aug 16, 2012 at 6:59 AM, Erik Schnetter schnetter@gmail.comwrote:

On Wed, Aug 15, 2012 at 5:45 PM, ianhinder notifications@github.comwrote:

3 implies also reinstating the dirN hack into Kranc, right? I'm not sure what the best long-term solution is; I would have to think about it. In the short term, we could reinstate the hack, and use a modified version of sgn that returns an int. Can you make this work both with vectorisation and without?

I can make the Sign function work both with and without vectorisation. However, using it in index calculations won't work with vectorisation, because there is no efficient hardware support for this operation on current CPUs.

-erik

2 is bad because it causes a cast on every fd operation, whereas previously there was only one cast per loop iteration. Long term we might want to improve the intelligence of Kranc so it can use integers where necessary in temporary variables for index calculations. I'm happy with doing something a bit ugly in the short term to get all the codes working properly again.

— Reply to this email directly or view it on GitHubhttps://github.com/ianhinder/Kranc/issues/64#issuecomment-7770167.

Erik Schnetter schnetter@gmail.com http://www.perimeterinstitute.ca/personal/eschnetter/

Erik Schnetter schnetter@gmail.com http://www.perimeterinstitute.ca/personal/eschnetter/

Erik Schnetter schnetter@gmail.com http://www.perimeterinstitute.ca/personal/eschnetter/