tmtoku commented 2 years ago

I tried to use #pragma in the DSL, but an error occurred.

For example, I inserted #pragma unroll 2 in the second line of sample/c++/Nbody/kernel.pikg.

F64 eps2
#pragma unroll 2
rij = EPI.pos - EPJ.pos
r2 = rij * rij + eps2
r_inv  = rsqrt(r2)
r2_inv = r_inv * r_inv
mr_inv  = EPJ.mass * r_inv
mr3_inv = r2_inv * mr_inv
FORCE.acc -= mr3_inv * rij
FORCE.pot -= mr_inv

Then, I ran make and got the following error message.

conversion type: reference
epi name: Particle
epj name: Particle
force name: Particle
class file: particle.hpp
output file name: kernel.hpp
input file: kernel.pikg
/home/tom/pikg/src/gen_hash.rb:20:in `block (2 levels) in fusion_iotag': undefined method `fusion_iotag' for #<Pragma:0x0000558cf91a1dd8 @name="unroll", @option=["2"]> (NoMethodError)
    from /home/tom/pikg/src/gen_hash.rb:19:in `each'
    from /home/tom/pikg/src/gen_hash.rb:19:in `block in fusion_iotag'
    from /home/tom/pikg/src/gen_hash.rb:18:in `each'
    from /home/tom/pikg/src/gen_hash.rb:18:in `fusion_iotag'
    from /home/tom/pikg/src/gen_hash.rb:313:in `generate_alias'
    from /home/tom/pikg/src/parserdriver.rb:1681:in `<top (required)>'
    from ../../..//bin/pikg:2:in `require_relative'
    from ../../..//bin/pikg:2:in `<main>'
make: *** [Makefile:50: kernel.hpp] Error 1

Adding the following fusion_iotag method to the Pragma class makes the error go away, but the #pragma then has no effect on the generated kernel.

def fusion_iotag(iotag)
  []
end

https://github.com/FDPS/PIKG/blob/751ef58ac472acd648715a2024fe004f58877d59/src/intermediate_exp_class.rb#L754-L780
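
To illustrate why the no-op method silences the error, here is a minimal sketch. It assumes a pass that calls fusion_iotag on every node of the kernel body, which is what the backtrace suggests. The Statement class, the collect_tags helper, and the tag values are hypothetical stand-ins, not PIKG's actual code; only the names Pragma and fusion_iotag come from the error message.

```ruby
# Hypothetical stand-ins for PIKG's AST nodes. Only the class name Pragma and
# the method name fusion_iotag are taken from the error message.
class Statement
  def initialize(name)
    @name = name
  end

  # Assumption: an ordinary statement reports the variable it touches.
  def fusion_iotag(_iotag)
    [@name]
  end
end

class Pragma
  def initialize(name, option)
    @name = name
    @option = option
  end
end

# Mimics a pass that calls fusion_iotag on every node in the kernel body.
def collect_tags(nodes)
  nodes.flat_map { |node| node.fusion_iotag({}) }
end

NODES = [
  Pragma.new("unroll", ["2"]),
  Statement.new("rij"),
  Statement.new("r2")
].freeze

# Without the method, the first Pragma node aborts the whole pass.
ERROR_BEFORE_PATCH =
  begin
    collect_tags(NODES)
    nil
  rescue NoMethodError => e
    e.class
  end

# The workaround from this comment: a no-op that contributes no tags.
class Pragma
  def fusion_iotag(_iotag)
    []
  end
end

TAGS_AFTER_PATCH = collect_tags(NODES)

puts ERROR_BEFORE_PATCH.inspect
puts TAGS_AFTER_PATCH.inspect
```

The empty array lets the pass skip the pragma node, which would also explain why the pragma leaves the generated kernel unchanged at this point.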

subarutaro commented 2 years ago

Sorry for the late reply. I can't try your code right now because I am on vacation, but please try the --conversion-type A64FX option. Pragmas only work in A64FX mode.

tmtoku commented 2 years ago

Thank you for your reply.

I tried to run it in A64FX mode, but I still get the same error.

command

../../..//bin/pikg --conversion-type A64FX --epi-name Particle --epj-name Particle --force-name Particle --class-file particle.hpp --output kernel.hpp -i kernel.pikg

result

conversion type: A64FX
epi name: Particle
epj name: Particle
force name: Particle
class file: particle.hpp
output file name: kernel.hpp
input file: kernel.pikg
/home/tom/pikg/src/gen_hash.rb:20:in `block (2 levels) in fusion_iotag': undefined method `fusion_iotag' for #<Pragma:0x0000559c79605e08 @name="unroll", @option=["2"]> (NoMethodError)
    from /home/tom/pikg/src/gen_hash.rb:19:in `each'
    from /home/tom/pikg/src/gen_hash.rb:19:in `block in fusion_iotag'
    from /home/tom/pikg/src/gen_hash.rb:18:in `each'
    from /home/tom/pikg/src/gen_hash.rb:18:in `fusion_iotag'
    from /home/tom/pikg/src/gen_hash.rb:313:in `generate_alias'
    from /home/tom/pikg/src/parserdriver.rb:1681:in `<top (required)>'
    from ../../..//bin/pikg:2:in `require_relative'
    from ../../..//bin/pikg:2:in `<main>'

subarutaro commented 2 years ago

Hi, I am sorry for the very late reply. I have finally come back from vacation.

To use the unroll and loop_fission pragmas, do the following:

1. Add fusion_iotag to the Pragma class (as you did; this modification will be included in the next release).
2. Use both the --conversion-type A64FX and --strip-mining N options (N is the strip-mining size, e.g. 24).

Then write the unroll and loop_fission pragmas as shown below (N is the number you want to set):

#pragma unroll N

#pragma statement loop_fission_point

Please note that N for loop unrolling must be a divisor of the strip-mining size.
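
As a quick sanity check of the divisor rule, the following sketch lists the unroll factors that would be acceptable for a given strip-mining size. The helper names are hypothetical and are not part of PIKG; the size 24 is the example value mentioned above.

```ruby
# Returns every unroll factor N that evenly divides the strip-mining size.
def valid_unroll_factors(strip_mining_size)
  (1..strip_mining_size).select { |n| (strip_mining_size % n).zero? }
end

# True when n is a positive divisor of the strip-mining size.
def valid_unroll?(n, strip_mining_size)
  n.positive? && (strip_mining_size % n).zero?
end

puts valid_unroll_factors(24).inspect # => [1, 2, 3, 4, 6, 8, 12, 24]
puts valid_unroll?(2, 24)             # => true
puts valid_unroll?(5, 24)             # => false
```

So with --strip-mining 24, "#pragma unroll 2" satisfies the rule, while an unroll factor of 5 would not.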

tmtoku commented 2 years ago

Thank you! It worked after I added the '--strip-mining' option.

However, there is a syntax error in the innermost j-loop of the generated code.

"kernel.hpp", line 150: error: no suitable conversion function from "const PIKG::F64vec" to "float64_t" exists
  EPJ_pos_swpl0.v0 = svdup_n_f64(epj[(j+jj_swpl0)].pos).v0;

I think we need to call svdup_n_f64x3 instead of svdup_n_f64.

When type is F64vec, get_type_suffix_a64fx(type) returns f64, but the suffix of svdup_n_ should be f64x3. https://github.com/FDPS/PIKG/blob/751ef58ac472acd648715a2024fe004f58877d59/src/A64FX.rb#L89-L115

subarutaro commented 2 years ago

Thanks for the feedback. In our implementation, vector operations are split into scalar operations before the actual kernel generation. Thus, we should have "EPJ_pos_swpl0.v0 = svdup_n_f64(epj[(j+jj_swpl0)].pos.x)" here (and the same expression for v1 and v2 as well). Let me check and fix this.

tmtoku commented 2 years ago

Okay, I understand. For now, I was able to correct the generated code by running the following (vi) commands.

:%s/).v0/.x)
:%s/).v1/.y)
:%s/).v2/.z)
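
For anyone who would rather script the same workaround, here is a minimal sketch of the equivalent substitution in Ruby, applied to the failing line quoted above. The patch_line helper and the COMPONENTS table are hypothetical names. Like the vi commands, it rewrites every occurrence of ").v0", ").v1", and ").v2", so it assumes those patterns appear only in the broken lines.

```ruby
# Maps the generated component suffix to the corresponding F64vec member.
COMPONENTS = { "v0" => "x", "v1" => "y", "v2" => "z" }.freeze

# Rewrites "...pos).v0" into "...pos.x)", and likewise for v1 and v2.
# String arguments to gsub are matched literally, not as regular expressions.
def patch_line(line)
  COMPONENTS.reduce(line) do |text, (suffix, member)|
    text.gsub(").#{suffix}", ".#{member})")
  end
end

BROKEN_LINE = "EPJ_pos_swpl0.v0 = svdup_n_f64(epj[(j+jj_swpl0)].pos).v0;"

puts patch_line(BROKEN_LINE)
# => EPJ_pos_swpl0.v0 = svdup_n_f64(epj[(j+jj_swpl0)].pos.x);
```

To patch a whole file, the same helper could be mapped over the lines of kernel.hpp before compiling.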

FDPS / PIKG

Can't use #pragma in DSL #6
