McStasMcXtrace / McCode

The home of the McStas (neutrons) and McXtrace (x-rays) Monte-Carlo ray-tracing instrument simulation codes.
https://github.com/McStasMcXtrace/McCode/wiki
GNU General Public License v3.0

GPUhack follow-up: Investigate and improve SPLIT on GPU #967

Closed willend closed 3 years ago

willend commented 4 years ago

The SPLIT subfolder has a semi-functional SPLIT provided as a hack, but it looks like it is implemented in a semi non-vectorised way. (Symptoms: intensity and stats are off from the expected values due to "re-use" of the incoming particle.)
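To illustrate the "re-use" symptom, here is a minimal host-side sketch, assuming the usual SPLIT idea that each incoming particle is repeated N times from a saved state, each copy carrying weight p/N. The `particle_t` type, `trace_after_split` and both summing functions are hypothetical stand-ins, not the generated McStas code.

```c
#include <assert.h>

/* Hypothetical, minimal particle state: a position and a Monte Carlo weight. */
typedef struct {
    double x;
    double p;
} particle_t;

/* Stand-in for the components that follow SPLIT: modifies the particle
 * in place, here by moving it forward and halving its weight. */
static void trace_after_split(particle_t *n)
{
    n->x += 1.0;
    n->p *= 0.5;
}

/* Intended behaviour (assumed): every copy starts from the saved incoming
 * state with weight p / nsplit. Returns the summed outgoing weight. */
double split_sum_restored(particle_t incoming, int nsplit)
{
    assert(nsplit > 0);
    double sum = 0.0;
    for (int i = 0; i < nsplit; i++) {
        particle_t copy = incoming;      /* restore the saved state */
        copy.p = incoming.p / nsplit;    /* share the weight between copies */
        trace_after_split(&copy);
        sum += copy.p;
    }
    return sum;
}

/* Faulty variant: the same particle is re-used without being restored,
 * so each pass starts from the already-modified state. */
double split_sum_reused(particle_t incoming, int nsplit)
{
    assert(nsplit > 0);
    double sum = 0.0;
    particle_t n = incoming;
    n.p = incoming.p / nsplit;
    for (int i = 0; i < nsplit; i++) {
        trace_after_split(&n);           /* state carries over between passes */
        sum += n.p;
    }
    return sum;
}
```

With an incoming weight of 1 and `nsplit = 10`, the restored variant sums to the same weight a single unsplit pass would give (0.5 in this toy model), while the re-used variant sums to much less. This is only meant to show why re-use skews intensity, not to reproduce the actual bug.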

We should likely instead

willend commented 4 years ago

What is currently on the branch Enable-SPLIT-on-GPU-(cogen-changes) sort of works - with caveats.

Two examples of this are stored in subfolders SPLIT2 and SPLIT3 with different types of issues:

willend commented 4 years ago

@climbcat, @farhi: I experimented with an edit on line 322 in cogen.c.in, which seems to have no effect on the above. Any other good ideas are very welcome.

index 7488096ef..3f2afc8de 100644
--- a/mcstas/src/cogen.c.in
+++ b/mcstas/src/cogen.c.in
@@ -322,7 +322,7 @@ static void cogen_defundef(struct comp_inst *comp, List l, char define_it)
           } /* for List */
         }
         if (flag_noconflict && strlen(c_formal->id))
-          coutf("  #define %s (instrument->_parameters.%s)",
+          coutf("  #define %s (_instrument_var._parameters.%s)",
             c_formal->id, c_formal->id);
         break;
       case PAR_UNDEF:
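
To make the effect of that one-line change concrete, here is a small stand-alone sketch (not the actual cogen code) of the two `#define` forms the generator would emit. The helper name `format_param_define` and its `use_global` flag are made up for illustration; the real code calls `coutf()` with `c_formal->id`.

```c
#include <stdio.h>
#include <string.h>

/* Format the #define that maps an instrument parameter name onto its storage.
 * use_global != 0 selects the _instrument_var form from the patch,
 * otherwise the original pointer form is produced.
 * Returns -1 for an empty name (mirroring the strlen() guard in the patch),
 * else the snprintf() result. */
int format_param_define(char *buf, size_t size, const char *id, int use_global)
{
    if (strlen(id) == 0)
        return -1;
    const char *base = use_global ? "_instrument_var." : "instrument->";
    return snprintf(buf, size, "  #define %s (%s_parameters.%s)", id, base, id);
}
```

For a parameter named `dummy`, the pointer form yields `#define dummy (instrument->_parameters.dummy)` and the patched form yields `#define dummy (_instrument_var._parameters.dummy)`.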

climbcat commented 4 years ago

Hi @willend, is the instr par used to read the split count/size? (I.e. the number that defaults to 10.)

EDIT: Ok, no this is comp->split...

climbcat commented 4 years ago

The code you wanted to find is located in the "cogen_comp_init_par" function, around line 534.

Having a look at the SPLIT2 and SPLIT3 experiments...

willend commented 4 years ago

Hmm, I am not sure it is around 534? I see no -> assignments there. The problematic line (crashing in SPLIT3/doesntwork) is

int SplitS_coll2 = instrument->_parameters.dummy;

i.e. the use of the instrument pointer with the dummy input parameter. What works is if I hack the code and write the below reference to the _instrument_var instead

int SplitS_coll2 = _instrument_var._parameters.dummy;

(Or, as shown in the SPLIT3 top-level hack, a DECLARE parameter initialised with extra pragmas in the instr file.)
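
For reference, the two access forms can be compared in a plain host-side sketch. The struct layout below is an assumption, heavily simplified from whatever the generated code really contains; only the names `instrument`, `_instrument_var`, `_parameters` and `dummy` come from this thread. On the CPU both forms read the same storage; the thread only reports that the pointer form misbehaves in the GPU build.

```c
/* Assumed, simplified layout; the real generated struct has more fields. */
struct instrument_parameters { int dummy; };
struct instrument_struct { struct instrument_parameters _parameters; };

/* The global instance and a pointer to it, named as in the thread. */
struct instrument_struct _instrument_var = { { 10 } };
struct instrument_struct *instrument = &_instrument_var;

/* Pointer form: relies on `instrument` holding a valid address
 * wherever this code runs. */
int split_count_via_pointer(void)
{
    return instrument->_parameters.dummy;
}

/* Direct form: refers to the global struct by name, without the
 * extra indirection through the pointer. */
int split_count_via_global(void)
{
    return _instrument_var._parameters.dummy;
}
```

Both functions return 10 here, and they track each other when `_instrument_var._parameters.dummy` is changed, which is why the difference only shows up once the code is offloaded.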

climbcat commented 4 years ago

Related to what you wrote in 1652bde6c9f61d824413a8c63dce79d63a8006a7 ...

A while back you changed the acc_attach( (void*)&_instrument_var ); to use the instrument var rather than the pointer. This seems to be a side effect of that...

Does everything work perfectly if we hard-code all split counts to be 10 for the moment?

willend commented 4 years ago

A hard-coded 10 seems to work in most cases indeed. (PSI_DMC has something to do with code in the Beamstop comp)

So you are suggesting an acc_attach( instrument); could work? I will try that ASAP.

willend commented 4 years ago

Update: The above acc_attach(instrument); seems to make no difference.

It would be really nice if we could ditch the use of instrument->_parameters.var and simply use _instrument_var._parameters.var, which does work. I just can't currently figure out where that code is written / generated... :-)

climbcat commented 4 years ago

Ok, wonder what shows up in a search of instrument->parameters in cogen.c.in and mccode-r* ?

willend commented 4 years ago

Regarding the above grep, not a lot:

willend commented 4 years ago

Just ran a test from the said branch. So far it looks like these SPLIT instruments work (compile):

once/McStas_GPU_PGCC_TESLA_KISS/HZB_NEAT/HZB_NEAT.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_D2B/ILL_D2B.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_D4/ILL_D4.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_H142_D33/ILL_H142_D33.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_H143_LADI/ILL_H143_LADI.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_H22_D1A/ILL_H22_D1A.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_H22_D1B/ILL_H22_D1B.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_H22_VIVALDI/ILL_H22_VIVALDI.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_H53_D16/ILL_H53_D16.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_IN5_Mantid/ILL_IN5_Mantid.out
once/McStas_GPU_PGCC_TESLA_KISS/ISIS_GEM/ISIS_GEM.out
once/McStas_GPU_PGCC_TESLA_KISS/LLB_6T2/LLB_6T2.out
once/McStas_GPU_PGCC_TESLA_KISS/PSI_DMC/PSI_DMC.out
once/McStas_GPU_PGCC_TESLA_KISS/PSI_Focus/PSI_Focus.out
once/McStas_GPU_PGCC_TESLA_KISS/SAFARI_MPISI/SAFARI_MPISI.out
once/McStas_GPU_PGCC_TESLA_KISS/SAFARI_PITSI/SAFARI_PITSI.out
once/McStas_GPU_PGCC_TESLA_KISS/Test_PowderN_Res/Test_PowderN_Res.out
once/McStas_GPU_PGCC_TESLA_KISS/Test_StatisticalChopper/Test_StatisticalChopper.out
once/McStas_GPU_PGCC_TESLA_KISS/h8_test_legacy/h8_test_legacy.out
once/McStas_GPU_PGCC_TESLA_KISS/linup-4/linup-4.out
once/McStas_GPU_PGCC_TESLA_KISS/linup-5/linup-5.out
once/McStas_GPU_PGCC_TESLA_KISS/linup-6/linup-6.out
once/McStas_GPU_PGCC_TESLA_KISS/linup-7/linup-7.out
once/McStas_GPU_PGCC_TESLA_KISS/templateDIFF/templateDIFF.out
once/McStas_GPU_PGCC_TESLA_KISS/templateNMX/templateNMX.out
once/McStas_GPU_PGCC_TESLA_KISS/templateNMX_TOF/templateNMX_TOF.out
once/McStas_GPU_PGCC_TESLA_KISS/templateSANS/templateSANS.out
once/McStas_GPU_PGCC_TESLA_KISS/templateSANS_Mantid/templateSANS_Mantid.out
once/McStas_GPU_PGCC_TESLA_KISS/templateTAS/templateTAS.out

And these don't:


ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/FZJ_BenchmarkSfin2/FZJ_BenchmarkSfin2.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/HZB_NEAT/2/HZB_NEAT.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/ILL_BRISP/ILL_BRISP.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/ILL_H15_IN6/ILL_H15_IN6.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/ILL_H512_D22/ILL_H512_D22.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/ILL_IN13/ILL_IN13.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/ILL_IN4/ILL_IN4.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/ILL_IN6/ILL_IN6.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/ILL_Lagrange/ILL_Lagrange.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/ISIS_SANS2d_Mantid/ISIS_SANS2d_Mantid.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/RITA-II/RITA-II.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/RTP_DIF/RTP_DIF.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/RTP_Laue/RTP_Laue.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/RTP_SANS/RTP_SANS.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/SNS_BASIS/SNS_BASIS.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/Test_PreMonitor_nD/Test_PreMonitor_nD.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/Union_test_texture/Union_test_texture.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/templateSasView/templateSasView.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/templateSasView_Mantid/templateSasView_Mantid.out: No such file or directory

Of course, the failing ones may well fail for completely different reasons; an example is PSI_DMC, where the Beamstop seems to be the cause.

In total we have 88 compiles (`find once -name \*.out | wc -l` reports 88), just like on new-nightly. I will have a look at which end up producing data and which don't.

willend commented 4 years ago

Test output in terms of data

Running tests...
BNL_H8                                :   0.87
BNL_H8_simple                         :   0.57
BTsimple                              :   NO COMPILE
Demo_shape_primitives                 :   NO TEST
ESS_2001_bispectral                   :   NO COMPILE
ESS_2015_test                         :   NO COMPILE
ESS_Brilliance_2001                   :   NO COMPILE
ESS_Brilliance_2013                   :   NO COMPILE
ESS_Brilliance_2014                   :   NO COMPILE
ESS_Brilliance_2015                   :   NO COMPILE
ESS_Brilliance_TDR                    :   NO COMPILE
ESS_IN5_reprate                       :   0.65
ESS_Testbeamline_HZB_V20              :   NO COMPILE
ESS_butterfly_Guide_curved_test       :   NO COMPILE
ESS_butterfly_MCPL_test               :   NO COMPILE
ESS_butterfly_test                    :   NO TEST
ESS_butterfly_tfocus_NOFOCUS_test     :   NO COMPILE
ESS_butterfly_tfocus_test             :   NO COMPILE
ESS_mcpl2hist                         :   NO COMPILE
FZJ_BenchmarkSfin2                    :   NO COMPILE
FZJ_KWS2_Lens                         :   0.59
FZJ_SANS_KWS2_AnySample               :   0.70
FZJ_SANS_KWS2_AnySample_2             :   0.59
FZJ_SANS_KWS2_AnySample_3             :   0.68
FZJ_SANS_KWS2_AnySample_4             :   0.59
Gallmeier_SNS_decoupled_poisoned      :   NO TEST
Granroth_SNS_decoupled_poisoned       :   NO COMPILE
HZB_FLEX                              :   NO TEST
HZB_NEAT                              :   7.79
HZB_NEAT_2                            :   1.22
Histogrammer                          :   NO COMPILE
ILL_BRISP                             :   NO COMPILE
ILL_D2B                               :   0.73
ILL_D4                                :   0.64
ILL_H10_IN8                           :   0.68
ILL_H113                              :   0.67
ILL_H13_IN20                          :   0.68
ILL_H142                              :   0.73
ILL_H142_D33                          :   NO TEST
ILL_H142_IN12                         :   0.69
ILL_H143_LADI                         :   NO TEST
ILL_H15                               :   0.67
ILL_H15_D11                           :   0.68
ILL_H15_IN6                           :   NO COMPILE
ILL_H16                               :   0.61
ILL_H16_IN5                           :   4.22
ILL_H16_IN5_2                         :   0.77
ILL_H16_IN5_Mantid                    :   4.15
ILL_H16_IN5_Mantid_2                  :   4.19
ILL_H16_Mantid                        :   NO TEST
ILL_H22                               :   0.72
ILL_H22_D1A                           :   0.70
ILL_H22_D1B                           :   0.75
ILL_H22_VIVALDI                       :   0.79
ILL_H24                               :   0.69
ILL_H25                               :   0.65
ILL_H25_IN22                          :   0.67
ILL_H5                                :   NO COMPILE
ILL_H512_D22                          :   NO COMPILE
ILL_H53                               :   0.65
ILL_H53_D16                           :   NO TEST
ILL_H53_IN14                          :   0.69
ILL_H8_IN1                            :   0.60
ILL_IN13                              :   NO COMPILE
ILL_IN4                               :   NO COMPILE
ILL_IN5                               :   0.73
ILL_IN5_Mantid                        :   4.09
ILL_IN6                               :   NO COMPILE
ILL_Lagrange                          :   NO COMPILE
ISIS_CRISP                            :   NO COMPILE
ISIS_GEM                              :   1.21
ISIS_HET                              :   1.19
ISIS_IMAT                             :   NO TEST
ISIS_MERLIN                           :   NO TEST
ISIS_OSIRIS                           :   NO COMPILE
ISIS_Prisma2                          :   NO COMPILE
ISIS_SANS2d                           :   NO TEST
ISIS_SANS2d_Mantid                    :   NO COMPILE
ISIS_TS1_Brilliance                   :   NO TEST
ISIS_TS2_Brilliance                   :   NO TEST
ISIS_test                             :   1.14
LLB_6T2                               :   NO TEST
MCPL2hist                             :   NO COMPILE
Mezei_SNS_decoupled_poisoned          :   NO COMPILE
PSI_DMC                               :   0.85
PSI_DMC_simple                        :   0.72
PSI_Focus                             :   0.84
PSI_source                            :   0.55
RITA-II                               :   NO COMPILE
RTP_DIF                               :   NO COMPILE
RTP_Laue                              :   NO COMPILE
RTP_NeutronRadiography                :   NO COMPILE
RTP_SANS                              :   NO COMPILE
Reflectometer                         :   NO COMPILE
SAFARI_MPISI                          :   NO TEST
SAFARI_PITSI                          :   NO TEST
SNS_ARCS                              :   NO COMPILE
SNS_BASIS                             :   NO COMPILE
SNS_analytic_test                     :   NO TEST
SNS_test                              :   NO COMPILE
Samples_Incoherent                    :   0.65
Samples_Incoherent_2                  :   0.58
Samples_Incoherent_3                  :   0.63
Samples_Incoherent_4                  :   0.57
Samples_Incoherent_5                  :   0.63
Samples_Incoherent_6                  :   0.58
Samples_Incoherent_7                  :   0.64
Samples_Incoherent_8                  :   0.58
Samples_Incoherent_9                  :   0.63
Samples_Incoherent_10                 :   0.57
Samples_Incoherent_off                :   0.53
Samples_Isotropic_Sqw                 :   7.35
Samples_Phonon                        :   NO COMPILE
Samples_vanadium                      :   0.58
TestSANS                              :   NO COMPILE
Test_Collimator_Radial                :   0.66
Test_Collimator_Radial_2              :   0.65
Test_Collimator_Radial_3              :   0.63
Test_FocalisationMirrors              :   0.62
Test_Jump_Iterate                     :   NO COMPILE
Test_Lens                             :   NO COMPILE
Test_MCPL_input                       :   NO COMPILE
Test_MCPL_output                      :   NO COMPILE
Test_Monitor_Sqw                      :   NO COMPILE
Test_Monitor_Sqw                      :   NO COMPILE
Test_Monochromators                   :   NO COMPILE
Test_Monochromators                   :   NO COMPILE
Test_Monochromators                   :   NO COMPILE
Test_Monochromators                   :   NO COMPILE
Test_Monochromators                   :   NO COMPILE
Test_PSD_Detector                     :   0.58
Test_Pol_Guide_Vmirror                :   NO COMPILE
Test_Pol_MSF                          :   NO COMPILE
Test_Pol_Mirror                       :   NO COMPILE
Test_Pol_SF_ideal                     :   NO TEST
Test_Pol_Set                          :   0.59
Test_Pol_TripleAxis                   :   NO COMPILE
Test_PowderN_Res                      :   NO TEST
Test_PreMonitor_nD                    :   NO COMPILE
Test_SSR_SSW                          :   NO COMPILE
Test_SSR_SSW_Guide                    :   NO COMPILE
Test_Sample_nxs_diffraction           :   NO COMPILE
Test_Sample_nxs_imaging               :   NO COMPILE
Test_Selectors                        :   0.57
Test_Selectors_2                      :   0.68
Test_Sources                          :   NO COMPILE
Test_Sources                          :   NO COMPILE
Test_Sources                          :   NO COMPILE
Test_Sources                          :   NO COMPILE
Test_Sources                          :   NO COMPILE
Test_Sources                          :   NO COMPILE
Test_Sources                          :   NO COMPILE
Test_Sources                          :   NO COMPILE
Test_StatisticalChopper               :   1.32
Tomography                            :   0.54
Union_IncoherentPhonon_test           :   NO COMPILE
Union_conditional_test                :   NO COMPILE
Union_conditional_test                :   NO COMPILE
Union_conditional_test                :   NO COMPILE
Union_demonstration                   :   NO COMPILE
Union_demonstration_absorption_image  :   NO COMPILE
Union_external_component              :   NO COMPILE
Union_external_component_test         :   NO COMPILE
Union_geometry_test                   :   NO COMPILE
Union_incoherent_validation           :   NO COMPILE
Union_laue_camera                     :   NO COMPILE
Union_logger_test                     :   NO COMPILE
Union_manual_example                  :   NO COMPILE
Union_powder_validation               :   NO COMPILE
Union_sample_picture_replica          :   NO COMPILE
Union_single_crystal_validation       :   NO COMPILE
Union_tagging_demo                    :   NO COMPILE
Union_test_absorption                 :   NO COMPILE
Union_test_absorption_image           :   NO COMPILE
Union_test_box                        :   NO COMPILE
Union_test_mask                       :   NO COMPILE
Union_test_powder                     :   NO COMPILE
Union_test_texture                    :   NO COMPILE
Union_time_of_flight                  :   NO COMPILE
Vin_test                              :   NO COMPILE
Vout_test                             :   NO TEST
h8_test_legacy                        :   NO TEST
linup-1                               :   0.56
linup-2                               :   0.61
linup-3                               :   0.56
linup-4                               :   0.64
linup-5                               :   0.56
linup-6                               :   0.65
linup-7                               :   0.56
micro                                 :   NO TEST
mini                                  :   0.61
nano                                  :   NO TEST
template                              :   NO TEST
templateDIFF                          :   0.69
templateLaue                          :   0.70
templateLaue_2                        :   0.72
templateLaueGPU                       :   NO COMPILE
templateLaueGPU_SXonly                :   NO COMPILE
templateLaueGPU_rngonly               :   NO COMPILE
templateNMX                           :   NO TEST
templateNMX_TOF                       :   NO TEST
templateSANS                          :   0.62
templateSANS_Mantid                   :   0.58
templateSasView                       :   NO COMPILE
templateSasView_Mantid                :   NO COMPILE
templateTAS                           :   0.73
templateTAS_2                         :   0.65
templateTOF                           :   NO COMPILE
templateTOF                           :   NO COMPILE
templateVanadiumMultipleScat_Mantid   :   NO COMPILE
template_simple                       :   NO TEST

willend commented 4 years ago

VERY hidden expression in cexp.c - see https://github.com/McStasMcXtrace/McCode/commit/cabe455a75e4e5b90af2a28bf98ae2146e456bbf

Will now run a test to see how this works across the suite.

willend commented 4 years ago

Furthermore, we found that enabling debugging using -g may have weird consequences in terms of both binary size and runtime behaviour... :-s