Closed willend closed 3 years ago
What is currently on the branch Enable-SPLIT-on-GPU-(cogen-changes) sort of works - with caveats.
Two examples of this are stored in subfolders SPLIT2 and SPLIT3 with different types of issues:
SPLIT2 - modified PSI_DMC.instr: There are specific issues related to Beamstop / ALLOW_BACKPROP. Putting an Arm there instead, and optionally including an `if(!SCATTERED) ABSORB;`, makes the instrument run.
SPLIT3 - minimalistic split instrument. Currently, the only way to use an input parameter for SPLIT is via the DECLARE / INITIALIZE blocks with glued pragmas. The reason is that the standard generated code, `int SplitS_coll2 = instrument->_parameters.dummy;`, somehow does not "resolve" on GPU, whereas hacking in `_instrument_var._parameters.dummy` makes the instrument run. @climbcat I can't seem to figure out where the right-hand side of that assignment is generated, any hints?
@climbcat, @farhi I experimented with an edit on line 322 in cogen.c.in, which seems to have no effect on the above. Any other good ideas are very welcome.
index 7488096ef..3f2afc8de 100644
--- a/mcstas/src/cogen.c.in
+++ b/mcstas/src/cogen.c.in
@@ -322,7 +322,7 @@ static void cogen_defundef(struct comp_inst *comp, List l, char define_it)
} /* for List */
}
if (flag_noconflict && strlen(c_formal->id))
- coutf(" #define %s (instrument->_parameters.%s)",
+ coutf(" #define %s (_instrument_var._parameters.%s)",
c_formal->id, c_formal->id);
break;
case PAR_UNDEF:
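One possible reading of why the edit above has no effect: a `#define` only rewrites bare uses of the identifier that appear while the macro is active, so a line where the generator writes out `instrument->_parameters.dummy` explicitly is untouched by it. A minimal host-side sketch of that distinction follows. The struct layout and the function names are simplified assumptions for illustration, not the actual generated code.

```c
#include <assert.h>

/* Hypothetical, stripped-down stand-ins for the generated instrument struct. */
struct instr_parameters { int dummy; };
struct instr_struct { struct instr_parameters _parameters; };

/* Global instance plus a pointer to it, named as in the thread. */
struct instr_struct _instrument_var = { { 7 } };
struct instr_struct *instrument = &_instrument_var;

/* Explicit member access, as if written out verbatim by the code generator.
   No macro is involved, so editing a #define elsewhere cannot change it. */
int explicit_access(void)
{
    return instrument->_parameters.dummy;
}

/* Macro form, in the spirit of the #define emitted via coutf in the diff.
   Only bare uses of the identifier between #define and #undef are rewritten. */
#define dummy (_instrument_var._parameters.dummy)
int macro_access(void)
{
    return dummy;
}
#undef dummy

/* On the host both forms read the same storage. */
void check_same_storage(void)
{
    assert(explicit_access() == macro_access());
}
```

On the host the two functions are interchangeable; the sketch says nothing about device behaviour, it only shows that the explicit form has to be changed where it is generated.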
Hi @willend, is the instr par used to read the split count/size? (I.e. the number that defaults to 10.)
EDIT: Ok, no this is comp->split
...
The code you wanted to find is located in the "cogen_comp_init_par" function, around line 534.
Having a look at the SPLIT2 and SPLIT3 experiments...
Hmm, I am not sure it is around 534? I see no `->` assignments there.
The problematic line (it crashes SPLIT3 / doesn't work) is
int SplitS_coll2 = instrument->_parameters.dummy;
i.e. the use of the instrument pointer with the dummy input parameter. What works is if I hack the code and write the below reference to _instrument_var instead:
int SplitS_coll2 = _instrument_var._parameters.dummy;
(Or, as shown in SPLIT3, the top-level hack of a DECLARE parameter initialised with extra pragmas in the instr file.)
Related to what you wrote in 1652bde6c9f61d824413a8c63dce79d63a8006a7 ...
A while back you changed the `acc_attach( (void*)&_instrument_var );` call to use the instrument var rather than the pointer. This seems to be a side effect of that...
Does everything work perfectly if we hard-code all split counts to be 10 for the moment?
A hard-coded 10 seems to work in most cases indeed. (PSI_DMC has something to do with code in the Beamstop comp)
So you are suggesting that an `acc_attach(instrument);` could work? I will try that ASAP.
Update: The above `acc_attach(instrument);` seems to make no difference.
It would be really nice if we could ditch the use of instrument->_parameters.var and simply use _instrument_var._parameters.var, which does work. I just can't currently figure out where that code is written / generated... :-)
Ok, I wonder what shows up in a search for instrument->_parameters in cogen.c.in and mccode-r*?
Wrt. the above grep, not a lot:
cogen.c.in:325: coutf(" #define %s (instrument->_parameters.%s)",
(as indicated above; changing that line seems to have no effect on the said issue)
mccode-r.h.in:349:#define INSTRUMENT_GETPAR(par) (instrument->_parameters.par)
(which btw cannot be the issue, as that is a macro / define)
Just ran a test from the said branch; so far it looks like these SPLIT instruments work (compile):
once/McStas_GPU_PGCC_TESLA_KISS/HZB_NEAT/HZB_NEAT.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_D2B/ILL_D2B.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_D4/ILL_D4.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_H142_D33/ILL_H142_D33.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_H143_LADI/ILL_H143_LADI.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_H22_D1A/ILL_H22_D1A.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_H22_D1B/ILL_H22_D1B.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_H22_VIVALDI/ILL_H22_VIVALDI.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_H53_D16/ILL_H53_D16.out
once/McStas_GPU_PGCC_TESLA_KISS/ILL_IN5_Mantid/ILL_IN5_Mantid.out
once/McStas_GPU_PGCC_TESLA_KISS/ISIS_GEM/ISIS_GEM.out
once/McStas_GPU_PGCC_TESLA_KISS/LLB_6T2/LLB_6T2.out
once/McStas_GPU_PGCC_TESLA_KISS/PSI_DMC/PSI_DMC.out
once/McStas_GPU_PGCC_TESLA_KISS/PSI_Focus/PSI_Focus.out
once/McStas_GPU_PGCC_TESLA_KISS/SAFARI_MPISI/SAFARI_MPISI.out
once/McStas_GPU_PGCC_TESLA_KISS/SAFARI_PITSI/SAFARI_PITSI.out
once/McStas_GPU_PGCC_TESLA_KISS/Test_PowderN_Res/Test_PowderN_Res.out
once/McStas_GPU_PGCC_TESLA_KISS/Test_StatisticalChopper/Test_StatisticalChopper.out
once/McStas_GPU_PGCC_TESLA_KISS/h8_test_legacy/h8_test_legacy.out
once/McStas_GPU_PGCC_TESLA_KISS/linup-4/linup-4.out
once/McStas_GPU_PGCC_TESLA_KISS/linup-5/linup-5.out
once/McStas_GPU_PGCC_TESLA_KISS/linup-6/linup-6.out
once/McStas_GPU_PGCC_TESLA_KISS/linup-7/linup-7.out
once/McStas_GPU_PGCC_TESLA_KISS/templateDIFF/templateDIFF.out
once/McStas_GPU_PGCC_TESLA_KISS/templateNMX/templateNMX.out
once/McStas_GPU_PGCC_TESLA_KISS/templateNMX_TOF/templateNMX_TOF.out
once/McStas_GPU_PGCC_TESLA_KISS/templateSANS/templateSANS.out
once/McStas_GPU_PGCC_TESLA_KISS/templateSANS_Mantid/templateSANS_Mantid.out
once/McStas_GPU_PGCC_TESLA_KISS/templateTAS/templateTAS.out
And these don't:
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/FZJ_BenchmarkSfin2/FZJ_BenchmarkSfin2.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/HZB_NEAT/2/HZB_NEAT.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/ILL_BRISP/ILL_BRISP.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/ILL_H15_IN6/ILL_H15_IN6.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/ILL_H512_D22/ILL_H512_D22.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/ILL_IN13/ILL_IN13.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/ILL_IN4/ILL_IN4.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/ILL_IN6/ILL_IN6.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/ILL_Lagrange/ILL_Lagrange.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/ISIS_SANS2d_Mantid/ISIS_SANS2d_Mantid.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/RITA-II/RITA-II.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/RTP_DIF/RTP_DIF.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/RTP_Laue/RTP_Laue.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/RTP_SANS/RTP_SANS.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/SNS_BASIS/SNS_BASIS.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/Test_PreMonitor_nD/Test_PreMonitor_nD.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/Union_test_texture/Union_test_texture.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/templateSasView/templateSasView.out: No such file or directory
ls: cannot access once/McStas_GPU_PGCC_TESLA_KISS/templateSasView_Mantid/templateSasView_Mantid.out: No such file or directory
Of course, the failing ones may well fail for completely different reasons; an example is PSI_DMC, where the Beamstop seems to be the cause.
In total we have 88 compiles (```find once -name \*.out | wc -l``` gives 88), just like on new-nightly. I will have a look at which end up producing data and which don't.
Test output in terms of data
Running tests...
BNL_H8 : 0.87
BNL_H8_simple : 0.57
BTsimple : NO COMPILE
Demo_shape_primitives : NO TEST
ESS_2001_bispectral : NO COMPILE
ESS_2015_test : NO COMPILE
ESS_Brilliance_2001 : NO COMPILE
ESS_Brilliance_2013 : NO COMPILE
ESS_Brilliance_2014 : NO COMPILE
ESS_Brilliance_2015 : NO COMPILE
ESS_Brilliance_TDR : NO COMPILE
ESS_IN5_reprate : 0.65
ESS_Testbeamline_HZB_V20 : NO COMPILE
ESS_butterfly_Guide_curved_test : NO COMPILE
ESS_butterfly_MCPL_test : NO COMPILE
ESS_butterfly_test : NO TEST
ESS_butterfly_tfocus_NOFOCUS_test : NO COMPILE
ESS_butterfly_tfocus_test : NO COMPILE
ESS_mcpl2hist : NO COMPILE
FZJ_BenchmarkSfin2 : NO COMPILE
FZJ_KWS2_Lens : 0.59
FZJ_SANS_KWS2_AnySample : 0.70
FZJ_SANS_KWS2_AnySample_2 : 0.59
FZJ_SANS_KWS2_AnySample_3 : 0.68
FZJ_SANS_KWS2_AnySample_4 : 0.59
Gallmeier_SNS_decoupled_poisoned : NO TEST
Granroth_SNS_decoupled_poisoned : NO COMPILE
HZB_FLEX : NO TEST
HZB_NEAT : 7.79
HZB_NEAT_2 : 1.22
Histogrammer : NO COMPILE
ILL_BRISP : NO COMPILE
ILL_D2B : 0.73
ILL_D4 : 0.64
ILL_H10_IN8 : 0.68
ILL_H113 : 0.67
ILL_H13_IN20 : 0.68
ILL_H142 : 0.73
ILL_H142_D33 : NO TEST
ILL_H142_IN12 : 0.69
ILL_H143_LADI : NO TEST
ILL_H15 : 0.67
ILL_H15_D11 : 0.68
ILL_H15_IN6 : NO COMPILE
ILL_H16 : 0.61
ILL_H16_IN5 : 4.22
ILL_H16_IN5_2 : 0.77
ILL_H16_IN5_Mantid : 4.15
ILL_H16_IN5_Mantid_2 : 4.19
ILL_H16_Mantid : NO TEST
ILL_H22 : 0.72
ILL_H22_D1A : 0.70
ILL_H22_D1B : 0.75
ILL_H22_VIVALDI : 0.79
ILL_H24 : 0.69
ILL_H25 : 0.65
ILL_H25_IN22 : 0.67
ILL_H5 : NO COMPILE
ILL_H512_D22 : NO COMPILE
ILL_H53 : 0.65
ILL_H53_D16 : NO TEST
ILL_H53_IN14 : 0.69
ILL_H8_IN1 : 0.60
ILL_IN13 : NO COMPILE
ILL_IN4 : NO COMPILE
ILL_IN5 : 0.73
ILL_IN5_Mantid : 4.09
ILL_IN6 : NO COMPILE
ILL_Lagrange : NO COMPILE
ISIS_CRISP : NO COMPILE
ISIS_GEM : 1.21
ISIS_HET : 1.19
ISIS_IMAT : NO TEST
ISIS_MERLIN : NO TEST
ISIS_OSIRIS : NO COMPILE
ISIS_Prisma2 : NO COMPILE
ISIS_SANS2d : NO TEST
ISIS_SANS2d_Mantid : NO COMPILE
ISIS_TS1_Brilliance : NO TEST
ISIS_TS2_Brilliance : NO TEST
ISIS_test : 1.14
LLB_6T2 : NO TEST
MCPL2hist : NO COMPILE
Mezei_SNS_decoupled_poisoned : NO COMPILE
PSI_DMC : 0.85
PSI_DMC_simple : 0.72
PSI_Focus : 0.84
PSI_source : 0.55
RITA-II : NO COMPILE
RTP_DIF : NO COMPILE
RTP_Laue : NO COMPILE
RTP_NeutronRadiography : NO COMPILE
RTP_SANS : NO COMPILE
Reflectometer : NO COMPILE
SAFARI_MPISI : NO TEST
SAFARI_PITSI : NO TEST
SNS_ARCS : NO COMPILE
SNS_BASIS : NO COMPILE
SNS_analytic_test : NO TEST
SNS_test : NO COMPILE
Samples_Incoherent : 0.65
Samples_Incoherent_2 : 0.58
Samples_Incoherent_3 : 0.63
Samples_Incoherent_4 : 0.57
Samples_Incoherent_5 : 0.63
Samples_Incoherent_6 : 0.58
Samples_Incoherent_7 : 0.64
Samples_Incoherent_8 : 0.58
Samples_Incoherent_9 : 0.63
Samples_Incoherent_10 : 0.57
Samples_Incoherent_off : 0.53
Samples_Isotropic_Sqw : 7.35
Samples_Phonon : NO COMPILE
Samples_vanadium : 0.58
TestSANS : NO COMPILE
Test_Collimator_Radial : 0.66
Test_Collimator_Radial_2 : 0.65
Test_Collimator_Radial_3 : 0.63
Test_FocalisationMirrors : 0.62
Test_Jump_Iterate : NO COMPILE
Test_Lens : NO COMPILE
Test_MCPL_input : NO COMPILE
Test_MCPL_output : NO COMPILE
Test_Monitor_Sqw : NO COMPILE
Test_Monitor_Sqw : NO COMPILE
Test_Monochromators : NO COMPILE
Test_Monochromators : NO COMPILE
Test_Monochromators : NO COMPILE
Test_Monochromators : NO COMPILE
Test_Monochromators : NO COMPILE
Test_PSD_Detector : 0.58
Test_Pol_Guide_Vmirror : NO COMPILE
Test_Pol_MSF : NO COMPILE
Test_Pol_Mirror : NO COMPILE
Test_Pol_SF_ideal : NO TEST
Test_Pol_Set : 0.59
Test_Pol_TripleAxis : NO COMPILE
Test_PowderN_Res : NO TEST
Test_PreMonitor_nD : NO COMPILE
Test_SSR_SSW : NO COMPILE
Test_SSR_SSW_Guide : NO COMPILE
Test_Sample_nxs_diffraction : NO COMPILE
Test_Sample_nxs_imaging : NO COMPILE
Test_Selectors : 0.57
Test_Selectors_2 : 0.68
Test_Sources : NO COMPILE
Test_Sources : NO COMPILE
Test_Sources : NO COMPILE
Test_Sources : NO COMPILE
Test_Sources : NO COMPILE
Test_Sources : NO COMPILE
Test_Sources : NO COMPILE
Test_Sources : NO COMPILE
Test_StatisticalChopper : 1.32
Tomography : 0.54
Union_IncoherentPhonon_test : NO COMPILE
Union_conditional_test : NO COMPILE
Union_conditional_test : NO COMPILE
Union_conditional_test : NO COMPILE
Union_demonstration : NO COMPILE
Union_demonstration_absorption_image : NO COMPILE
Union_external_component : NO COMPILE
Union_external_component_test : NO COMPILE
Union_geometry_test : NO COMPILE
Union_incoherent_validation : NO COMPILE
Union_laue_camera : NO COMPILE
Union_logger_test : NO COMPILE
Union_manual_example : NO COMPILE
Union_powder_validation : NO COMPILE
Union_sample_picture_replica : NO COMPILE
Union_single_crystal_validation : NO COMPILE
Union_tagging_demo : NO COMPILE
Union_test_absorption : NO COMPILE
Union_test_absorption_image : NO COMPILE
Union_test_box : NO COMPILE
Union_test_mask : NO COMPILE
Union_test_powder : NO COMPILE
Union_test_texture : NO COMPILE
Union_time_of_flight : NO COMPILE
Vin_test : NO COMPILE
Vout_test : NO TEST
h8_test_legacy : NO TEST
linup-1 : 0.56
linup-2 : 0.61
linup-3 : 0.56
linup-4 : 0.64
linup-5 : 0.56
linup-6 : 0.65
linup-7 : 0.56
micro : NO TEST
mini : 0.61
nano : NO TEST
template : NO TEST
templateDIFF : 0.69
templateLaue : 0.70
templateLaue_2 : 0.72
templateLaueGPU : NO COMPILE
templateLaueGPU_SXonly : NO COMPILE
templateLaueGPU_rngonly : NO COMPILE
templateNMX : NO TEST
templateNMX_TOF : NO TEST
templateSANS : 0.62
templateSANS_Mantid : 0.58
templateSasView : NO COMPILE
templateSasView_Mantid : NO COMPILE
templateTAS : 0.73
templateTAS_2 : 0.65
templateTOF : NO COMPILE
templateTOF : NO COMPILE
templateVanadiumMultipleScat_Mantid : NO COMPILE
template_simple : NO TEST
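To get counts out of a listing like the one above without eyeballing it, a small classifier can be enough. The sketch below assumes each result line has the form `name : value`, where the value is either `NO COMPILE`, `NO TEST`, or a number; the enum and function names are made up for illustration, and the meaning of the numeric column is not interpreted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical categories for one line of the test listing. */
enum result_kind {
    RESULT_VALUE,       /* a numeric value was reported */
    RESULT_NO_COMPILE,
    RESULT_NO_TEST,
    RESULT_UNKNOWN,     /* e.g. the "Running tests..." banner */
    RESULT_KINDS
};

/* Classify a single "name : value" line. */
enum result_kind classify_result(const char *line)
{
    assert(line != NULL);
    const char *sep = strstr(line, " : ");
    if (sep == NULL)
        return RESULT_UNKNOWN;
    const char *value = sep + 3;
    if (strncmp(value, "NO COMPILE", 10) == 0)
        return RESULT_NO_COMPILE;
    if (strncmp(value, "NO TEST", 7) == 0)
        return RESULT_NO_TEST;
    char *end = NULL;
    (void)strtod(value, &end);   /* only checks that a number can be parsed */
    return (end != value) ? RESULT_VALUE : RESULT_UNKNOWN;
}

/* Count how many of the n lines fall into each category. */
void tally_results(const char *const lines[], size_t n,
                   size_t counts[RESULT_KINDS])
{
    for (size_t k = 0; k < RESULT_KINDS; k++)
        counts[k] = 0;
    for (size_t i = 0; i < n; i++)
        counts[classify_result(lines[i])]++;
}
```

Feeding the lines of the listing into `tally_results` gives the number of instruments per category, which makes it easier to compare one run of the suite against another.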
VERY hidden expression in cexp.c - see https://github.com/McStasMcXtrace/McCode/commit/cabe455a75e4e5b90af2a28bf98ae2146e456bbf
Will now run a test to see how this works across the suite.
Further, we found that enabling debugging using -g may have weird consequences in terms of both binary size and runtime behaviour... :-s
Subfolder SPLIT has a semi-functional SPLIT provided as a hack, but it looks like it is implemented in a semi non-vectorised way. (Symptoms: intensity and stats are off the expected value due to "re-use" of the incoming particle.)
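For reference, one way that "re-use" of the incoming particle can skew the intensity: as far as I understand SPLIT, each incoming particle is repeated N times with its weight divided by N, so the summed weight is conserved. If the copies keep the full weight, the intensity is inflated by a factor N. The sketch below illustrates only that bookkeeping; the particle struct and the 50% transmission stage are hypothetical, and whether the hack in subfolder SPLIT actually has this particular problem is not established here.

```c
#include <assert.h>

/* Hypothetical minimal particle: only the statistical weight matters here. */
struct particle { double p; };

/* Hypothetical downstream section that transmits 50% of the weight. */
static double downstream(struct particle n)
{
    return 0.5 * n.p;
}

/* Re-use without reweighting: every copy carries the full incoming weight,
   so the detected intensity is inflated by a factor n_split. */
double split_reuse_naive(struct particle in, int n_split)
{
    assert(n_split > 0);
    double sum = 0.0;
    for (int i = 0; i < n_split; i++)
        sum += downstream(in);
    return sum;
}

/* Weight-conserving split: every copy carries p / n_split. */
double split_reweighted(struct particle in, int n_split)
{
    assert(n_split > 0);
    struct particle copy = in;
    copy.p = in.p / n_split;
    double sum = 0.0;
    for (int i = 0; i < n_split; i++)
        sum += downstream(copy);
    return sum;
}
```

With an incoming weight of 1 and 8 repetitions, the naive version sums to 4 while the reweighted version sums to 0.5, which is what a single unsplit particle would give through the same stage.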
We should likely instead