High performance Amplified Spontaneous Emission on GPU
HASEonGPU is a scientific project. If you present and/or publish scientific results that were obtained with HASEonGPU, you should cite it as a reference.
HASEonGPU is licensed under the GPLv3+. Please refer to our LICENSE.md.
Software:
Optional:
Hardware:
git clone https://github.com/computationalradiationphysics/haseongpu.git
mkdir haseongpu/build
cd haseongpu/build
cmake ..
make
A small example of a Phi ASE calculation with a pumped crystal. The simulation can be started as follows:
matlab laserPumpCladdingExample
./bin/calcPhiASE --input-path=./input/cylindrical --min-rays=10000
[phiASE, MSE, nRays] = calcPhiASE(
points,
trianglePointIndices,
betaCells,
betaVolume,
claddingCellTypes,
claddingNumber,
claddingAbsorption,
useReflections,
refractiveIndices,
reflectivities,
triangleNormalsX,
triangleNormalsY,
triangleNeighbors,
triangleSurfaces,
triangleCenterX,
triangleCenterY,
triangleNormalPoint,
forbiddenEdge,
minRaysPerSample,
maxRaysPerSample,
mseThreshold,
repetitions,
nTot,
thickness,
laserParameter,
crystal,
numberOfLevels,
deviceMode,
parallelMode,
maxGPUs,
nPerNode
);
The returned values are represented as two-dimensional matrices in which columns are slice indices (levels) and rows are point indices. The value for the ith point and jth slice can then be obtained in MATLAB with:
value = values(i,j);
In the following, all arguments of the MATLAB call are described. Each entry starts with a header giving the datatype (an array when in brackets []), the size of the array, and the set of numbers its values belong to.
points [float], in {0, ..., numberOfPoints}, size = numberOfPoints
The coordinates of the triangle vertices. All x coordinates followed by all
of the y coordinates of the triangle vertices
structure: [x_1, x_2, ... x_n, y_1, y_2, ... y_n] (n == numberOfPoints)
trianglePointIndices [int], in {0, ..., numberOfPoints}, size = numberOfTriangles * 3
Contains the indices to access the "points" datastructure
(each triangle has 3 points as vertices). Each entry is an
index from 0 to numberOfPoints, corresponding to the positions
of a vertex in "points".
Structure as follows:
[ triangle1A, triangle2A, ... triangleNA, triangle1B, triangle2B, ... triangleNB, triangle1C, ... ]
i.e. for triangles with vertices A,B,C there are all the indices
of the A-vertices, followed by all the B and C vertices.
betaCells [float]
Stimulus in the sample points.
betaVolume [float], size = numberOfTriangles * (numberOfLevels - 1)
Stimulus in the volume (prisms).
Beta values for all prisms, ordered according to the prismIDs:
prismID = triangleID + layer * numberOfTriangles.
Therefore, all beta values for a layer are grouped together.
claddingCellTypes [int], size = numberOfTriangles
Sets cladding index for triangles {0,1,2,...}
claddingNumber unsigned, size = 1
Set which cladding to use
claddingAbsorption float, size = 1
Absorption coefficient of cladding
useReflections bool, size = 1
Switch to activate reflections.
refractiveIndices [float], size = 4
Describes the refractive indices of the active
gain medium top and bottom planes.
It is structured as follows:
{bottomInside, bottomOutside, topInside, topOutside}
bottomInside = topInside (because it is the same medium)
reflectivities [float], in [0, 1], size = 2 * numberOfTriangles
Defines the reflectivities of prism planes.
First the reflectivities of the bottom plane and then the reflectivities
of the top plane. Both are ordered by triangleID.
triangleNormalsX [float], size = numberOfTriangles * 3
The x coordinate of the normal vectors for each triangle edge.
It is ordered as follows:
[ triangle1_edge0, triangle2_edge0, ... triangleN_edge0, triangle1_edge1, triangle2_edge1, ... ]
i.e. all first edges of each triangle, followed by all second edges of each triangle and so on.
triangleNormalsY [float], size = numberOfTriangles * 3
The y coordinate of the normal vectors for each triangle edge.
It is ordered as follows:
[ triangle1_edge0, triangle2_edge0, ... triangleN_edge0, triangle1_edge1, triangle2_edge1, ... ]
i.e. all first edges of each triangle, followed by all second edges of each triangle and so on.
triangleNeighbors [int], in {-1,0,1,2,4}, size = 5
Describes the neighbor relation of triangles to each other.
Each entry corresponds to a triangleID (see "trianglePointIndices") which
is adjacent to the current triangle and edge.
Structure is similar to "forbiddenEdge":
[ triangle1_edge0, triangle2_edge0, ... triangleN_edge0, triangle1_edge1, triangle2_edge1, ... ]
triangleSurfaces [float], size = numberOfTriangles
The sizes of the surfaces of each triangle, ordered by the triangleID.
triangleCenterX [float], size = numberOfTriangles
The x coordinates of the center points for each triangle
ordered by TriangleID.
triangleCenterY [float], size = numberOfTriangles
The y coordinates of the center points for each triangle
ordered by TriangleID.
triangleNormalPoint [unsigned], in {0, ..., numberOfPoints}, size = numberOfTriangles * 3
Contains indices to the points where the
triangle normal vectors start. For each triangle, 3 points (3 edges)
are stored in this list. Indices point to locations in "points"
(i.e. normal vectors start at triangle vertices!).
Structure is VERY similar to trianglePointIndices:
[ triangle1_p0, triangle2_p0, ... triangleN_p0, triangle1_p1, triangle2_p1, ... ]
forbiddenEdge [int], in {-1,0,1,2,4}, size = 5
Describes the relation of edge indices of adjacent triangles.
-1 means that there is no adjacent triangle to that edge;
0,1,2 describes the index of the edge as seen from the ADJACENT triangle.
Order of data is similar to triangleNormalsX and triangleNormalsY:
[ triangle1_edge0, triangle2_edge0, ... triangleN_edge0, triangle1_edge1, triangle2_edge1, ... ]
i.e. all first edges of each triangle, followed by all second edges of each triangle and so on.
minRaysPerSample unsigned, size = 1
Minimal number of rays for adaptive sampling
maxRaysPerSample unsigned, size = 1
Maximal number of rays for adaptive sampling
mseThreshold float, size = 1
Sets the maximal MSE of the ASE value.
If a sample point does not reach this MSE threshold, the number
of rays per sample point is increased up to maxRaysPerSample, or
the point is resampled with repetitive sampling.
repetitions unsigned, size = 1
Sets the maximum number of repetitions when the
mseThreshold was not reached.
nTot float, size = 1
Doping of the active gain medium
thickness float, size = 1
Thickness of one prism level of the mesh.
laserParameter [float]
A structure holding the laser parameters (cross sections sigma, wavelengths lambda).
s_ems corresponds to l_ems and s_abs to l_abs:
struct(s_abs, VALUES, s_ems, VALUES, l_abs, VALUES, l_ems, VALUES)
crystal [float]
Is a structure for the crystal parameters
crystal.tfluo describes the crystalFluorescence of the active gain medium.
numberOfLevels unsigned, size = 1
Total number of levels of the mesh. Thus the total thickness
of the mesh is thickness * numberOfLevels!
parallelMode
deviceMode
maxGPUs unsigned, size = 1
Maximal number of GPUs for the threaded case
nPerNode
Number of devices per MPI node
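The orderings described above for points, trianglePointIndices and betaVolume can be illustrated with a small sketch. It is written in Python rather than MATLAB, uses a made-up two-triangle mesh, and all helper names (pack_points, pack_triangle_point_indices, vertex_of, prism_id) are hypothetical. It only mirrors the layouts stated in the argument descriptions and is not part of HASEonGPU.

```python
# Hypothetical toy mesh: 4 points, 2 triangles (not taken from the shipped examples).
xs = [0.0, 1.0, 1.0, 0.0]
ys = [0.0, 0.0, 1.0, 1.0]
triangles = [(0, 1, 2), (0, 2, 3)]  # (A, B, C) vertex indices, 0-based


def pack_points(xs, ys):
    """All x coordinates first, followed by all y coordinates."""
    return list(xs) + list(ys)


def pack_triangle_point_indices(triangles):
    """All A indices, then all B indices, then all C indices."""
    return [tri[v] for v in range(3) for tri in triangles]


def vertex_of(packed_points, packed_indices, triangle_id, vertex):
    """Look up the (x, y) position of vertex 0/1/2 (A/B/C) of a triangle."""
    n_points = len(packed_points) // 2
    n_triangles = len(packed_indices) // 3
    p = packed_indices[vertex * n_triangles + triangle_id]
    return packed_points[p], packed_points[n_points + p]


def prism_id(triangle_id, layer, number_of_triangles):
    """prismID = triangleID + layer * numberOfTriangles."""
    return triangle_id + layer * number_of_triangles


points = pack_points(xs, ys)
triangle_point_indices = pack_triangle_point_indices(triangles)

# Made-up beta values for 2 triangles * 2 layers, grouped layer by layer.
beta_volume = [0.1, 0.2, 0.3, 0.4]
beta = beta_volume[prism_id(1, 1, len(triangles))]
```

Because all values of one layer are stored contiguously, the beta value of triangle 1 in layer 1 is found at prism index 3 in this toy setup.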
Command:
./bin/calcPhiASE [OPTIONS]
Options:
--input-path
Path to the experiment location. This folder contains several .txt files, usually generated by a MATLAB script. These .txt files contain all the experiment data needed to run one experiment.
--output-path
Path to a writable location. It is used to write input and output for the MATLAB script.
--parallel-mode=[|threaded|mpi]
Defines the method of parallelization to start the
simulation with. Mode "threaded" uses pthreads on a single
node. Mode "mpi" is a parallel MPI
implementation for clusters. Note that this parameter
is currently only available when using --device-mode=gpu.
--device-mode=[cpu|gpu]
Defines on which hardware the simulation will run.
Mode "cpu" is the original
algorithm based on a single CPU core.
Mode "gpu" uses NVIDIA CUDA GPUs and can be parallelized either
with pthreads or MPI.
--min-rays=
Sets the minimum number of rays per sample-point in the
crystal structure.
--max-rays=
Sets the maximal number of rays per sample-point. The number
of rays per sample-point will vary between the minimum and
maximum number of rays depending on the MSE threshold
(see --mse-threshold).
--ngpus=
Sets the number of GPUs to use. "mpi" parallel-mode should set this
to 1 and "threaded" to the maximum number
of GPUs on the node. If you don't set it, it will
be set to the maximum automatically.
--min-sample-i=
Index of the first sample point (normally 0).
--max-sample-i=
Index of the last sample point (numberOfSample - 1).
--verbosity=
Add the following for different verbosity levels:
0 : quiet
1 : error
2 : warning
4 : info
8 : statistics
16 : debug
32 : progress-bar
The verbosity level is interpreted as a bitmask and
can be composed of different levels (e.g. 3 = error + warning).
--reflection
Use reflection on upper and lower plane of gain
medium. Maximal number of reflections will be
calculated
--mse-threshold=
Algorithm tries to stay under this threshold
by adaptive and repetitive sampling.
--repetitions=
Number of repetitions that will be done
when the mse-threshold was not met.
--spectral-resolution=
Resolution of the absorption and emission spectrum to which the input spectrum will be linearly interpolated. Interpolation is used to distribute the spectrum values equidistantly over the wavelength. Omitting this option or setting a too small resolution will set the lambda resolution to the maximum number of absorption or emission values.
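To make the idea behind --spectral-resolution more concrete, the following Python sketch resamples a spectrum onto equidistant wavelengths by linear interpolation. It is an illustration only: the function name resample_linear and the sample values are made up, and the actual implementation in src/interpolation.cu may differ in details such as boundary handling.

```python
def resample_linear(wavelengths, values, resolution):
    """Linearly interpolate (wavelengths, values) onto `resolution`
    equidistant wavelengths between the first and the last input wavelength.

    Assumes `wavelengths` is strictly increasing and has at least 2 entries.
    """
    if resolution < 2 or len(wavelengths) < 2:
        raise ValueError("need at least two input points and two output points")
    lo, hi = wavelengths[0], wavelengths[-1]
    step = (hi - lo) / (resolution - 1)
    resampled = []
    j = 0  # index of the left end of the current input interval
    for i in range(resolution):
        x = lo + i * step
        # Advance to the input interval [wavelengths[j], wavelengths[j + 1]] containing x.
        while j < len(wavelengths) - 2 and wavelengths[j + 1] < x:
            j += 1
        x0, x1 = wavelengths[j], wavelengths[j + 1]
        t = (x - x0) / (x1 - x0)
        resampled.append(values[j] + t * (values[j + 1] - values[j]))
    return resampled


# Made-up, non-equidistant input spectrum (wavelengths in nm, arbitrary values).
sigma = resample_linear([900.0, 910.0, 940.0], [1.0, 2.0, 5.0], 5)
```

The input samples at 900, 910 and 940 are not equidistant; the result holds five values at 900, 910, 920, 930 and 940.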
4 GPUs, 10K to 100K Rays, 4 Repetitions
./bin/calcPhiASE --input-path=/input/ \
  --output-path=/tmp/ \
  --parallel-mode=threaded \
  --min-rays=10000 \
  --max-rays=100000 \
  --reflection \
  --repetitions=4 \
  --ngpus=4 \
  --min-sample-i=0 \
  --max-sample-i=1234 \
  --mse-threshold=0.05
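How --min-rays, --max-rays, --mse-threshold and --repetitions in the command above might interact can be sketched in Python. This is an assumption-laden illustration, not the HASEonGPU code: the function adaptive_sample, the growth factor of 10 and the order "repeat first, then increase the ray count" are all hypothetical, and toy_estimate stands in for the real ray simulation.

```python
def adaptive_sample(estimate, min_rays, max_rays, mse_threshold, repetitions, growth=10):
    """Return (value, mse, rays) for one sample point.

    `estimate(rays)` must return (value, mse). The growth factor and the
    loop order are assumptions made for this sketch.
    """
    if repetitions < 1:
        raise ValueError("repetitions must be at least 1")
    rays = min_rays
    best = None
    while True:
        # Repetitive sampling: retry with the same number of rays.
        for _ in range(repetitions):
            value, mse = estimate(rays)
            if best is None or mse < best[1]:
                best = (value, mse, rays)
            if mse <= mse_threshold:
                return best
        # Adaptive sampling: use more rays, but never more than max_rays.
        if rays >= max_rays:
            return best
        rays = min(rays * growth, max_rays)


calls = []


def toy_estimate(rays):
    """Deterministic stand-in for the simulation: MSE shrinks as 1 / rays."""
    calls.append(rays)
    return 1.0, 1.0 / rays


result = adaptive_sample(toy_estimate, 10000, 100000, 0.00005, 4)
```

With these toy numbers the threshold is missed four times at 10000 rays and then met on the first attempt with 100000 rays.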
MPI with 4 GPUs per node
mpiexec -npernode 4 ./bin/calcPhiASE --input-path=/input/ \
  --output-path=/tmp/ \
  --parallel-mode=mpi \
  --min-rays=10000 \
  --max-rays=100000 \
  --reflection \
  --repetitions=4 \
  --ngpus=1 \
  --min-sample-i=0 \
  --max-sample-i=1234 \
  --mse-threshold=0.05
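Command lines like the ones above can also be assembled programmatically, which is a convenient place to compose the --verbosity bitmask from named levels. The following Python sketch only builds the argument list and does not start the program. The helper names verbosity_mask and build_command are hypothetical; the option names and bit values are the ones listed in the options section.

```python
# Bit values as listed for --verbosity.
VERBOSITY_BITS = {
    "error": 1,
    "warning": 2,
    "info": 4,
    "statistics": 8,
    "debug": 16,
    "progress-bar": 32,
}


def verbosity_mask(*levels):
    """Combine named verbosity levels into one bitmask value."""
    mask = 0
    for level in levels:
        mask |= VERBOSITY_BITS[level]
    return mask


def build_command(binary, options, flags=()):
    """Build an argument list: --name=value for options, --name for flags."""
    argv = [binary]
    for name, value in options.items():
        argv.append("--%s=%s" % (name, value))
    for name in flags:
        argv.append("--" + name)
    return argv


command = build_command(
    "./bin/calcPhiASE",
    {
        "input-path": "/input/",
        "output-path": "/tmp/",
        "parallel-mode": "threaded",
        "min-rays": 10000,
        "max-rays": 100000,
        "ngpus": 4,
        "verbosity": verbosity_mask("error", "warning", "progress-bar"),
    },
    flags=["reflection"],
)
```

The resulting list could be passed to a process launcher of your choice; here errors, warnings and the progress bar are enabled with a single value.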
CMakeLists.txt
cmake file to generate a Makefile
src/
folder containing all the source code that is not a header
src/map_rays_to_prisms.cu
CUDA code to generate a schedule of which ray will be launched from which prism of the gain medium
src/calcPhiASE.m
MATLAB adapter script
src/logging.cu
creates nicely readable output based on log-levels
src/importance_sampling.cu
CUDA parallelized importance sampling
src/geometry.cu
basic 3D geometry calculations
src/interpolation.cu
interpolation functions for wavelengths of polychromatic laser pulses
src/reflection.cu
CUDA functions to calculate reflections inside the gain medium
src/progressbar.cu
progressbar for the command line
src/parser.cu
parsing of command line arguments and textfiles
src/for_loops_clad.cu
old CPU code for ASE calculation
src/calc_phi_ase_mpi.cc
MPI workload distribution. Code for Master and Slaves
src/write_to_file.cu
writing formatted data to a file
src/ray_histogram.cu
print a histogram of the adaptive ray count to command line
src/calc_phi_ase_threaded.cu
pthreads workload distribution
src/mt19937ar.cu
CPU code for Mersenne Twister PRNG used by for_loops_clad.cu
src/write_to_vtk.cu
generate VTK-files from simulation results (deprecated)
src/propagate_ray.cu
CUDA code to propagate a single ray through the prism mesh structure
src/mesh.cu
class that holds the information and all parameters about the gain medium mesh
src/cuda_utils.cu
utility functions (getFreeDevice)
src/calc_sample_gain_sum.cu
CUDA code to calculate all the rays for a single sample point
src/calc_phi_ase.cu
CUDA code to calculate ASE for all the sample points
src/main.cu
main entry file
src/write_matlab_output.cu
generate MATLAB-readable matrixes from the simulation data
example/
folder that contains executable examples
example/c_example/
folder that contains the command-line example
example/c_example/input/
folder that contains input for 2 different experiments
example/c_example/input/cylindrical/
folder that contains data for the cylindrical gain medium. For details on the files, see detailed information above (Input argument description)
example/c_example/input/cuboid/
example input with a cuboid gain medium. Contents similar to the cylindrical example.
example/c_example/output/
folder to gather the output
example/matlab_example/
folder that contains the input data for the MATLAB example
example/matlab_example/lambda_e.txt
emission wavelengths
example/matlab_example/sigma_e.txt
emission cross section
example/matlab_example/pt.mat
sampling points and Delaunay triangles of the gain medium
example/matlab_example/set_variables.m
generate information about the mesh
example/matlab_example/vtk_wedge.m
generate a VTK file from the mesh
example/matlab_example/laserPumpCladdingExample.m
experimental setup. Run this file to see the progress of a whole experiment
example/matlab_example/sve.mat
example/matlab_example/sigma_a.txt
absorption cross section
example/matlab_example/gain.m
calculate gain distribution inside the gain medium
example/matlab_example/beta_int3.m
utility function to calculate gain distribution
example/matlab_example/extract_gain_map.m
calculate the gain for the sample point used in the actual measurement
example/matlab_example/beta_int.m
utility function to calculate gain distribution
example/matlab_example/lambda_a.txt
absorption wavelengths
include/
folder containing all the header source code
include/calc_phi_ase_mpi.hpp
header for calc_phi_ase_mpi.cu
include/mesh.hpp
header for mesh.cu
include/importance_sampling.hpp
header for importance_sampling.cu
include/ray_histogram.hpp
header for ray_histogram.cu
include/for_loops_clad.hpp
header for for_loops_clad.cu
include/mt19937ar.hpp
header for mt19937ar.cu
include/calc_phi_ase.hpp
header for calc_phi_ase.cu
include/write_matlab_output.hpp
header for write_matlab_output.cu
include/cuda_utils.hpp
header for cuda_utils.cu
include/logging.hpp
header for logging.cu
include/cudachecks.hpp
Macros to check the success state of CUDA calls
include/reflection.hpp
header for reflection.cu
include/parser.hpp
header for parser.cu
include/map_rays_to_prisms.hpp
header for map_rays_to_prisms.cu
include/calc_phi_ase_threaded.hpp
header for calc_phi_ase_threaded.cu
include/thrust_device_vector_nowarn.hpp
wrapper to switch off compiler warning that is produced by 3rd party library (CUDA Thrust)
include/propagate_ray.hpp
header for propagate_ray.cu
include/thrust_host_vector_nowarn.hpp
wrapper to switch off compiler warning that is produced by 3rd party library (CUDA Thrust)
include/calc_sample_gain_sum.hpp
header for calc_sample_gain_sum.cu
include/interpolation.hpp
header for interpolation.cu
include/version.hpp
version information for HASEonGPU
include/geometry.hpp
header for geometry.cu
include/write_to_file.hpp
header for write_to_file.cu
include/types.hpp
type definitions for HASEonGPU
include/progressbar.hpp
header for progressbar.cu
include/nan_fix.hpp
wrapper to allow usage of isnan() in a template
include/write_to_vtk.hpp
header for write_to_vtk.cu
LICENSE.md
additional licensing information
README.md
this README file
REFERENCE.md
Referencing information
COPYING
Full License information
utils/
folder that contains utility files
utils/cmake/
utils/cmake/modules/
3rd Party CMAKE module that was modified to circumvent a bug where the NVCC linker would crash on unknown arguments