Open arsonwong opened 8 months ago
Sorry for my lack of input on this. For the parallelisation, what do you think would be the best way to incorporate it? I guess there are now three options: no parallelisation (i.e. just the old implementation), GPU parallelisation, or CPU parallelisation. On top of that there is the detailed vs. non-detailed mode. The user should be able to choose which combination they want to use, but it would be better not to have three different tmm_core_vec files, since most of the content is the same anyway.
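One way to keep everything in a single file is one entry point that takes the backend and the detailed flag as arguments, with the shared maths in one kernel. The following is only a rough sketch of that idea: `run`, `_kernel`, `_merge`, the `backend` strings and `n_workers` are all hypothetical names, and the kernel is a toy stand-in, not the actual coh_tmm calculation.

```python
import math
from concurrent.futures import ThreadPoolExecutor

BACKENDS = ("serial", "cpu", "gpu")  # hypothetical option names


def _kernel(points, detailed):
    # Toy stand-in for the shared maths; NOT the real coh_tmm calculation.
    result = {"R": [math.cos(x) ** 2 for x in points]}
    if detailed:
        # Extra outputs are only computed when the caller asks for them.
        result["T"] = [1.0 - r for r in result["R"]]
    return result


def _merge(chunks):
    # Concatenate per-chunk results key by key, preserving input order.
    merged = {}
    for chunk in chunks:
        for key, values in chunk.items():
            merged.setdefault(key, []).extend(values)
    return merged


def run(points, backend="serial", detailed=True, n_workers=4):
    """Single entry point: same kernel, different execution strategy."""
    if backend not in BACKENDS:
        raise ValueError(f"backend must be one of {BACKENDS}, got {backend!r}")
    if backend == "serial":
        return _kernel(points, detailed)
    if backend == "cpu":
        # Split the wavelength/angle points into roughly equal chunks.
        size = max(1, math.ceil(len(points) / n_workers))
        chunks = [points[i:i + size] for i in range(0, len(points), size)] or [points]
        # Threads keep this sketch runnable anywhere; for real CPU-bound work
        # a process pool (or a compiled/array-based kernel) would be needed.
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            return _merge(pool.map(lambda c: _kernel(c, detailed), chunks))
    # A GPU backend would call the same kernel with a GPU array library in
    # place of the CPU one; it is left out of this sketch.
    raise NotImplementedError("gpu backend is not part of this sketch")
```

With this layout the user picks the mode per call, for example `run(points, backend="cpu", detailed=False)`, and the three options share one code path instead of three near-identical files.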
s and p polarization, 10000 wavelength x angles, 6 layers, calculate coh_tmm:

before speed increase: 4.178s
after speed increase: 2.399s
after speed increase, non-detailed mode: 1.948s

CPU parallelization with 24 cores
CPU parallelization: 0.844s
CPU parallelization, non-detailed mode: 0.719s

GPU parallelization with NVIDIA GeForce RTX 4060
GPU parallelization: 0.296s
GPU parallelization, non-detailed mode: 0.118s
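To put these timings in relative terms, a few lines of Python give the speedup of each variant over the original 4.178s run. The dictionary keys are just shorthand labels for the rows above; the numbers are copied from them unchanged.

```python
# Timings in seconds, copied from the benchmark above.
baseline = 4.178  # before speed increase

timings = {
    "after speed increase": 2.399,
    "after speed increase, non-detailed": 1.948,
    "CPU (24 cores)": 0.844,
    "CPU (24 cores), non-detailed": 0.719,
    "GPU (RTX 4060)": 0.296,
    "GPU (RTX 4060), non-detailed": 0.118,
}

# Speedup factor relative to the original implementation.
speedups = {name: baseline / t for name, t in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
```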