Closed: jipolanco closed this pull request 2 years ago
Merging #40 (30474ed) into master (aaa806b) will increase coverage by 0.02%. The diff coverage is 100.00%.
```diff
@@            Coverage Diff             @@
##           master      #40      +/-   ##
==========================================
+ Coverage   97.15%   97.17%   +0.02%
==========================================
  Files          17       18       +1
  Lines         983     1026      +43
==========================================
+ Hits          955      997      +42
- Misses         28       29       +1
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/Transpositions/Transpositions.jl | 98.09% <100.00%> (+0.30%) | :arrow_up: |
| src/gather.jl | 100.00% <100.00%> (ø) | |
| src/random.jl | 100.00% <100.00%> (ø) | |
| src/arrays.jl | 95.14% <0.00%> (-0.98%) | :arrow_down: |
Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Powered by Codecov. Last update aaa806b...30474ed.
For GPU arrays, transpositions and other operations are now performed completely on the GPU (as far as I can tell...), avoiding slow scalar indexing.
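As a rough illustration of the general idea (this is a sketch, not code from this PR, and the function names are hypothetical), the difference between scalar indexing and whole-array operations on GPU arrays looks like this:

```julia
# Illustrative sketch only; names are hypothetical.
# Indexing individual elements of a GPU array from the host ("scalar indexing")
# forces a device round-trip per element and is very slow, while whole-array
# operations dispatch to a single GPU kernel.

# Slow pattern: element-by-element transposition via scalar indexing.
function transpose_scalar!(dst::AbstractMatrix, src::AbstractMatrix)
    for j in axes(src, 2), i in axes(src, 1)
        dst[j, i] = src[i, j]
    end
    return dst
end

# Fast pattern: a single whole-array operation, which GPU array backends
# implement as a kernel, avoiding host-side scalar indexing entirely.
transpose_whole!(dst::AbstractMatrix, src::AbstractMatrix) =
    permutedims!(dst, src, (2, 1))
```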
Well, for now this has just been tested with the reference implementation of GPUArrays.jl (`JLArray`), which runs on the CPU. It would be nice to test things with `CuArray`s. For that, one just needs to add `CuArray` to the list of array types tested in `test/array_types.jl`. @corentin-dev let me know if you can try that out.

For now I have no idea how the transposition of GPU arrays actually performs, and it would be nice to have some benchmarks. There are still some things that can be improved. In particular, when using dimension permutations (enabled by default in PencilFFTs), there are some additional allocations that should be taken care of.
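Adding `CuArray` to the tested types might look roughly like the sketch below. The variable name and structure are assumptions for illustration, not the actual contents of `test/array_types.jl`:

```julia
# Hypothetical sketch of test/array_types.jl; names are assumptions.
using GPUArrays  # JLArray is GPUArrays.jl's CPU-based reference implementation

# Array types that the test suite iterates over.
const TESTED_ARRAY_TYPES = Any[Array, GPUArrays.JLArray]

# To also exercise the CUDA path, one could append CuArray when a GPU
# is actually available on the test machine:
# using CUDA
# CUDA.functional() && push!(TESTED_ARRAY_TYPES, CUDA.CuArray)
```

The conditional `push!` keeps the test suite runnable on machines without a GPU.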
This PR closes #21 (but can be reopened if stuff is missing).