-
Alex and Victoria have shown that there are performance benefits to be had by using asynchronous kernel launches, even if the kernels themselves have to be run in order. This is also borne out by thi…
-
### Issue Type
Bug
### Have you reproduced the bug with TF nightly?
No
### Source
source
### Tensorflow Version
2.12.0
### Custom Code
Yes
### OS Platform and Distributi…
-
We have some code from a manual GPU port using OpenACC and are trying to replicate its performance using PSyclone.
If the code is formatted using PSyclone before the port then it is simple to take …
-
After setting the CUDA_VISIBLE_DEVICES environment variable (#22), I was able to launch CM1 on multiple nodes of Gust. Unfortunately, it subsequently dies with the following error before the timestep is…
-
- https://github.com/mrnorman/miniWeather
- https://github.com/UoB-HPC/openmp-tutorial
- a direct N-body simulation case from Intel or Nvidia
- https://github.com/csc-training/openacc/blob/master/c…
-
If we use NMODL and NVHPC to translate and compile mechanisms for GPU execution, the resulting `special-core` crashes when it is run without `--gpu` on a machine that has no GPU.
The expected be…
-
Dear Devito team,
I am trying to run the tutorial notebook `/devito/examples/seismic/tutorials/03_fwi.ipynb` on my machine. It runs normally when using the CPU, but I would like to use the GPU. I t…
-
This is a compilation of the outstanding issues and unimplemented features in the OpenACC enter data and update directives implementation in #310 and #1554.
1) Distinguishing between array accesses…
-
I followed the instructions here for building Flang:
`https://github.com/flang-compiler/flang/wiki/Building-Flang`
in particular the OpenMP 4.5 version with the NVIDIA backend. However, if I compile a …
-
see #70