Closed by junghans 5 years ago
Merging #74 into master will not change coverage. The diff coverage is n/a.
@@ Coverage Diff @@
## master #74 +/- ##
======================================
Coverage 69.1% 69.1%
======================================
Files 26 26
Lines 1831 1831
======================================
Hits 1267 1267
Misses 564 564
Flag | Coverage Δ | |
---|---|---|
#clang | 84.7% <ø> (ø) | :arrow_up: |
#doxygen | 19.8% <ø> (ø) | :arrow_up: |
#gcc | 97.4% <ø> (ø) | :arrow_up: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4cf3902...ac48bcd.
Ready to merge. @dalg24, please review!
@jgalarowicz, it works, but sometimes the tests fail with:
24/30 Test #24: Core_tutorial_04 .................***Exception: Child aborted 1.61 sec
terminate called after throwing an instance of 'std::runtime_error'
what(): cudaDeviceSynchronize() error( cudaErrorDevicesUnavailable): all CUDA-capable devices are busy or unavailable /root/hpc-gitlab-runner/ecpcitest/ecp-copa/cabana/builds/users/junghans/2e8de492/1/ecpcitest/ecp-copa/cabana/kokkos/core/src/Cuda/Kokkos_Cuda_Impl.cpp:119
Traceback functionality not available
which means we should run the test as a batch job instead.
What `tags:` do we need to use to submit this to the queue?
@junghans I believe the tag to submit a batch job to run on the compute nodes is "batch".
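As a minimal sketch, this is roughly what that could look like in `.gitlab-ci.yml`. Only the `tags:` keyword and the "batch" value come from this thread; the job name and the script line are hypothetical placeholders, and the runner is assumed to be registered with the "batch" tag.

```yaml
# Sketch only: job name and script are assumptions, not taken from this PR.
stages:
  - test

test_cuda:              # hypothetical job name
  stage: test
  tags:
    - batch             # picks runners registered with the "batch" tag
  script:
    # placeholder; the real build and test commands are not shown in this thread
    - ctest --output-on-failure
```

If the "batch" runners do submit jobs to the compute nodes as described, this should avoid the `cudaErrorDevicesUnavailable` failures seen when the tests run outside the queue.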
@jgalarowicz, can you have a look at why the last stage (`test`) is failing?
@junghans Yes, I will take a look!
Thanks, there is just no error message, which confuses me!
@junghans For some reason I can't log in to ORNL. I opened a ticket. I will try this when I can log in again.
@junghans My account at ORNL has been disabled. I think because my INCITE PEAC allocation was not renewed. I'm asking ORNL representatives for a sponsor.
I totally forgot about this PR!
@jgalarowicz, it seems the permission issue is back: https://code.ornl.gov/ecpcitest/ecp-copa/cabana/pipelines/42620. Can you have a look?
@junghans It seems like this might be the problem where each of the tests needs to be a separate stage? I see that the code from 1ab6b95 was the initial try at this, but I don't see that code in the repository now. I remember you saying it wouldn't scale because of all the different variations that are required.
@junghans - I see the code now in the ci-cuda branch. So, maybe a different issue. Consulting with NMC - Paul and others.
What is the status of this PR? Are we still seeing issues?
@jgalarowicz is doing final tweaks!
@jgalarowicz I added a workaround for the serialization bug.
This works now: https://code.ornl.gov/ecpcitest/ecp-copa/cabana/pipelines/45674
@sslattery @dalg24 please review and merge.
@sslattery squashed and rebased.
Thanks to @jgalarowicz
Fix #67
To Do: