Closed dongahn closed 4 years ago
@SteVwonder and @milroy: I added you as the reviewers for this.
Although this PR spans multiple files and many lines, a review should be relatively painlessly. This essentially replaces raw pointers to shared_ptr
objects: mostly drop-in replacements. But note that there are a few places like "destroy" logic for resource contexts within resource_match.cpp
and resource-query.cpp
, which would need some good review, though.
Others are style changes, log message changes and applying a bit better practice of using explicit namespace paths within std
.
Two Travis test instances failed. Let me look at them before your reviews.
qmanager: loading resource and qmanager modules works
expecting success:
jobid=$(flux job submit basic.json) &&
flux job wait-event -t 2 ${jobid} start &&
flux job wait-event -t 2 ${jobid} finish &&
flux job wait-event -t 2 ${jobid} release &&
flux job wait-event -t 2 ${jobid} clean
flux-job: flux_open: Connection reset by peer
not ok 4 - qmanager: basic job runs in simulated mode
expecting success:
jobid=$(flux jobspec srun -t 1 hostname | \
exec_test | flux job submit) &&
flux job wait-event -vt 2.5 ${jobid} start &&
flux job cancel ${jobid} &&
flux job wait-event -t 2.5 ${jobid} exception &&
flux job wait-event -t 2.5 ${jobid} finish | grep status=15 &&
flux job wait-event -t 2.5 ${jobid} release &&
flux job wait-event -t 2.5 ${jobid} clean &&
exec_eventlog $jobid | grep "complete" | grep "\"status\":15"
flux-job: flux_open: No such file or directory
not ok 5 - qmanager: canceling job during execution works
flux-job: flux_open: No such file or directory
This probably means the flux instance crashed...?
$ src/test/backtrace-all.sh
This doesn't report anything though...
Yeah, unfortunately it has been a long standing problem that Travis can't generate corefiles in docker images.
@grondo: ah...
Thankfully I just reproduced this on my docker using clang-6.0!
Whew, otherwise the approach is typically a series of printf
debugging via pushes to get a Travis build. Not fun!
Merging #537 into master will decrease coverage by
0.44%
. The diff coverage is68.93%
.
@@ Coverage Diff @@
## master #537 +/- ##
==========================================
- Coverage 76.03% 75.59% -0.45%
==========================================
Files 69 69
Lines 6481 6515 +34
==========================================
- Hits 4928 4925 -3
- Misses 1553 1590 +37
Impacted Files | Coverage Δ | |
---|---|---|
resource/policies/dfu_match_policy_factory.hpp | 100% <ø> (ø) |
:arrow_up: |
resource/readers/resource_reader_hwloc.hpp | 100% <ø> (ø) |
:arrow_up: |
resource/readers/resource_reader_grug.hpp | 100% <ø> (ø) |
:arrow_up: |
resource/writers/match_writers.hpp | 100% <ø> (ø) |
:arrow_up: |
resource/utilities/command.hpp | 100% <ø> (ø) |
:arrow_up: |
resource/libjobspec/jobspec.hpp | 100% <ø> (ø) |
:arrow_up: |
resource/store/resource_graph_store.hpp | 100% <ø> (ø) |
:arrow_up: |
resource/readers/resource_reader_jgf.hpp | 100% <ø> (ø) |
:arrow_up: |
resource/readers/resource_reader_base.hpp | 100% <ø> (ø) |
:arrow_up: |
resource/traversers/dfu_impl.hpp | 100% <ø> (ø) |
:arrow_up: |
... and 17 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 8342b7d...fd9207a. Read the comment docs.
Ugg... Docker hub service availability issue?
Building image bionic for user travis 2000 group=20000.50s
482Sending build context to Docker daemon 2.56kB
483Step 1/9 : FROM fluxrm/flux-core:bionic
484received unexpected HTTP status: 503 Service Unavailable
485docker-run-checks.sh: docker build failed
Ugg... Docker hub service availability issue?
Looks that way. Just try restarting the builder...
@grondo: OK. will restart.
FYI -- Seems the issue was that shared_ptr
C++ smart pointers don't directly translate to C-style void *
-based callbacks.
You can't directly pass a shared_ptr
object to void *
as if it is a raw pointer.
You can take the address of a shared_ptr
object and pass this raw pointer into void *
so that you can later retrieve and cast it back to share_ptr
on each callback invocation. But the caveat is, you have to make sure that this shared_ptr
object must be valid all the time. I was passing a temporary shared_ptr
object into flux_aux_set
which then led to an undefined behavior.
I moved flux_aux_set
from getctx
to mod_main
and passed its resource context shared_ptr
into this API. This should be ok as each object in mod_main
's stack frame should be valid as far as the resource module is active.
We'll see if Travis likes it this time.
This should be good to review now. Code coverage failures should be ignored. I glanced at them and the coverage slightly decreased because I add more lines in the exceptional conditions as part of code cleanup and use of C++ exceptions...
FYI -- Seems the issue was that shared_ptr C++ smart pointers don't directly translate to C-style void *-based callbacks.
@trws and I discussed this a bit after our meeting. This is a generic problem for C++ code interfacing with C callbacks in flux-core. I will create at a ticket so that @trws can propose something to solve this generally. The current code should serve its purpose in the meantime.
@trws and I discussed this a bit after our meeting. This is a generic problem for C++ code interfacing with C callbacks in flux-core. I will create at a ticket so that @trws can propose something to solve this generally. The current code should serve its purpose in the meantime.
@dongahn : at the end of #538, you mentioned that you would consider @trws' suggestion with a fresh set of eyes. Do you plan to add that suggestion to this PR? Or in a separate PR? Just wondering if I should review this as-is or wait for more updates.
@SteVwonder: Thank. I thought about this and it would be best to not to add shared pointer for the context piece. Because the changes will be substantial, it would be best to close this and do another PR.
@milroy: I will look at your review and address that in a new PR as well.
@SteVwonder: Thank. I thought about this and it would be best to not to add shared pointer for the context piece. Because the changes will be substantial, it would be best to close this and do another PR.
Unfortunately, backing this up will be too disruptive for this PR and my other PR (#543). My proposal is to update this PR with minor changes requested by @milroy and merge it in so that we can quickly get to #541
C/C++ compatibility will be handled in a more robust way with @trws' PR #542.
@milroy: I add two additional commits that address your review points. If you are okay with it and @SteVwonder has no other comments, this can go in after squashing these new commits. Let me know.
@milroy: I add two additional commits that address your review points. If you are okay with it and @SteVwonder has no other comments, this can go in after squashing these new commits. Let me know.
@dongahn I've looked over your responses and this PR is ready to merge as far as I'm concerned.
@dongahn I've looked over your responses and this PR is ready to merge as far as I'm concerned.
Sounds good. I will squash those two latest commits and I think this can go in.
I will squash those two latest commits and I think this can go in.
@milroy: those commits are squashed. Once Travis CI is happy, this can be merged.
Merging since @milroy and I have both reviewed and approved (annoyingly I cannot click the “approve” button on github mobile).
Thanks @dongahn!
Thank you!
I lied. Apparently I cannot merge a PR on mobile when there is a failing test (coveralls, which we said was ok).
If no one else beats me to it, I’ll click the button tomorrow.
Thanks @trws and @dongahn!
This PR adds smart pointer support using
std::shared_ptr
to:This PR also adds a few other cleanups:
std
namespace paths for all of the symbols in this namespace;resource
module to include the function name in the message, which contains that log message statement;