Open GuillaumeMercier opened 3 days ago
More information: qvi_pthread_group::rank() throws an out_of_range exception, indicating either that the element is not present in the map or that we're out of bounds.
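For reference, here is a minimal sketch of the two ways to look up a key that may be missing from a std::map. The tid_rank_map alias and the lookup_rank and at_throws helpers are hypothetical names used only for illustration; they are not part of the library. The sketch only shows that at() throws std::out_of_range for an absent key, while find() lets the caller detect the miss without an exception.

```cpp
#include <map>
#include <stdexcept>

// Hypothetical stand-in for a tid -> rank table; not the library's type.
using tid_rank_map = std::map<int, int>;

// Returns the rank registered for tid, or -1 if tid was never inserted.
int lookup_rank(const tid_rank_map &table, int tid)
{
    const auto it = table.find(tid);
    return it == table.end() ? -1 : it->second;
}

// Performs the same lookup through at() and reports whether it threw.
bool at_throws(const tid_rank_map &table, int tid)
{
    try {
        (void)table.at(tid);
        return false;
    }
    catch (const std::out_of_range &) {
        return true;
    }
}
```

If a thread's tid was never inserted into the table, at_throws returns true and lookup_rank returns -1, which is consistent with a rank of -1 being printed when the lookup fails.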
My code does this:
std::lock_guard<std::mutex> guard(m_mutex);
fprintf(stdout, "[%i]=================== Querying thread rank: ", qvi_gettid());
int rank = -1;
try {
    rank = m_tid2rank.at(qvi_gettid());
}
catch (const std::out_of_range &ex) {
    fflush(stdout);
    std::cout << "1) out_of_range::what(): " << ex.what() << '\n';
}
fprintf(stdout, " %i\n", rank);
And the program outputs this:
[1979082]=================== Querying thread rank: [1979080]=================== Querying thread rank: [1979083]=================== Querying thread rank: [1979081]=================== Querying thread rank: 1) out_of_range::what(): 1) out_of_range::what(): map::atmap::at1) out_of_range::what(): map::at
1) out_of_range::what(): map::at
-1
-1
-1
-1
[1979083] thread_subscope sgrank is -1
[1979083] thread_subscope sgsize is 2
[1979081] thread_subscope sgrank is -1
[1979081] thread_subscope sgsize is 2
[1979082] thread_subscope sgrank is -1
[1979082] thread_subscope sgsize is 2
[1979080] thread_subscope sgrank is -1
[1979080] thread_subscope sgsize is 2
The interesting thing is the sgrank value.
Remember that I'm splitting a scope of size 4 into 2.
But unless I'm mistaken, the splitting operation should apply to the resources, not to the group itself.
Therefore the size should remain 4, not 2.
So I'm inclined to think that the split semantics implemented for pthreads are not correct.
@samuelkgutierrez: what is your take on this?
Interesting. When we split a scope, we split both the group and the parent resources. The interesting thing here is that during the split, the rank values on each side should be 0 and 1.
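To make the expected numbers concrete, here is a rough sketch of a color-based group split. It is not the library's implementation: the subgroup_info struct and the split_by_color function are hypothetical, and the sketch assumes that members sharing a color form one subgroup and are ranked in parent-rank order.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical result type: the size of a member's subgroup and its rank in it.
struct subgroup_info {
    int size;
    int rank;
};

// Splits a group by color. colors[i] is the color chosen by parent rank i.
// Members that share a color form one subgroup, ranked in parent-rank order.
std::vector<subgroup_info> split_by_color(const std::vector<int> &colors)
{
    std::map<int, int> members_per_color;
    std::vector<subgroup_info> result(colors.size());
    // First pass: hand out ranks 0, 1, ... within each color.
    for (std::size_t i = 0; i < colors.size(); ++i) {
        result[i].rank = members_per_color[colors[i]]++;
    }
    // Second pass: every member learns the final size of its subgroup.
    for (std::size_t i = 0; i < colors.size(); ++i) {
        result[i].size = members_per_color[colors[i]];
    }
    return result;
}
```

Under these assumptions, four members choosing colors 0, 0, 1, 1 end up in two subgroups of size 2, with ranks 0 and 1 on each side.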
So, there are two issues:
1- qv_pthread_scope_split subscope management is not correct (sgsize should be 2 and not 4)
2- qv_scope_split does the right thing for the size but the rank management is not correct (in the case of splitting scopes created by a call to qv_pthread_scope_split(_at))
For point 1: qv_pthread_scope_split doesn't seem to call qvi_pthread_group::split. Shouldn't it?
const uint_t group_size = k;
// Split the hardware, get the hardware pools.
qvi_hwpool **hwpools = nullptr;
int rc = qvi_hwsplit::thread_split(
    this, npieces, kcolors, k, maybe_obj_type, &hwpools
);
if (rc != QV_SUCCESS) return rc;
// Split off from our parent group. This call is called from a context in
// which a process is splitting its resources across threads, so create a
// new thread group for each child.
qvi_group *thgroup = nullptr;
rc = m_group->thsplit(group_size, &thgroup);
if (rc != QV_SUCCESS) return rc;
I would expect something like rc = qvi_pthread_group::split(...) instead of rc = m_group->thsplit(group_size, &thgroup);. Here the group is created with a size corresponding to the total number of threads, but it's not the size of the subgroups (as computed by qvi_pthread_group::m_subgroup_info).
This is my understanding, but I would double check this (and please correct me if I'm wrong).
qv_pthread_scope_split is called in the context of splitting off of a process and splitting resources among the threads that are spawned. The threads can then call qv_scope_split, which should call qvi_pthread_group::split. This raises a good point: are the names used here too confusing?
Ok, I'm definitely lost here. We need to discuss this at the next meeting. And yes, if you're right, that's very confusing on several levels.
In the test-pthread-split.c test program, the function called by qv_pthread_create calls qv_scope_split to split the thread scope into two parts and then frees the obtained subscope immediately:
Adding some calls to output information about the subscope seems to fail:
I've got either this output:
Or a plain segfault:
This needs investigation.