libMesh / libmesh

libMesh github repository
http://libmesh.github.io
GNU Lesser General Public License v2.1

Compile Issue #2791

Open daversun opened 3 years ago

daversun commented 3 years ago

error: no matching function for call to ‘libMesh::Parallel::Communicator::broadcast(std::string&) const’

I get this with both gcc 6.2 and gcc 9.3 on CentOS 7; the kernel is 3.10.0-957.el7.x86_64.

daversun commented 3 years ago

/contrib/timpi/src/parallel/include/timpi/parallel_communicator_specializations:142:88: error: no type named ‘type’ in ‘struct std::enable_if<false, int>’

jwpeterson commented 3 years ago

@roystgnr does this look like some kind of SFINAE issue to you? We definitely (claim to) support both of those compiler versions, but we haven't seen this error before, so I'm wondering if there's a difference with the std library that's on centos7 vs. the (mostly Ubuntu) machines where we do our testing...

jwpeterson commented 3 years ago

Also this line of code in parallel_communicator_specializations seems to be related to the Has_buffer_type helper class introduced by @lindsayad in libMesh/timpi@10d8f3cf, so maybe he can further comment on the error?

roystgnr commented 3 years ago

It's not an SFINAE problem, but I can't tell what's wrong without more information. The specialization at parallel_communicator_specializations:142 isn't supposed to match, but that shouldn't be important. If TIMPI detects MPI, the specialization at parallel_communicator_specializations:383 should match. If not, then the general declaration in communicator.C

...

oh, boy, maybe this is the problem? We don't have a general declaration anymore!?! Not even the no-op that we used to use when MPI wasn't detected.

It looks like @lindsayad might have broken MPI-less Communicator::broadcast(string) in https://github.com/libMesh/TIMPI/pull/34

There's no StandardType, so his first overload doesn't apply. And there's no Packing, so his second overload doesn't apply. Our tests for TIMPI alone don't try to broadcast a string, so they didn't notice the general case has been removed. And our tests (and production work) with libMesh all enable MPI, so they use the specialization on 383. Except ... I could have sworn we had --disable-mpi testing in one of our CI tests. Do we not? I'll try a build myself now, regardless.
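To make the failure mode above concrete, here is a minimal sketch of how two `enable_if`-guarded overloads can leave `std::string` with no candidate at all once the general declaration is gone. This is not TIMPI's actual code: `HasStandardType`, `HasPacking`, `BrokenComm`, `FixedComm`, and `can_broadcast` are hypothetical stand-ins invented for illustration, and the real traits and signatures differ.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for the "has a fixed MPI datatype" and
// "has a packing specialization" traits; NOT the real TIMPI definitions.
template <typename T> struct HasStandardType : std::is_arithmetic<T> {};
template <typename T> struct HasPacking : std::false_type {};

// A communicator with only the two constrained overloads and no general case.
struct BrokenComm {
  // Overload 1: participates only for types with a fixed datatype.
  template <typename T,
            typename std::enable_if<HasStandardType<T>::value, int>::type = 0>
  void broadcast(T &) const {}

  // Overload 2: participates only for types that can be packed.
  template <typename T,
            typename std::enable_if<!HasStandardType<T>::value &&
                                        HasPacking<T>::value,
                                    int>::type = 0>
  void broadcast(T &) const {}
};

// One possible repair (an assumption, not the actual fix): restore a
// serial no-op overload for std::string alongside the constrained ones.
struct FixedComm : BrokenComm {
  using BrokenComm::broadcast;
  void broadcast(std::string &) const {}
};

// Detection idiom: true when `comm.broadcast(value)` is a valid call.
template <typename...> struct make_void { using type = void; };

template <typename Comm, typename T, typename = void>
struct can_broadcast : std::false_type {};

template <typename Comm, typename T>
struct can_broadcast<
    Comm, T,
    typename make_void<decltype(std::declval<const Comm &>().broadcast(
        std::declval<T &>()))>::type> : std::true_type {};

// int has a "standard type", so overload 1 matches.
static_assert(can_broadcast<BrokenComm, int>::value, "int should work");
// std::string satisfies neither constraint, so no overload is viable.
static_assert(!can_broadcast<BrokenComm, std::string>::value,
              "string has no candidate without a fallback");
// With the fallback restored, the call resolves again.
static_assert(can_broadcast<FixedComm, std::string>::value,
              "fallback restores string broadcast");
```

Calling `BrokenComm().broadcast(s)` directly on a `std::string s` would be rejected with a "no matching function" error, with the compiler listing each discarded overload and its failed `enable_if`, which is consistent with the two messages reported at the top of this issue.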

roystgnr commented 3 years ago

Yeah, I see the exact same bug. @daversun, for now your best workaround is probably to install and compile against an MPI implementation. There's almost no overhead to it even if you're actually running on only one processor. This is quite an embarrassing bug but it might take a while for us to get a fix together; I'm already running behind on other work this week and it's only Tuesday...

lindsayad commented 3 years ago

I have one thing I have to finish today, but I could look into it right after that is done

On Nov 10, 2020, at 7:52 AM, roystgnr notifications@github.com wrote:

 Yeah, I see the exact same bug. @daversun, for now your best workaround is probably to install and compile against an MPI implementation. There's almost no overhead to it even if you're actually running on only one processor. This is quite an embarrassing bug but it might take a while for us to get a fix together; I'm already running behind on other work this week and it's only Tuesday...


jwpeterson commented 3 years ago

Looks like libMesh/TIMPI#34 first appeared in TIMPI v1.3, so it post-dates the libmesh-1.6 release, which uses TIMPI v1.2.1. So another possibility for @daversun would be to use the 1.6.0 release instead of master.