OP2 / PyOP2

Framework for performance-portable parallel computations on unstructured meshes
https://op2.github.io/PyOP2
Other
80 stars 35 forks source link

`PYOP2_NODE_LOCAL_COMPILATION=0` invokes compiler on non-root ranks #719

Closed angus-g closed 5 months ago

angus-g commented 6 months ago

It's my understanding that the default for the node_location_configuration is True, i.e. all ranks will invoke the compiler inside pyop2.compilation.Compiler.get_so(). However, when this setting is False, only rank 0 will invoke the compiler, and the assumption is that all other ranks are able to access the outputs from this. (As an aside, this doesn't seem to capture the scenario of running on only one rank of a physical node when running across nodes with MPI?)

For context, we're running Firedrake with a fairly large number of ranks, and there's a significant disk impact when all these ranks simultaneously invoke the compiler for version sniffing. As far as I can tell, there are two places this happens:

  1. Within pyop2.compilation.sniff_compiler() itself, as called from pyop2.compilation.load(). In this case, the non-compiling ranks can probably just use the AnonymousCompiler or Compiler base class: they just need to wait for the MPI barrier to signal the shared library is available, and load it. I think in this case, the defaults are such that Compiler.sniff_compiler_version() will simply run into an exception because no compiler is defined.
  2. If, e.g. pyop2.compilation.set_default_compiler(pyop2.compilation.LinuxGnuCompiler) is called to avoid the above call to sniff_compiler(), the compiler is still invoked on all ranks: Compiler.__init__() calls Compiler.sniff_compiler_version(), which will now have a valid executable.

As far as I can tell, Compiler.sniff_compiler_version() is only required for cflags patching for certain (very old) compiler versions. In order not to incur the wrath of our HPC sysadmins, I've resorted to the following preamble to patch out all the sniffing behaviour:

import pyop2.compilation

class C(pyop2.compilation.LinuxGnuCompiler):
    def sniff_compiler_version(self, *args): pass

    @property
    def bugfix_cflags(self): return ()

pyop2.compilation.set_default_compiler(C)

However I think the core issue is that, in my opinion, the compiler simply doesn't need to be run on non-compiling ranks.

connorjward commented 6 months ago

This makes sense, though I think from a style perspective that non-compiling ranks should not be creating Compiler instances at all. @JDBetteridge is the expert on this corner of PyOP2 so it would be good to get his feedback once he's back from holiday.

JDBetteridge commented 6 months ago

This should be a straightforward fix, thanks for pointing it out. The version sniffing is used for the repr and is incredibly useful for debugging.

You are right the Compiler shouldn't need to be instantiated on non-compiling ranks, the get_so being a method of the compiler is a legacy feature and is only invoked in one place anyway. I will work on a quick fix that broadcasts the version over the compilation comm rather than making a subprocess call on all ranks.

JDBetteridge commented 6 months ago

@angus-g does PR #720 fix this issue for you?

angus-g commented 5 months ago

Sorry for the huge delay @JDBetteridge! That PR does indeed do the job on our MPI runs.

JDBetteridge commented 5 months ago

Excellent, I'll try and get this merged in this afternoon :slightly_smiling_face: