starpu-runtime / starpu

This is a mirror of https://gitlab.inria.fr/starpu/starpu where our development happens, but contributions are welcome here too!
https://starpu.gitlabpages.inria.fr/
GNU Lesser General Public License v2.1

dyld: Symbol not found: _starpu_mpi_world_rank #21

Closed barracuda156 closed 1 year ago

barracuda156 commented 1 year ago

I am trying to run the tests of a port which depends on starpu. However, that fails with a missing symbol:

dyld: Symbol not found: _starpu_mpi_world_rank
  Referenced from: /opt/local/lib/libstarpu-1.4.1.dylib
  Expected in: dynamic lookup

Here is what starpu itself links to:

10:~ svacchanda$ otool -L /opt/local/lib/libstarpu-1.4.dylib
/opt/local/lib/libstarpu-1.4.dylib:
    /opt/local/lib/libstarpu-1.4.1.dylib (compatibility version 2.0.0, current version 2.0.0)
    /opt/local/lib/libMacportsLegacySupport.dylib (compatibility version 1.0.0, current version 1.0.99)
    /opt/local/lib/libhwloc.15.dylib (compatibility version 22.0.0, current version 22.0.0)
    /opt/local/lib/libglpk.40.dylib (compatibility version 44.0.0, current version 44.1.0)
    /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib (compatibility version 1.0.0, current version 219.0.0)
    /opt/local/lib/libgcc/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.30.0)
    /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 751.63.0)
    /System/Library/Frameworks/IOKit.framework/Versions/A/IOKit (compatibility version 1.0.0, current version 275.0.0)
    /opt/local/lib/libgcc/libgcc_s.1.1.dylib (compatibility version 1.0.0, current version 1.1.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.2.11)

We do not build it with MPICH due to errors with it, see: https://github.com/starpu-runtime/starpu/issues/16#issuecomment-1506496995

Here is the portfile we use at the moment: https://github.com/macports/macports-ports/blob/master/devel/starpu/Portfile

sthibaul commented 1 year ago

dyld: Symbol not found: _starpu_mpi_world_rank

Could you send the output of make V=1 so we can make sure it's doing it as we expect it to?

We do not build it with MPICH due to errors with it, see: https://github.com/starpu-runtime/starpu/issues/16#issuecomment-1506496995

Isn't this already fixed in the 1.4 branch?

barracuda156 commented 1 year ago

@sthibaul Let me run the build (and the tests as well), I will update soon. I will try without MPICH as well as with MPICH (which failed the last time I tried, when I updated starpu in MacPorts).

UPD. So, tests are broken for the same reason:

dyld: Symbol not found: _starpu_mpi_world_rank
  Referenced from: /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_devel_starpu/starpu/work/starpu-9563a47472940f4be9f199ffba10d40ef327cb44/src/.libs/libstarpu-1.4.1.dylib
  Expected in: dynamic lookup

FAIL fault-tolerance/retry (exit status: 133)

Here is the build log: starpu_1.4_gcc12_10.6.8.log

Same story on a PowerMac, so it is neither a broken env on a specific machine nor a Rosetta issue.

barracuda156 commented 1 year ago

@sthibaul Please help with this: 1.3 was working reasonably well, but 1.4 is broken for us :(

barracuda156 commented 1 year ago

Let me try building with MPICH explicitly.

sthibaul commented 1 year ago

Here is the build log: starpu_1.4_gcc12_10.6.8.log

Ok, so the -Wl,-U -Wl,_starpu_mpi_world_rank option really is passed, but apparently it is not having the expected effect on macOS (allowing the symbol to be undefined). Do you happen to know what option we can pass to properly allow an undefined symbol on macOS? Here we use a weak reference to detect whether libstarpumpi is loaded or not. We added that -U option precisely to allow this, but apparently in your case that's not working?
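For readers unfamiliar with the pattern, here is a minimal sketch of detecting an optional library through a weak reference. It is not StarPU's actual code: `optional_mpi_world_rank`, `mpi_layer_loaded` and `world_rank_or` are hypothetical names used only for illustration, and the sketch assumes GCC or Clang on an ELF platform such as Linux, where an unresolved weak reference links fine and evaluates to a null address. On Darwin the same source is what runs into the linking problem discussed in this thread.

```c
#include <stddef.h>

/* Hypothetical stand-in for starpu_mpi_world_rank(). The weak attribute
 * allows the reference to stay unresolved; its address is then NULL. */
extern int optional_mpi_world_rank(void) __attribute__((weak));

/* Returns 1 when some loaded library defines the symbol, 0 otherwise. */
int mpi_layer_loaded(void)
{
    return &optional_mpi_world_rank != NULL;
}

/* Returns the rank when the optional layer is present, else the fallback. */
int world_rank_or(int fallback)
{
    if (!mpi_layer_loaded())
        return fallback;
    return optional_mpi_world_rank();
}
```

When nothing defines the symbol, `world_rank_or(-1)` never calls through the null pointer and simply returns the fallback.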

barracuda156 commented 1 year ago

@sthibaul Here is the build with MPICH. It actually works (in the sense of reaching completion), but it fails with the same missing-symbol error. starpu_1.4_mpich_10.6.8.log

barracuda156 commented 1 year ago

Here is the build log: starpu_1.4_gcc12_10.6.8.log

Ok, so the -Wl,-U -Wl,_starpu_mpi_world_rank option really is passed, but apparently it is not having the expected effect on macOS (allowing the symbol to be undefined). Do you happen to know what option we can pass to properly allow an undefined symbol on macOS? Here we use a weak reference to detect whether libstarpumpi is loaded or not. We added that -U option precisely to allow this, but apparently in your case that's not working?

@sthibaul On macOS the correct flag is -undefined dynamic_lookup. I do not think -Wl,-U is supported (I am not sure, but de facto it does not work anyway). P.S. There is no need to specify the symbol with it.

UPD. Let me try it first.

barracuda156 commented 1 year ago

No, that does not work; the build then breaks with the following:

In file included from ../include/starpu.h:49,
                 from ../include/starpu_sched_component.h:22,
                 from sched_policies/component_worker.c:19:
../include/starpu_thread.h:338:9: error: unknown type name 'pthread_barrier_t'
  338 | typedef pthread_barrier_t starpu_pthread_barrier_t;
      |         ^~~~~~~~~~~~~~~~~
../include/starpu_thread.h:339:9: error: unknown type name 'pthread_barrierattr_t'
  339 | typedef pthread_barrierattr_t starpu_pthread_barrierattr_t;
      |         ^~~~~~~~~~~~~~~~~~~~~
../include/starpu_thread.h:378:9: error: unknown type name 'pthread_spinlock_t'
  378 | typedef pthread_spinlock_t starpu_pthread_spinlock_t;
      |         ^~~~~~~~~~~~~~~~~~
In file included from ../include/starpu.h:49,
                 from ./core/jobs.h:24,
                 from sched_policies/component_sched.c:18:
../include/starpu_thread.h:338:9: error: unknown type name 'pthread_barrier_t'
  338 | typedef pthread_barrier_t starpu_pthread_barrier_t;
      |         ^~~~~~~~~~~~~~~~~
../include/starpu_thread.h:339:9: error: unknown type name 'pthread_barrierattr_t'
  339 | typedef pthread_barrierattr_t starpu_pthread_barrierattr_t;
      |         ^~~~~~~~~~~~~~~~~~~~~
../include/starpu_thread.h:378:9: error: unknown type name 'pthread_spinlock_t'
  378 | typedef pthread_spinlock_t starpu_pthread_spinlock_t;
      |         ^~~~~~~~~~~~~~~~~~
In file included from ../include/starpu.h:49,
                 from ../include/starpu_sched_component.h:22,
                 from sched_policies/component_prio.c:17:
../include/starpu_thread.h:338:9: error: unknown type name 'pthread_barrier_t'
  338 | typedef pthread_barrier_t starpu_pthread_barrier_t;
      |         ^~~~~~~~~~~~~~~~~
../include/starpu_thread.h:339:9: error: unknown type name 'pthread_barrierattr_t'
  339 | typedef pthread_barrierattr_t starpu_pthread_barrierattr_t;
      |         ^~~~~~~~~~~~~~~~~~~~~
../include/starpu_thread.h:378:9: error: unknown type name 'pthread_spinlock_t'
  378 | typedef pthread_spinlock_t starpu_pthread_spinlock_t;
      |         ^~~~~~~~~~~~~~~~~~
In file included from ../include/starpu.h:49,
                 from ../include/schedulers/starpu_scheduler_toolbox.h:21,
                 from sched_policies/prio_deque.c:18:
../include/starpu_thread.h:338:9: error: unknown type name 'pthread_barrier_t'
  338 | typedef pthread_barrier_t starpu_pthread_barrier_t;
      |         ^~~~~~~~~~~~~~~~~
../include/starpu_thread.h:339:9: error: unknown type name 'pthread_barrierattr_t'
  339 | typedef pthread_barrierattr_t starpu_pthread_barrierattr_t;
      |         ^~~~~~~~~~~~~~~~~~~~~
../include/starpu_thread.h:378:9: error: unknown type name 'pthread_spinlock_t'
  378 | typedef pthread_spinlock_t starpu_pthread_spinlock_t;
      |         ^~~~~~~~~~~~~~~~~~
In file included from ../include/starpu.h:49,
                 from ../include/starpu_sched_component.h:22,
                 from sched_policies/component_fifo.c:18:
../include/starpu_thread.h:338:9: error: unknown type name 'pthread_barrier_t'
  338 | typedef pthread_barrier_t starpu_pthread_barrier_t;
      |         ^~~~~~~~~~~~~~~~~
../include/starpu_thread.h:339:9: error: unknown type name 'pthread_barrierattr_t'
  339 | typedef pthread_barrierattr_t starpu_pthread_barrierattr_t;
      |         ^~~~~~~~~~~~~~~~~~~~~
../include/starpu_thread.h:378:9: error: unknown type name 'pthread_spinlock_t'
  378 | typedef pthread_spinlock_t starpu_pthread_spinlock_t;
      |         ^~~~~~~~~~~~~~~~~~
In file included from ../include/starpu.h:49,
                 from ../include/starpu_sched_component.h:22,
                 from sched_policies/component_eager.c:17:
../include/starpu_thread.h:338:9: error: unknown type name 'pthread_barrier_t'
  338 | typedef pthread_barrier_t starpu_pthread_barrier_t;
      |         ^~~~~~~~~~~~~~~~~
../include/starpu_thread.h:339:9: error: unknown type name 'pthread_barrierattr_t'
  339 | typedef pthread_barrierattr_t starpu_pthread_barrierattr_t;
      |         ^~~~~~~~~~~~~~~~~~~~~
../include/starpu_thread.h:378:9: error: unknown type name 'pthread_spinlock_t'
  378 | typedef pthread_spinlock_t starpu_pthread_spinlock_t;
      |         ^~~~~~~~~~~~~~~~~~

Need some other solution, it seems.

sthibaul commented 1 year ago

that does not work

What do you mean? What did you try exactly?

sthibaul commented 1 year ago

It seems that Darwin doesn't actually support weak references, it only supports weak imports, and -undefined dynamic_lookup won't help. The previous -U option indeed doesn't work any more on Darwin. I have thus pushed a change that simply disables the corresponding code; it should become available on GitHub within a day.
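A rough sketch of what disabling such a probe on Darwin could look like follows. This is an assumption about the general shape of a fix of this kind, not the actual StarPU commit; `mpi_probe_enabled`, `mpi_layer_loaded` and `optional_mpi_world_rank` are hypothetical names.

```c
#include <stddef.h>

#if defined(__APPLE__)
/* Darwin: skip the weak-reference probe and always report "not loaded". */
int mpi_probe_enabled(void) { return 0; }
int mpi_layer_loaded(void)  { return 0; }
#else
/* Elsewhere: keep the weak reference; its address is NULL when undefined. */
extern int optional_mpi_world_rank(void) __attribute__((weak));
int mpi_probe_enabled(void) { return 1; }
int mpi_layer_loaded(void)  { return &optional_mpi_world_rank != NULL; }
#endif
```

The trade-off is that on Darwin the core library can no longer tell whether the optional layer is present, so any behaviour that depended on the probe is simply skipped there.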

barracuda156 commented 1 year ago

It seems that Darwin doesn't actually support weak references, it only supports weak imports, and -undefined dynamic_lookup won't help. The previous -U option indeed doesn't work any more on Darwin. I have thus pushed a change that simply disables the corresponding code; it should become available on GitHub within a day.

Thank you. Yeah, I tried replacing those -U flags in Makefile.am with -undefined dynamic_lookup and with -flat_namespace -undefined suppress, and also adding them alongside the -U flags. Nothing worked.

barracuda156 commented 1 year ago

that should become available on github within a day.

@sthibaul I have made a patch and built with it. Everything works now. We only have two failures: https://github.com/starpu-runtime/starpu/issues/4#issuecomment-1548766341 They do not look like real failures but rather like an unsupported function, I guess?

Warning: could not get current CPU binding: Function not implemented