oneapi-src / oneTBB

oneAPI Threading Building Blocks (oneTBB)
https://oneapi-src.github.io/oneTBB/
Apache License 2.0
5.57k stars 1.01k forks source link

Fixing onetbb build for MacOS X PPC: almost ready for PR, can someone take a look? #819

Open barracuda156 opened 2 years ago

barracuda156 commented 2 years ago

mac32-tbbmalloc-export.def has been removed at some point, apparently, and due to that building on PowerPC fails. Earlier version of tbb had that file. Is it possible to restore it?

vossmjp commented 2 years ago

oneTBB is neither officially nor community supported on mac32 any longer. If you want to restore that file, test it on your own system and submit a PR, you can do that. But we do not intend to provide ongoing support.

barracuda156 commented 2 years ago

oneTBB is neither officially nor community supported on mac32 any longer. If you want to restore that file, test it on your own system and submit a PR, you can do that. But we do not intend to provide ongoing support.

@vossmjp Thank you. I will try to do that.

barracuda156 commented 2 years ago

@vossmjp UPD. I dug into the code, and fixing PPC appears far less easy than just restoring two mac32 def files. For one thing, new version of onetbb does not account for endianness, while the older did. Also, the new version lacks machine-specific includes altogether (not just mac-ppc.h). So ppc stuff has to be somehow brought into the current code structure. I will try to deal with that in a while, and request to keep the ticket open. But it won’t be a quick patch, unfortunately.

vossmjp commented 2 years ago

Ok, no problem. The mac32 support was dropped during a major refactoring that went on between TBB and oneTBB. So yes, lets keep the issue open while you investigate further.

barracuda156 commented 2 years ago

@vossmjp Besides, do you know how opetbb got rid of all machine-specific headers? I see only one generic header in onetbb source: include/opeapi/tbb/detail/_machine.h. Earlier tbb in addition to the generic header had a lot of machine-specific ones here: include/tbb/machine.

vossmjp commented 2 years ago

@barracuda156 TBB was first released before even C++11. The refactoring done when moving to oneTBB including using some modern C++ features, such as std::atomic. Much of the machine-specific code in early versions of TBB was assembly code for atomic operations. As the project moved to modern C++ features, some of the machine-specific code was simply replaced by more portable, but still performant, C++ code. This reduced the number of machine-specific files and lines of code.

barracuda156 commented 2 years ago

@vossmjp Thank you! So those won't have to be restored even for PPC arch, given that we got C++11-supporting compilers on it (gcc10, gcc11), right?

barracuda156 commented 2 years ago

@vossmjp Sorry for a delay, been busy with other stuff.

I have built onetbb on 10.6 PPC now. Took awhile until I figured out that all atomics errors are solved by simply passing -DCMAKE_CXX_FLAGS="-std=c++17 -latomic" flag.

Aside of that, few minor patches had to be added. Lists of symbols are effectively identical, from what I can see, so my initial concern was unfounded. I had to delete two for Mac32 though:

--- src/tbb/def/mac32-tbb.def.orig  2021-12-17 21:40:54.000000000 +0800
+++ src/tbb/def/mac32-tbb.def   2022-05-04 21:35:18.000000000 +0800
@@ -58,7 +58,6 @@
 __ZTIN3tbb6detail2r110user_abortE
 __ZTVN3tbb6detail2r110user_abortE
 __ZTIN3tbb6detail2r111unsafe_waitE
-__ZTVN3tbb6detail2r111unsafe_waitE

 # RTM Mutex (rtm_mutex.cpp)
 __ZN3tbb6detail2r17acquireERNS0_2d19rtm_mutexERNS3_11scoped_lockEb
@@ -142,7 +141,6 @@

 # Concurrent bounded queue (concurrent_bounded_queue.cpp)
 __ZN3tbb6detail2r126allocate_bounded_queue_repEm
-__ZN3tbb6detail2r126wait_bounded_queue_monitorEPNS1_18concurrent_monitorEmlRNS0_2d113delegate_baseE
 __ZN3tbb6detail2r128abort_bounded_queue_monitorsEPNS1_18concurrent_monitorE
 __ZN3tbb6detail2r128deallocate_bounded_queue_repEPhm
 __ZN3tbb6detail2r128notify_bounded_queue_monitorEPNS1_18concurrent_monitorEmm

Dunno how crucial is that. (It is a bit surprising that original defs contain few duplicate symbols.)

barracuda156 commented 2 years ago

@vossmjp Here is my PR in Macports: https://github.com/macports/macports-ports/pull/14760/

If you or someone could comment on these two issues, it will be greatly appreciated. The rest should be uncontroversial, and then I can open a PR here.

  1. These assertions fail on 10.6, I had to patch them out:

    --- include/oneapi/tbb/task_group.h.orig    2021-12-17 21:40:54.000000000 +0800
    +++ include/oneapi/tbb/task_group.h    2022-05-04 15:25:15.000000000 +0800
    @@ -202,9 +202,9 @@
          bool reserved3          : 1;
          bool reserved4          : 1;
      } my_traits;
    -
    +#if __MAC_OS_X_VERSION_MAX_ALLOWED >= 1070
      static_assert(sizeof(context_traits) == 1, "Traits shall fit into one byte.");
    -
    +#endif
      static constexpr std::uint8_t may_have_children = 1;
      //! The context internal state (currently only may_have_children).
      std::atomic<std::uint8_t> my_state;
    @@ -430,9 +430,9 @@
      friend struct r1::task_group_context_impl;
      friend class task_group_base;
    }; // class task_group_context
    -
    +#if __MAC_OS_X_VERSION_MAX_ALLOWED >= 1070
    static_assert(sizeof(task_group_context) == 128, "Wrong size of task_group_context");
    -
    +#endif
    enum task_group_status {
      not_complete,
      complete,
  2. This patch was required for an older version of tbb already for 10.5.8 and 10.6:

    --- src/tbbmalloc_proxy/proxy_overload_osx.h.orig   2021-12-17 21:40:54.000000000 +0800
    +++ src/tbbmalloc_proxy/proxy_overload_osx.h   2022-05-04 14:41:59.000000000 +0800
    @@ -134,11 +134,12 @@
          introspect.force_lock = &zone_force_lock;
          introspect.force_unlock = &zone_force_unlock;
          introspect.statistics = zone_statistics;
    +#if __MAC_OS_X_VERSION_MAX_ALLOWED >= 1070
          introspect.zone_locked = &zone_locked;
          introspect.enable_discharge_checking = &impl_zone_enable_discharge_checking;
          introspect.disable_discharge_checking = &impl_zone_disable_discharge_checking;
          introspect.discharge = &impl_zone_discharge;
    -
    +#endif
          zone.size = &impl_malloc_usable_size;
          zone.malloc = &impl_malloc;
          zone.calloc = &impl_calloc;
    @@ -150,9 +151,10 @@
          zone.introspect = &introspect;
          zone.version = 8;
          zone.memalign = impl_memalign;
    +#if __MAC_OS_X_VERSION_MAX_ALLOWED >= 1070
          zone.free_definite_size = &impl_free_definite_size;
          zone.pressure_relief = &impl_pressure_relief;
    -
    +#endif
          // make sure that default purgeable zone is initialized
          malloc_default_purgeable_zone();
          void* ptr = malloc(1);
vossmjp commented 2 years ago

Yes we will take a look at it.

barracuda156 commented 2 years ago

Yes we will take a look at it.

@vossmjp Many thanks!

By the way, does it actually need 2017 C standard or 2011 suffices? Macports has 2011 in the portfile and it does indeed build with -latomic without 2017 addition, however I saw a note during the build that some inline atomics is supported only in 2017 C. This is an important detail since C standard has to be set for the port as such and not for a particular arch.

barracuda156 commented 2 years ago

UPD. While it builds on 10.6 PPC (10A190) and 10.6.8 Rosetta, the build fails on 10.5.8 with:

:info:build cd /opt/local/var/macports/build/_opt_PPCLeopardPorts_devel_onetbb/onetbb/work/build-ppc/src/tbb && /opt/local/bin/g++-mp-11 -D__TBB_BUILD -D__TBB_USE_ITT_NOTIFY -I/opt/local/var/macports/build/_opt_PPCLeopardPorts_devel_onetbb/onetbb/work/oneTBB-2021.5.0/src/tbb/../../include -latomic -arch ppc -mmacosx-version-min=10.5 -fPIC -flifetime-dse=1 -Wall -Wextra -Wfatal-errors -D_XOPEN_SOURCE -flto -std=c++11 -MD -MT src/tbb/CMakeFiles/tbb.dir/parallel_pipeline.cpp.o -MF CMakeFiles/tbb.dir/parallel_pipeline.cpp.o.d -o CMakeFiles/tbb.dir/parallel_pipeline.cpp.o -c /opt/local/var/macports/build/_opt_PPCLeopardPorts_devel_onetbb/onetbb/work/oneTBB-2021.5.0/src/tbb/parallel_pipeline.cpp
:info:build In file included from /opt/local/var/macports/build/_opt_PPCLeopardPorts_devel_onetbb/onetbb/work/oneTBB-2021.5.0/src/tbbmalloc_proxy/proxy.cpp:142:
:info:build /opt/local/var/macports/build/_opt_PPCLeopardPorts_devel_onetbb/onetbb/work/oneTBB-2021.5.0/src/tbbmalloc_proxy/proxy_overload_osx.h: In constructor 'DoMallocReplacement::DoMallocReplacement()':
:info:build /opt/local/var/macports/build/_opt_PPCLeopardPorts_devel_onetbb/onetbb/work/oneTBB-2021.5.0/src/tbbmalloc_proxy/proxy_overload_osx.h:153:14: error: 'malloc_zone_t' {aka 'struct _malloc_zone_t'} has no member named 'memalign'
:info:build   153 |         zone.memalign = impl_memalign;
:info:build       |              ^~~~~~~~
:info:build compilation terminated due to -Wfatal-errors.
:info:build make[2]: *** [src/tbbmalloc_proxy/CMakeFiles/tbbmalloc_proxy.dir/proxy.cpp.o] Error 1
:info:build make[2]: Leaving directory `/opt/local/var/macports/build/_opt_PPCLeopardPorts_devel_onetbb/onetbb/work/build-ppc'
:info:build make[1]: *** [src/tbbmalloc_proxy/CMakeFiles/tbbmalloc_proxy.dir/all] Error 2
:info:build make[1]: *** Waiting for unfinished jobs....

I will look into that.

barracuda156 commented 2 years ago

UPD2. Well, with this change to one of the patches onetbb builds on Leopard:

--- src/tbbmalloc_proxy/proxy_overload_osx.h.orig   2021-12-17 21:40:54.000000000 +0800
+++ src/tbbmalloc_proxy/proxy_overload_osx.h    2022-05-07 13:49:36.000000000 +0800
@@ -134,11 +134,12 @@
         introspect.force_lock = &zone_force_lock;
         introspect.force_unlock = &zone_force_unlock;
         introspect.statistics = zone_statistics;
+#if __MAC_OS_X_VERSION_MAX_ALLOWED >= 1070
         introspect.zone_locked = &zone_locked;
         introspect.enable_discharge_checking = &impl_zone_enable_discharge_checking;
         introspect.disable_discharge_checking = &impl_zone_disable_discharge_checking;
         introspect.discharge = &impl_zone_discharge;
-
+#endif
         zone.size = &impl_malloc_usable_size;
         zone.malloc = &impl_malloc;
         zone.calloc = &impl_calloc;
@@ -149,12 +150,17 @@
         zone.zone_name = "tbbmalloc";
         zone.introspect = &introspect;
         zone.version = 8;
+#if __MAC_OS_X_VERSION_MAX_ALLOWED >= 1060
         zone.memalign = impl_memalign;
+#endif
+#if __MAC_OS_X_VERSION_MAX_ALLOWED >= 1070
         zone.free_definite_size = &impl_free_definite_size;
         zone.pressure_relief = &impl_pressure_relief;
-
+#endif
         // make sure that default purgeable zone is initialized
+#if __MAC_OS_X_VERSION_MAX_ALLOWED >= 1060
         malloc_default_purgeable_zone();
+#endif
         void* ptr = malloc(1);
         // get all registered memory zones
         unsigned zcount = 0;

ppc64 likewise complained about undefined symbols, so an identical patch to definitions has been added. Then,

The following ports are currently installed:
  onetbb @2021.5.0_1+universal (active) requested_variants='+universal' platform='darwin 9' archs='ppc ppc64' date='2022-05-07T13:54:52+0800'
barracuda156 commented 2 years ago

@vossmjp An updated PR on Macports: https://github.com/macports/macports-ports/pull/14780 (I closed the original one due to messing up git rebase)

barracuda156 commented 2 years ago

@vossmjp Just in case, I understand you and others may not have hardware and/or time to do actual testing for PPC, so I only expect either “good to go” verdict (and then I can make PR here to onetbb) or “this particular thing won’t work correctly”. I can do all testing on my end.

barracuda156 commented 1 year ago

oneTBB is neither officially nor community supported on mac32 any longer. If you want to restore that file, test it on your own system and submit a PR, you can do that. But we do not intend to provide ongoing support.

@vossmjp Could you or someone take a look please?

https://github.com/oneapi-src/oneTBB/pull/840

nofuturre commented 1 month ago

@barracuda156 is this issue still relevant?

barracuda156 commented 1 month ago

@barracuda156 is this issue still relevant?

@nofuturre Yes, it is desirable to merge fixes for powerpc and 32-bit macOS. We have it working.