Open syoyo opened 4 years ago
Hi, thanks for picking up my previous port and using the same approach to get Embree3 running on ARM.
I've just built embree-aarch64 on an aarch64 server (Cortex A72, 2GHz) with GCC 9.2 and can't reproduce this behavior. My server took 1.5 minutes to compile the file. This might be a cross-compiling issue, or it may have been fixed in GCC 9.2.
@marty1885 Thanks for testing with gcc 9.2! It's good to know that compiling took only 1.5 mins with gcc 9.2 on native aarch64. We'd recommend using a recent gcc (and compiling natively as much as possible).
We've confirmed that the issue still happens in recent Embree code, and things seem to be getting worse (e.g. gcc-10 does not finish compiling after 30 mins on Jetson AGX).
The likely cause is the large amount of templated code in a single file.
Splitting the implementation across several .cpp files should solve the issue.
For example, reducing the template instantiations like this makes compilation faster (~1 min):
VirtualCurveIntersector* VirtualCurveIntersector4i()
{
  static VirtualCurveIntersector function_local_static_prim;
  function_local_static_prim.vtbl[Geometry::GTY_SPHERE_POINT] = SphereNiIntersectors<4>();
  //function_local_static_prim.vtbl[Geometry::GTY_DISC_POINT] = DiscNiIntersectors<4>();
  //function_local_static_prim.vtbl[Geometry::GTY_ORIENTED_DISC_POINT] = OrientedDiscNiIntersectors<4>();
  //function_local_static_prim.vtbl[Geometry::GTY_ROUND_LINEAR_CURVE ] = LinearConeNiIntersectors<4>();
  //function_local_static_prim.vtbl[Geometry::GTY_FLAT_LINEAR_CURVE ] = LinearRibbonNiIntersectors<4>();
  //function_local_static_prim.vtbl[Geometry::GTY_ROUND_BEZIER_CURVE] = CurveNiIntersectors <BezierCurveT,4>();
  //function_local_static_prim.vtbl[Geometry::GTY_FLAT_BEZIER_CURVE ] = RibbonNiIntersectors<BezierCurveT,4>();
  //function_local_static_prim.vtbl[Geometry::GTY_ORIENTED_BEZIER_CURVE] = OrientedCurveNiIntersectors<BezierCurveT,4>();
  //function_local_static_prim.vtbl[Geometry::GTY_ROUND_BSPLINE_CURVE] = CurveNiIntersectors <BSplineCurveT,4>();
  //function_local_static_prim.vtbl[Geometry::GTY_FLAT_BSPLINE_CURVE ] = RibbonNiIntersectors<BSplineCurveT,4>();
  //function_local_static_prim.vtbl[Geometry::GTY_ORIENTED_BSPLINE_CURVE] = OrientedCurveNiIntersectors<BSplineCurveT,4>();
  //function_local_static_prim.vtbl[Geometry::GTY_ROUND_HERMITE_CURVE] = HermiteCurveNiIntersectors <HermiteCurveT,4>();
  //function_local_static_prim.vtbl[Geometry::GTY_FLAT_HERMITE_CURVE ] = HermiteRibbonNiIntersectors<HermiteCurveT,4>();
  //function_local_static_prim.vtbl[Geometry::GTY_ORIENTED_HERMITE_CURVE] = HermiteOrientedCurveNiIntersectors<HermiteCurveT,4>();
  //function_local_static_prim.vtbl[Geometry::GTY_ROUND_CATMULL_ROM_CURVE] = CurveNiIntersectors <CatmullRomCurveT,4>();
  //function_local_static_prim.vtbl[Geometry::GTY_FLAT_CATMULL_ROM_CURVE ] = RibbonNiIntersectors<CatmullRomCurveT,4>();
  //function_local_static_prim.vtbl[Geometry::GTY_ORIENTED_CATMULL_ROM_CURVE] = OrientedCurveNiIntersectors<CatmullRomCurveT,4>();
  return &function_local_static_prim;
}
I have split the template instantiations into several .cpp files, and now gcc-8 or later compiles each file in 3~5 mins on Jetson AGX (aarch64 Linux):
https://github.com/lighttransport/embree-aarch64/tree/fast-compile
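To illustrate the general idea of splitting the instantiations, here is a minimal single-file sketch. It is not the actual code in the fast-compile branch: the types are simplified stand-ins for Embree's, and the helper names `AddPointIntersectors4` and `AddBezierIntersectors4` as well as the file names in the comments are hypothetical. Each helper would live in its own .cpp file, so that each translation unit instantiates only a few templates.

```cpp
#include <array>

// Simplified stand-ins for Embree's types (assumptions for illustration only).
enum GeometryType { GTY_SPHERE_POINT, GTY_DISC_POINT, GTY_ROUND_BEZIER_CURVE, GTY_COUNT };

struct Intersectors {
  int simd_width;    // template parameter N the entry was built with
  const char* name;  // label for the kind of intersector
};

struct VirtualCurveIntersector {
  std::array<Intersectors, GTY_COUNT> vtbl;
};

// --- shared header (hypothetical): declarations only, cheap to include ---
void AddPointIntersectors4(VirtualCurveIntersector& prim);
void AddBezierIntersectors4(VirtualCurveIntersector& prim);

// --- curve_intersector_virtual_point.cpp (hypothetical file name) ---
// Only this translation unit would see and instantiate the point templates.
template <int N> Intersectors SphereNiIntersectors() { return Intersectors{N, "sphere"}; }
template <int N> Intersectors DiscNiIntersectors() { return Intersectors{N, "disc"}; }

void AddPointIntersectors4(VirtualCurveIntersector& prim) {
  prim.vtbl[GTY_SPHERE_POINT] = SphereNiIntersectors<4>();
  prim.vtbl[GTY_DISC_POINT] = DiscNiIntersectors<4>();
}

// --- curve_intersector_virtual_bezier.cpp (hypothetical file name) ---
// The heavy curve templates would be instantiated here, separately.
template <int N> Intersectors BezierCurveNiIntersectors() { return Intersectors{N, "bezier"}; }

void AddBezierIntersectors4(VirtualCurveIntersector& prim) {
  prim.vtbl[GTY_ROUND_BEZIER_CURVE] = BezierCurveNiIntersectors<4>();
}

// --- curve_intersector_virtual.cpp: no template instantiation left here ---
VirtualCurveIntersector* VirtualCurveIntersector4i() {
  static VirtualCurveIntersector function_local_static_prim;
  AddPointIntersectors4(function_local_static_prim);
  AddBezierIntersectors4(function_local_static_prim);
  return &function_local_static_prim;
}
```

Callers still use `VirtualCurveIntersector4i()` as before. The trade-off is more files and a few extra function calls at initialization, in exchange for smaller compilation units that can also be built in parallel.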
Another issue is that compilation consumes a lot of memory, so we need to build with fewer threads (e.g. 4 on a Jetson AGX with 16GB of memory); otherwise an out-of-memory error occurs.
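As a rough illustration of that arithmetic, the number of parallel jobs can be capped by memory as well as by core count. The helper `CapBuildJobs` below is hypothetical, and the 4GB-per-job figure in the comment is an assumption chosen only so that 16GB yields the 4 jobs mentioned above; it is not a measured value.

```cpp
#include <algorithm>

// Hypothetical helper: cap the number of parallel compile jobs by memory.
// gb_per_job is an ASSUMED peak memory use of one compiler process.
int CapBuildJobs(int cores, int total_mem_gb, int gb_per_job) {
  if (gb_per_job <= 0) return 1;  // guard against division by zero
  int by_memory = total_mem_gb / gb_per_job;
  // Never exceed the core count, and always allow at least one job.
  return std::max(1, std::min(cores, by_memory));
}

// Example: CapBuildJobs(8, 16, 4) gives 4 (8 cores, 16GB, assumed 4GB per job).
```

The result would then be passed to the build tool's parallel-jobs option, for example `make -j`.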
fast-compile has been merged into aarch64-v3.12.0. Builds with gcc-8 or later now complete in a moderate compilation time (20~30 mins) on Jetson AGX 16GB.
The AVX2 + arm64 build times out on Travis due to the long compilation time of curve_intersector_virtual*** (Travis has a 10-minute time limit for jobs that produce no output):
https://travis-ci.org/github/lighttransport/embree-aarch64/jobs/736104673
There is no easy fix for this at the moment.
branch: aarch64-v3.8.0 (port of Intel Embree v3.8.0)
geometry/curve_intersector_virtual.cpp takes too much time to compile (with -O2) on gcc (5 mins or more, even when cross compiling on a TR 1950X). At least I can confirm the issue with gcc 7.4 and 8.0.
clang (clang-9) also takes some time (~a couple of minutes) to compile geometry/curve_intersector_virtual.cpp.
We recommend using clang for the aarch64 target for the time being.