flatsurf / flatsurvey

Improve runtime of Boshernitzan conjecture check #33

Open saraedum opened 2 years ago

saraedum commented 2 years ago

We would like to go much further in the check of Boshernitzan's conjecture (b). Currently, several things are limiting the runtime. Here's the output of time py-spy top --native --rate 10 -- python -m flatsurvey.worker ngon -a 9 -a 37 -a 227 boshernitzan-conjecture-orientations boshernitzan-conjecture:

  0.00% 100.00%   0.000s    581.8s   _run_code (runpy.py:87)
  0.00% 100.00%   0.000s    581.7s   _process_result (click/core.py:1626)
  0.00% 100.00%   0.000s    581.7s   <module> (flatsurvey/worker/__main__.py:24)
  0.00% 100.00%   0.000s    581.7s   __call__ (click/core.py:1130)
  0.00% 100.00%   0.000s    581.7s   invoke (click/core.py:1689)
  0.00% 100.00%   0.000s    581.7s   main (click/core.py:1055)
  0.00% 100.00%   0.000s    581.7s   invoke (click/core.py:760)
  0.00% 100.00%   0.000s    581.6s   _run_once (asyncio/base_events.py:1905)
  0.00% 100.00%   0.000s    581.6s   run (asyncio/runners.py:44)
  0.00% 100.00%   0.000s    581.6s   run_until_complete (asyncio/base_events.py:634)
  0.00% 100.00%   0.000s    581.6s   start (flatsurvey/worker/worker.py:234)
  0.00% 100.00%   0.000s    581.6s   _run (asyncio/events.py:80)
  0.00% 100.00%   0.000s    581.6s   process (flatsurvey/worker/worker.py:161)
  0.00% 100.00%   0.000s    581.6s   produce (flatsurvey/pipeline/processor.py:95)
  0.00% 100.00%   0.000s    581.6s   run_forever (asyncio/base_events.py:601)
  0.00% 100.00%   0.000s    581.6s   resolve (flatsurvey/pipeline/consumer.py:176)
  0.00%   0.00%   0.000s    477.3s   task_step (_asynciomodule.c:2969)
  0.00%   0.00%   0.000s    477.3s   task_step_impl (_asynciomodule.c:2669)
  0.00% 100.00%   0.000s    376.1s   consume (flatsurvey/pipeline/consumer.py:137)
  0.00% 100.00%   0.000s    376.1s   produce (flatsurvey/pipeline/producer.py:91)
  0.00% 100.00%   0.000s    376.1s   _notify_consumers (flatsurvey/pipeline/producer.py:128)
  0.00%  90.91%   0.000s    376.0s   _consume (flatsurvey/jobs/flow_decomposition.py:135)
  0.00%   0.00%   0.000s    335.8s   opposite_edge (flatsurf/geometry/minimal_cover.py:81)
  0.00%   0.00%   0.000s    335.5s   edge_matrix (flatsurf/geometry/similarity_surface.py:433)
  0.00%   0.00%   0.000s    335.1s   __invert__ (sage/rings/number_field/number_field_element.cpython-39-x86_64-linux-gnu.so)
  0.00%   0.00%   0.000s    321.1s   __call__ (sage/misc/cachefunc.cpython-39-x86_64-linux-gnu.so)
  0.00%   0.00%   0.000s    321.1s   __Pyx_PyFunction_FastCallDict.constprop.0 (sage/misc/cachefunc.cpython-39-x86_64-linux-g
  0.00%   0.00%   0.000s    321.1s   __Pyx_PyFunction_FastCallNoKw (sage/misc/cachefunc.cpython-39-x86_64-linux-gnu.so)
  0.00%   0.00%   0.000s    320.0s   NumberFieldElement__div_ (sage/rings/number_field/number_field_element.cpython-39-x86_64
  0.00%   0.00%   0.000s    320.0s   __truediv__ (sage/structure/element.cpython-39-x86_64-linux-gnu.so)
  0.00%   0.00%   0.000s    300.6s   NTL::IsZero (ZZ.h:312)
  0.00%   0.00%   0.000s    300.5s   NTL::XGCD (ZZX1.cpp:3284)
  0.00%   0.00%   0.000s    292.0s   orbit_closure (flatsurvey/surfaces/surface.py:79)
  0.00%   0.00%   0.000s    282.4s   NTL::resultant (ZZX1.cpp:3164)
  0.00%   0.00%   0.000s    282.4s   NTL::resultant (lzz_pX1.cpp:2121)
  0.00%   0.00%   0.000s    275.8s   NTL::swap (lzz_pX.h:227)
  0.00%   0.00%   0.000s    275.8s   NTL::zz_pX::swap (lzz_pX.h:129)
  0.00%   0.00%   0.000s    263.9s   NTL::PlainResultant (lzz_pX1.cpp:1843)
  0.00%   0.00%   0.000s    205.5s   produce (flatsurvey/pipeline/producer.py:86)
  0.00%   0.00%   0.000s    205.5s   _produce (flatsurvey/jobs/boshernitzan_conjecture_orientations.py:362)
  0.00%   0.00%   0.000s    167.8s   similarity_from_vectors (flatsurf/geometry/matrix_2x2.py:181)
  0.00%   0.00%   0.000s    163.9s   similarity_from_vectors (flatsurf/geometry/matrix_2x2.py:182)
  0.00%   0.00%   0.200s    147.9s   NTL::InvMod (ZZ.cpp:364)
  0.00%   0.00%   0.100s    146.5s   NTL::inv (lzz_p.h:399)
  0.00%   0.00%   0.000s    139.6s   NTL::PlainRem (lzz_pX.cpp:1061)
  0.00%   0.00%   136.2s    136.2s   NTL::XGCD (ZZ.cpp:323)
  0.00%   0.00%   0.000s    128.7s   surface (flatsurvey/surfaces/surface.py:124)
  0.00%   0.00%   0.000s    128.7s   erase_marked_points (flatsurf/geometry/translation_surface.py:530)
  0.00%   0.00%   0.000s    126.9s   _directions (flatsurvey/jobs/boshernitzan_conjecture_orientations.py:205)
  0.00%   0.00%   0.000s    126.8s   unfolding_symmetries (flatsurvey/surfaces/ngons.py:264)
  0.00%   0.00%   0.000s    126.8s   __next__ (flatsurf/geometry/surface.py:1401)
  0.00%   0.00%   0.000s    122.8s   __init__ (flatsurf/geometry/gl2r_orbit_closure.py:149)
  0.00%   0.00%   0.000s    118.9s   find_a_new_label (flatsurf/geometry/surface.py:1480)
  0.00%   0.00%   0.000s    117.2s   to_pyflatsurf (flatsurf/geometry/pyflatsurf_conversion.py:93)
  0.00%   0.00%   0.000s    117.2s   copy (flatsurf/geometry/similarity_surface.py:757)
  0.00%   0.00%   0.000s    117.2s   triangulate (flatsurf/geometry/similarity_surface.py:1566)
  0.00%   0.00%   0.000s    114.3s   __init__ (flatsurf/geometry/surface.py:1156)
  0.00%   0.00%   0.000s    113.4s   edge_gluing_iterator (flatsurf/geometry/surface.py:298)
  0.00%   0.00%   0.000s    109.6s   angles (flatsurf/geometry/half_translation_surface.py:110)
  0.00%   0.00%   0.000s    109.6s   opposite_edge (flatsurf/geometry/similarity_surface.py:229)
  0.00%   0.00%   0.000s    66.60s   __mul__ (sage/structure/element.cpython-39-x86_64-linux-gnu.so)
  0.00%   0.00%   0.100s    60.60s   NumberFieldElement__mul_ (sage/rings/number_field/number_field_element.cpython-39-x86_64
  0.00%   0.00%   0.000s    59.20s   NTL::PlainRem (lzz_pX.cpp:1088)
  0.00%  90.91%   0.000s    55.30s   decomposition (flatsurf/geometry/gl2r_orbit_closure.py:676)
  0.00%  90.91%   0.000s    55.30s   decomposeFlowDecomposition (pyflatsurf/cppyy_flatsurf.py:85)
  0.00%  90.91%   0.000s    55.30s   <lambda> (cppyythonizations/util/__init__.py:154)
  0.00%  90.91%   0.000s    53.20s   flatsurf::FlowComponent<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::decompos
  0.00%  90.91%   0.000s    53.20s   flatsurf::FlowDecomposition<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::deco
  0.00%  90.91%   0.000s    53.20s   0x7fe7864e5106 (?)
  0.00%  90.91%   0.000s    52.60s   intervalxt::Component::decompositionStep (libintervalxt.so.4.2.0)
  0.00%  90.91%   0.000s    52.00s   intervalxt::ImplementationOf<intervalxt::Component>::boshernitzanCost (libintervalxt.so.
  0.00%  90.91%   0.000s    51.60s   intervalxt::IntervalExchangeTransformation::safInvariant (libintervalxt.so.4.2.0)
  0.00%  63.64%   0.400s    51.20s   intervalxt::ImplementationOf<intervalxt::IntervalExchangeTransformation>::saf (libinterv
  0.00%   0.00%   0.000s    43.00s   _directions (flatsurvey/jobs/boshernitzan_conjecture_orientations.py:180)
  0.00%   0.00%   42.50s    42.50s   NTL::sp_CorrectExcess (sp_arith.h:214)
  0.00%   0.00%   0.000s    39.90s   __Pyx_PyObject_CallNoArg (sage/rings/number_field/number_field_element.cpython-39-x86_64
  0.00%   0.00%   0.000s    38.30s   charpoly (sage/rings/number_field/number_field_element.cpython-39-x86_64-linux-gnu.so)
  0.00%   0.00%   0.000s    38.30s   minpoly (sage/rings/number_field/number_field_element.cpython-39-x86_64-linux-gnu.so)
  0.00%   0.00%   0.100s    37.30s   CoercionModel_bin_op (sage/structure/coerce.cpython-39-x86_64-linux-gnu.so)
  0.00%   0.00%   0.000s    36.60s   charpoly (matrix_integer_dense.cpp:13448)
  0.00%   0.00%   0.000s    36.60s   charpoly (matrix_rational_dense.cpp:10790)
  0.00%   0.00%   0.000s    36.60s   std::__atomic_base<int>::operator int (atomic_base.h:292)
  0.00%   0.00%   0.000s    36.60s   _sig_off_ (macros.h:190)
  0.00%   0.00%   0.000s    36.60s   std::__atomic_base<int>::load (atomic_base.h:436)
  0.00%   0.00%   0.000s    36.60s   charpoly (matrix_rational_dense.cpp:11028)
  0.00%   0.00%   0.000s    36.50s   LinBox::charpoly<LinBox::BlasMatrix<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::
  0.00%   0.00%   0.000s    36.50s   linbox_flint_interface_linbox_fmpz_mat_charpoly (linbox_flint_interface.cpp:1727)
  0.00%   0.00%   0.000s    36.50s   LinBox::charpoly<LinBox::BlasMatrix<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::
  0.00%   0.00%   0.000s    36.50s   LinBox::charpoly<LinBox::BlasMatrix<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::
  0.00%   0.00%   0.000s    36.20s   LinBox::cia<LinBox::DensePolynomial<Givaro::ZRing<Givaro::Integer> >, LinBox::BlasMatrix
  0.00%   0.00%   0.000s    36.20s   LinBox::ChineseRemainderSequential<LinBox::CRABuilderFullMultip<Givaro::ModularBalanced<
  0.00%   0.00%   0.000s    36.20s   LinBox::minpoly<LinBox::DensePolynomial<Givaro::ZRing<Givaro::Integer> >, LinBox::BlasMa
  0.00%   0.00%   0.000s    35.50s   LinBox::ChineseRemainderSequential<LinBox::CRABuilderFullMultip<Givaro::ModularBalanced<
  0.00%   0.00%   0.000s    33.10s   NTL::MulMod (ZZX1.cpp:3423)
  0.00%   0.00%   0.000s    32.80s   LinBox::VectorDomainBase<Givaro::ModularBalanced<double> >::~VectorDomainBase (vector-do
  0.00%   0.00%   0.000s    32.80s   LinBox::IntegerModularMinpoly<LinBox::BlasMatrix<Givaro::ZRing<Givaro::Integer>, std::ve
  0.00%   0.00%   0.000s    32.80s   LinBox::DotProductDomain<Givaro::ModularBalanced<double> >::~DotProductDomain (vector-do
  0.00%   0.00%   0.000s    32.80s   LinBox::BlasMatrix<Givaro::ModularBalanced<double>, std::vector<double, std::allocator<d
  0.00%   0.00%   0.000s    32.80s   LinBox::VectorDomain<Givaro::ModularBalanced<double> >::~VectorDomain (vector-domain.h:1
  0.00%   0.00%   0.000s    32.70s   <genexpr> (flatsurf/geometry/subfield.py:153)
  0.00%   0.00%   0.000s    32.70s   __init__ (flatsurf/geometry/polygon.py:2072)
  0.00%   0.00%   0.000s    32.70s   subfield_from_elements (flatsurf/geometry/subfield.py:153)
  0.00%   0.00%   0.000s    32.40s   LinBox::BlasMatrixDomainMinpoly<Givaro::ModularBalanced<double>, LinBox::DensePolynomial
  0.00%   0.00%   0.000s    32.40s   LinBox::BlasMatrixDomain<Givaro::ModularBalanced<double> >::minpoly<LinBox::DensePolynom
  0.00%   0.00%   0.000s    32.40s   LinBox::minpoly<LinBox::DensePolynomial<Givaro::ModularBalanced<double> >, LinBox::BlasM
  0.00%   0.00%   0.000s    32.40s   FFPACK::MinPoly<Givaro::ModularBalanced<double>, LinBox::DensePolynomial<Givaro::Modular
  0.00%   0.00%   0.000s    32.30s   FFPACK::MinPoly<Givaro::ModularBalanced<double>, LinBox::DensePolynomial<Givaro::Modular
  0.00%   0.00%   0.000s    32.30s   FFPACK::MatVecMinPoly<Givaro::ModularBalanced<double>, LinBox::DensePolynomial<Givaro::M
  0.00%   0.00%   0.000s    32.20s   std::vector<double, std::allocator<double> >::resize (stl_vector.h:939)
  0.00%   0.00%   0.000s    32.20s   std::vector<double, std::allocator<double> >::size (stl_vector.h:919)
  0.00%   0.00%   0.000s    32.20s   FFPACK::Protected::MatVecMinPoly<Givaro::ModularBalanced<double>, LinBox::DensePolynomia
  0.00%   0.00%   0.000s    30.70s   NTL::zz_pX::operator= (lzz_pX.h:59)
  0.00%   0.00%   0.000s    30.10s   polygon (flatsurf/geometry/minimal_cover.py:76)
  0.00%   0.00%   0.000s    29.30s   NTL::XGCD (ZZX1.cpp:3334)
  0.00%   0.00%   0.000s    29.10s   NTL::PlainRem (lzz_pX.cpp:1089)
  0.00%   0.00%   0.000s    29.00s   NTL::XGCD (lzz_pX1.cpp:611)
  0.00%   0.00%   29.00s    29.00s   sched_yield (libc-2.28.so)
  0.00%   0.00%   28.80s    28.80s   NTL::MulModPrecon (sp_arith.h:786)
  0.00%   0.00%   0.000s    28.60s   decomposition (flatsurf/geometry/gl2r_orbit_closure.py:672)
  0.00%   0.00%   0.000s    27.90s   NTL::ZZVec::~ZZVec (ZZVec.h:45)
  0.00%  36.36%    3.70s    27.30s   __gmpq_aors (libgmp.so.10.4.1)
  0.00%   0.00%   0.000s    26.70s   __Pyx_PyFunction_FastCallDict.constprop.0 (sage/categories/action.cpython-39-x86_64-linu
  0.00%   0.00%   0.000s    26.70s   Action__act_ (sage/categories/action.cpython-39-x86_64-linux-gnu.so)
  0.00%   0.00%   0.100s    26.70s   zhemm3m_thread_LL (libopenblasp-r0.3.20.so)
  0.00%   0.00%   0.100s    26.20s   NTL::Vec<NTL::zz_p>::~Vec (vector.h:282)
  0.00%   0.00%   0.000s    25.90s   FFLAS::fgemv<Givaro::ModularBalanced<double> > (libfflas.so.1.0.0:358)
  0.00%   0.00%   0.000s    25.90s   FFPACK::Protected::LUdivine_construct<Givaro::ModularBalanced<double> > (ffpack_ludivine
  0.00%   0.00%   0.000s    25.90s   FFLAS::fgemv<Givaro::ModularBalanced<double> > (libfflas.so.1.0.0:179)
  0.00%   0.00%   0.000s    25.90s   FFLAS::fassign<Givaro::ModularBalanced<double> > (fflas_fassign.inl:137)
  0.00%   0.00%   0.000s    25.80s   spotf2_ (libopenblasp-r0.3.20.so)
  0.00%   0.00%   0.000s    25.70s   FFLAS::fgemv<Givaro::ModularBalanced<double> > (libfflas.so.1.0.0:306)
  0.00%   0.00%   0.000s    25.70s   FFLAS::Protected::ScalAndReduce<Givaro::ModularBalanced<double>, FFLAS::MMHelperAlgo::Cl
  0.00%   0.00%   0.100s    25.50s   dsbmv_L (libopenblasp-r0.3.20.so)
  0.00%   0.00%   0.000s    24.80s   NTL::zz_pX::~zz_pX (lzz_pX.h:59)

We'll track progress on improving the above in this issue.
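The table above is sorted by total time, so the frames that actually burn CPU (high own time) are scattered through it. As a rough aid for reading such dumps, here is a minimal sketch that parses rows in the layout shown above and ranks them by own time. The helper names (`parse_rows`, `hottest_leaves`) are made up for this illustration, and the regular expression assumes the five-column layout seen in this issue (own %, total %, own time, total time, frame); it is not part of py-spy or flatsurvey.

```python
import re

# One row of the dump, e.g.
# "  0.00%   0.00%   136.2s    136.2s   NTL::XGCD (ZZ.cpp:323)"
# Assumed layout: own %, total %, own time, total time, frame.
ROW = re.compile(
    r"^\s*(?P<own_pct>[\d.]+)%\s+(?P<total_pct>[\d.]+)%\s+"
    r"(?P<own>[\d.]+)s\s+(?P<total>[\d.]+)s\s+(?P<frame>.+?)\s*$"
)


def parse_rows(text):
    """Return (own_seconds, total_seconds, frame) for every line that matches ROW."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((float(m["own"]), float(m["total"]), m["frame"]))
    return rows


def hottest_leaves(text, limit=5):
    """Rank frames by own time, i.e. time spent in the frame itself."""
    return sorted(parse_rows(text), key=lambda row: row[0], reverse=True)[:limit]


# A few rows copied from the dump above.
SAMPLE = """\
  0.00%   0.00%   0.000s    335.1s   __invert__ (sage/rings/number_field/number_field_element.cpython-39-x86_64-linux-gnu.so)
  0.00%   0.00%   136.2s    136.2s   NTL::XGCD (ZZ.cpp:323)
  0.00%   0.00%   42.50s    42.50s   NTL::sp_CorrectExcess (sp_arith.h:214)
  0.00%   0.00%   29.00s    29.00s   sched_yield (libc-2.28.so)
"""

for own, total, frame in hottest_leaves(SAMPLE, limit=3):
    print(f"{own:8.1f}s own  {total:8.1f}s total  {frame}")
```

On the rows sampled here, the top entry by own time is NTL::XGCD, which fits the picture that the number field inversions called from edge_matrix dominate the worker.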


Improvements for the worker:
Improvements for the scheduler:

saraedum commented 2 years ago

@videlec we have two strange regressions here.

The scheduler now reports that it does a lot of minpoly in subfield_from_elements (this is for a+b+c odd and ≥ 113.)

  0.00%  19.00%   0.000s    293.8s   <module> (flatsurvey/__main__.py:399)
  0.00%  19.00%   0.000s    293.8s   invoke (click/core.py:760)
  0.00%  19.00%   0.000s    293.8s   run (asyncio/runners.py:44)
  0.00%  19.00%   0.000s    293.8s   __call__ (click/core.py:1130)
  0.00%  19.00%   0.000s    293.8s   _process_result (click/core.py:1626)
  0.00%  19.00%   0.000s    293.8s   main (click/core.py:1055)
  0.00%  19.00%   0.000s    293.8s   _run_module_as_main (runpy.py:197)
  0.00%  19.00%   0.000s    293.8s   invoke (click/core.py:1689)
  0.00%  19.00%   0.000s    293.8s   _run_code (runpy.py:87)
  0.00%  19.00%   0.000s    293.8s   run_forever (asyncio/base_events.py:601)
  0.00%  19.00%   0.000s    293.8s   run_until_complete (asyncio/base_events.py:634)
  0.00%  19.00%   0.000s    293.8s   process (flatsurvey/__main__.py:160)
  0.00%  19.00%   0.000s    293.8s   _run_once (asyncio/base_events.py:1905)
  0.00%  19.00%   0.020s    293.8s   _run (asyncio/events.py:80)
  0.00%  19.00%   0.000s    293.0s   start (flatsurvey/__main__.py:224)
  0.00%  18.00%   0.190s    292.4s   command (flatsurvey/surfaces/surface.py:152)
  0.00%  18.00%   0.000s    292.4s   _render_command (flatsurvey/__main__.py:309)
  0.00%  18.00%   0.000s    292.2s   __reduce__ (flatsurvey/surfaces/ngons.py:575)
  0.00%  18.00%   0.050s    277.7s   _lengths (flatsurvey/surfaces/ngons.py:484)
  0.00%   0.00%   0.000s    173.3s   __init__ (flatsurf/geometry/polygon.py:2072)
  0.00%   0.00%   0.010s    173.2s   subfield_from_elements (flatsurf/geometry/subfield.py:153) <-- HERE
  0.00%   0.00%   173.0s    173.2s   <genexpr> (flatsurf/geometry/subfield.py:153)
  0.00%  18.00%   0.030s    105.1s   NumberField (sage/rings/number_field/number_field.py:569)
  0.00%  18.00%   0.000s    102.9s   __init__ (flatsurf/geometry/polygon.py:2052)
  0.00%  11.00%   0.000s    98.83s   _richcmp_ (sage/rings/qqbar.py:5458)
  0.00%  11.00%   0.000s    84.91s   sign (sage/rings/qqbar.py:5825)
  0.00%  11.00%   0.000s    84.90s   minpoly (sage/rings/qqbar.py:4496)
 11.00%  11.00%   84.77s    84.90s   minpoly (sage/rings/qqbar.py:7769)
  0.00%   0.00%    2.83s    14.05s   cmp_elements_with_same_minpoly (sage/rings/qqbar.py:2863)
  0.00%   0.00%   0.040s    13.92s   sign (sage/rings/qqbar.py:5828)
  0.00%   5.00%   0.420s    12.02s   refine_interval (sage/rings/qqbar.py:7047)
  0.00%   0.00%   0.000s    10.68s   polynomial_root (sage/rings/qqbar.py:1397)
  0.00%   0.00%   0.000s    10.59s   __init__ (sage/rings/qqbar.py:6883)

I see lots of kernel activity, as if some thrashing were going on. Is minpoly somehow multithreaded, with something going terribly wrong here? Maybe: with MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 SAGE_NUM_THREADS=1 set, it still spends a lot of time in minpoly, but it's much faster overall.
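For reproducing that comparison, a minimal sketch of building an environment with the three variables mentioned above pinned to 1. The helper `single_threaded_env` is hypothetical (not part of flatsurvey), and whether each of these variables actually affects minpoly is an assumption; the only observation so far is that the run was faster with them set.

```python
import os

# The three variables from the comment above; which of them (if any)
# is responsible for the speedup has not been established.
THREAD_VARS = ("MKL_NUM_THREADS", "OMP_NUM_THREADS", "SAGE_NUM_THREADS")


def single_threaded_env(base=None):
    """Return a copy of the environment with the thread-count variables set to "1"."""
    env = dict(os.environ if base is None else base)
    for name in THREAD_VARS:
        env[name] = "1"
    return env


env = single_threaded_env()
print({name: env[name] for name in THREAD_VARS})
```

The resulting dict can be passed as `env=` to `subprocess.run` when launching the command being profiled. Setting the variables for a child process, rather than changing `os.environ` after startup, avoids the case where a numerical library has already read its thread settings at load time.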


At the same time, all the worker does is erase marked points (1, 13, 63):

  0.00% 100.00%   0.000s    350.4s   resolve (flatsurvey/pipeline/consumer.py:170)
  0.00% 100.00%   0.000s    350.4s   process (flatsurvey/worker/__main__.py:137)
  0.00% 100.00%   0.000s    350.4s   runctx (profile.py:62)
  0.00% 100.00%   0.000s    350.4s   start (flatsurvey/worker/__main__.py:208)
  0.00% 100.00%   0.000s    350.4s   _run (asyncio/events.py:80)
  0.00% 100.00%   0.000s    350.4s   run_module (runpy.py:213)
  0.00% 100.00%   0.000s    350.4s   _run_once (asyncio/base_events.py:1905)
  0.00% 100.00%   0.000s    350.4s   main (click/core.py:1055)
  0.00% 100.00%   0.000s    350.4s   _run_module_as_main (runpy.py:197)
  0.00% 100.00%   0.000s    350.4s   invoke (click/core.py:760)
  0.00% 100.00%   0.000s    350.4s   consume (flatsurvey/pipeline/consumer.py:130)
  0.00% 100.00%   0.000s    350.4s   runctx (cProfile.py:100)
  0.00% 100.00%   0.000s    350.4s   <module> (cProfile.py:190)
  0.00% 100.00%   0.000s    350.4s   produce (flatsurvey/pipeline/producer.py:84)
  0.00% 100.00%   0.000s    350.4s   orbit_closure (flatsurvey/surfaces/surface.py:74)
  0.00% 100.00%   0.000s    350.4s   run_until_complete (asyncio/base_events.py:634)
  0.00% 100.00%   0.000s    350.4s   _consume (flatsurvey/jobs/flow_decomposition.py:120)
  0.00% 100.00%   0.000s    350.4s   surface (flatsurvey/surfaces/surface.py:119)
  0.00% 100.00%   0.000s    350.4s   _process_result (click/core.py:1626)
  0.00% 100.00%   0.000s    350.4s   _run_code (runpy.py:87)
  0.00% 100.00%   0.000s    350.4s   runctx (cProfile.py:19)
  0.00% 100.00%   0.000s    350.4s   _notify_consumers (flatsurvey/pipeline/producer.py:121)
  0.00% 100.00%   0.000s    350.4s   main (cProfile.py:179)
  0.00% 100.00%   0.000s    350.4s   run_forever (asyncio/base_events.py:601)
  0.00% 100.00%   0.000s    350.4s   __call__ (click/core.py:1130)
  0.00% 100.00%   0.000s    350.4s   run (asyncio/runners.py:44)
  0.00% 100.00%   0.000s    350.4s   invoke (click/core.py:1689)
  0.00% 100.00%   0.000s    350.4s   <module> (<string>:1)
  0.00% 100.00%   0.000s    350.4s   <module> (flatsurvey/worker/__main__.py:217)
  0.00% 100.00%   0.000s    350.4s   produce (flatsurvey/pipeline/processor.py:95)
  0.00%   0.00%   0.000s    337.5s   erase_marked_points (flatsurf/geometry/similarity_surface.py:2417) <-- HERE
  0.00%   0.00%   0.000s    271.8s   0x7f80c0d3401b (?)
  0.00%   0.00%   0.000s    271.8s   flatsurf::FlatTriangulation<eantic::renf_elem_class>::eliminateMarkedPoints (libflatsurf.so.7.10.0)
  0.00%   0.00%   0.030s    270.8s   flatsurf::FlatTriangulation<eantic::renf_elem_class>::operator+ (libflatsurf.so.7.10.0)
  0.00%   0.00%   0.160s    182.7s   eantic::(anonymous namespace)::binop<&renf_elem_mul, &renf_elem_mul_fmpz, &renf_elem_mul_fmpq> (libeanticxx.so.1.0.2)
  0.00%   0.00%   0.050s    182.0s   renf_elem_mul (libeantic.so.1.0.2)
  0.00%   0.00%   0.000s    175.6s   nf_elem_mul_red (mul.c:169)
  0.00%   0.00%   0.000s    173.8s   nf_elem_mul_red (mul.c:185)
  0.00%   0.00%   0.060s    149.6s   flatsurf::detail::VectorExact<flatsurf::Vector<eantic::renf_elem_class>, eantic::renf_elem_class>::ccw (libflatsurf.so.7.10.0)
  0.00%   0.00%   0.010s    127.1s   _nf_elem_mul_red (mul.c:124)
  0.00%   0.00%   0.060s    127.1s   flatsurf::ImplementationOf<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::check (libflatsurf.so.7.10.0)  <-- Since this shows up, maybe we're creating lots of intermediate surfaces.
  0.00%   0.00%   0.010s    126.9s   _fmpz_poly_divrem_divconquer (divrem_divconquer.c:95)
  0.00%   0.00%   0.000s    120.4s   __fmpz_poly_divrem_divconquer (divrem_divconquer.c:23)
  0.00%   0.00%   0.010s    109.7s   __fmpz_poly_divrem_divconquer (divrem_divconquer.c:45)
  0.00%   0.00%   0.000s    85.80s   flatsurf::FlatTriangulation<eantic::renf_elem_class>::FlatTriangulation (libflatsurf.so.7.10.0)
  0.00%   0.00%   0.000s    46.06s   _fmpz_poly_divrem_divconquer_recursive (divrem_divconquer_recursive.c:105)
  0.00%   0.00%   0.000s    44.82s   flatsurf::ImplementationOf<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::flip (libflatsurf.so.7.10.0)
  0.00%   0.00%    8.08s    43.23s   __tls_get_addr (ld-linux-x86-64.so.2)
  0.00%   0.00%   0.060s    42.89s   _fmpz_poly_mul_KS (mul_KS.c:107)
  0.00%   0.00%   0.000s    42.38s   _fmpz_poly_divrem_divconquer_recursive (divrem_divconquer_recursive.c:65)
  0.00%   0.00%   0.000s    40.58s   flatsurf::FlatTriangulation<eantic::renf_elem_class>::clone (libflatsurf.so.7.10.0)
  0.00%   0.00%   0.050s    39.68s   _fmpz_poly_bit_unpack (bit_unpack.c:34)
  0.00%   0.00%   0.090s    34.13s   _fmpz_poly_divrem_divconquer_recursive (divrem_divconquer_recursive.c:79)
  0.00%   0.00%   0.000s    33.72s   _nf_elem_mul_red (mul.c:96)
  0.00%   0.00%   0.010s    33.55s   _fmpz_poly_divrem_divconquer_recursive (divrem_divconquer_recursive.c:119)
  0.00%   0.00%   15.93s    30.94s   update_get_addr (ld-linux-x86-64.so.2)
  0.00%   0.00%   0.000s    28.50s   _fmpz_promote (fmpz.c:208)
  0.00%   0.00%   0.240s    28.50s   _fmpz_promote (fmpz.c:213)
  0.00%   0.00%   0.280s    27.31s   fmpq_poly_evaluate_arb (libeantic.so.1.0.2)
  0.00%   0.00%    1.84s    26.59s   __gmpz_realloc (libgmp.so.10.4.1)
  0.00%   0.00%   0.300s    25.34s   _fmpz_poly_evaluate_arb (libeantic.so.1.0.2)
  0.00%   0.00%    8.50s    24.75s   memory_sage_sig_realloc (sage/ext/memory.cpython-39-x86_64-linux-gnu.so)
  0.00%   0.00%    6.23s    24.62s   _fmpz_clear_mpz (fmpz.c:161)
  0.00%   0.00%   24.17s    24.17s   __gmpn_addmul_1_x86_64 (libgmp.so.10.4.1)
  0.00%   0.00%   0.140s    23.83s   flatsurf::Approximation<eantic::renf_elem_class>::arb (libflatsurf.so.7.10.0)
  0.00%   0.00%   0.300s    23.69s   exactreal::Arb::Arb (libexactreal.so.6.2.0)
  0.00%   0.00%   0.100s    20.91s   flatsurf::detail::VectorBase<flatsurf::Vector<eantic::renf_elem_class> >::operator flatsurf::Vector<exactreal::Arb> (libflatsurf.so.7.10.0)
  0.00%   0.00%   0.020s    20.88s   _fmpz_poly_divrem_divconquer_recursive (divrem_divconquer_recursive.c:32)
  0.00%   0.00%   0.020s    20.27s   _fmpz_poly_mullow_SS (mullow_SS.c:116)
  0.00%   0.00%   0.050s    18.10s   fmpz_bit_unpack (bit_unpack.c:82)
  0.00%   0.00%   0.750s    17.58s   eantic::renf_elem_class::renf_elem_class (libeanticxx.so.1.0.2)
  0.00%   0.00%   0.020s    17.46s   eantic::operator< (libeanticxx.so.1.0.2)
  0.00%   0.00%   0.250s    17.32s   renf_elem_cmp (libeantic.so.1.0.2)
  0.00%   0.00%   12.70s    16.26s   realloc (libc.so.6)
  0.00%   0.00%   0.000s    15.44s   _fmpz_poly_divrem_basecase (divrem_basecase.c:34)
  0.00%   0.00%    3.67s    15.38s   _fmpz_new_mpz (fmpz.c:123)
  0.00%   0.00%   15.01s    15.01s   _dl_update_slotinfo (ld-linux-x86-64.so.2)
  0.00%   0.00%   0.430s    14.03s   _fmpz_poly_relative_condition_number_2exp (libeantic.so.1.0.2)
  0.00%   0.00%   0.130s    13.99s   eantic::renf_elem_class::~renf_elem_class (libeanticxx.so.1.0.2)
  0.00%   0.00%   0.120s    13.76s   renf_elem_set (libeantic.so.1.0.2)
  0.00%   0.00%   0.240s    13.68s   _fmpz_vec_sub (sub.c:21)
  0.00%   0.00%   0.110s    13.41s   renf_elem_clear (libeantic.so.1.0.2)
  0.00%  90.00%   0.000s    13.12s   __Pyx_PyFunction_FastCallDict.constprop.0 (sage/misc/cachefunc.cpython-39-x86_64-linux-gnu.so)
  0.00%  90.00%   0.000s    13.12s   __call__ (sage/misc/cachefunc.cpython-39-x86_64-linux-gnu.so)
  0.00%  90.00%   0.000s    13.12s   task_step_impl (_asynciomodule.c:2669)
  0.00%  90.00%   0.000s    13.12s   task_step (_asynciomodule.c:2969)
  0.00%  90.00%   0.000s    13.12s   __Pyx_PyFunction_FastCallNoKw (sage/misc/cachefunc.cpython-39-x86_64-linux-gnu.so)
  0.00%  90.00%   0.000s    13.12s   __libc_start_main_impl (libc.so.6)
  0.00%  90.00%   0.000s    13.12s   __libc_start_call_main (libc.so.6)
  0.00%   0.00%   0.100s    13.05s   _fmpz_vec_clear (clear.c:23)
  0.00%   0.00%    5.29s    12.86s   __gmpn_toom22_mul (libgmp.so.10.4.1)
  0.00%   0.00%    1.18s    12.76s   __gmpn_mul (libgmp.so.10.4.1)
  0.00%   0.00%   0.310s    12.11s   fmpz_set (set.c:31)
  0.00%   0.00%   0.000s    11.82s   flatsurf::ImplementationOf<flatsurf::Chain<flatsurf::FlatTriangulation<eantic::renf_elem_class> > >::ImplementationOf (libflatsurf.so.7.10.0)
  0.00%   0.00%   0.000s    11.79s   _fmpz_poly_mul_KS (mul_KS.c:95)
  0.00%   0.00%   0.020s    11.78s   spimpl::details::default_copy<flatsurf::ImplementationOf<flatsurf::SaddleConnection<flatsurf::FlatTriangulation<eantic::renf_elem_class> > > >
  0.00%   0.00%   0.000s    11.64s   spimpl::details::default_copy<flatsurf::ImplementationOf<flatsurf::Chain<flatsurf::FlatTriangulation<eantic::renf_elem_class> > > > (libflatsur
  0.00%   0.00%   0.070s    11.55s   _fmpz_vec_scalar_submul_fmpz (scalar_submul_fmpz.c:38)
  0.00%   0.00%   0.000s    11.41s   flatsurf::Deformation<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::operator* (libflatsurf.so.7.10.0)
  0.00%   0.00%   0.200s    11.38s   fmpz_clear (fmpz.h:171)
  0.00%   0.00%   0.000s    11.30s   flatsurf::CompositeDeformationRelation<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::CompositeDeformationRelation (libflatsurf.so.7.1
  0.00%   0.00%   0.390s    11.14s   spimpl::details::default_copy<flatsurf::ImplementationOf<flatsurf::Path<flatsurf::FlatTriangulation<eantic::renf_elem_class> > > > (libflatsurf
  0.00%   0.00%   0.010s    11.14s   flatsurf::ShiftDeformationRelation<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::clone (libflatsurf.so.7.10.0)
  0.00%   0.00%   0.960s    11.02s   fmpq_poly_clear (clear.c:24)
  0.00%   0.00%   0.030s    10.92s   _fmpz_demote_val (fmpz.c:252)
  0.00%   0.00%   0.000s    10.41s   flatsurf::CompositeDeformationRelation<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::clone (libflatsurf.so.7.10.0)
  6.00%   6.00%    7.75s    10.34s   _int_malloc (libc.so.6)
  0.00%   0.00%   0.150s    10.24s   fmpz_bit_unpack (bit_unpack.c:74)
  0.00%   0.00%   0.010s     9.89s   flatsurf::QuadraticPolynomial<eantic::renf_elem_class>::positive (libflatsurf.so.7.10.0)
  0.00%   0.00%    2.27s     9.66s   _fmpz_new_mpz (fmpz.c:68)
  0.00%   0.00%   0.580s     9.46s   _fmpz_add2_fast (mag.h:82)
  0.00%   0.00%    4.67s     9.39s   __gmpn_mul_basecase_fat (libgmp.so.10.4.1)
  0.00%   0.00%   0.130s     9.10s   _fmpz_vec_add (add.c:21)
  0.00%   0.00%   0.000s     9.07s   spimpl::details::default_copy<flatsurf::ImplementationOf<flatsurf::Vector<eantic::renf_elem_class> > > (libflatsurf.so.7.10.0)
  0.00%   5.00%   0.000s     8.19s   CPyCppyy::CPPMethod::ExecuteFast (CPPMethod.cxx:182)
  0.00%   5.00%   0.000s     8.19s   CPyCppyy::CPPMethod::Execute (CPPMethod.cxx:853)
  0.00%   0.00%   0.330s     8.10s   _fmpz_vec_set (set.c:23)
  0.00%   0.00%   0.050s     7.91s   renf_refine_embedding (libeantic.so.1.0.2)
  0.00%   0.00%    2.47s     7.76s   _fmpz_clear_mpz (fmpz.c:167)
  0.00%   0.00%   0.070s     7.67s   flatsurf::ChainVector<flatsurf::FlatTriangulation<eantic::renf_elem_class>, eantic::renf_elem_class>::ChainVector (libflatsurf.so.7.10.0)
  0.00%   0.00%   0.000s     7.64s   arb_mul (mul.c:89)
  0.00%   5.00%   0.000s     7.64s   CPyCppyy::CPPMethod::Call (CPPMethod.cxx:913)
  0.00%   0.00%   0.320s     7.48s   fmpz_add (add.c:55)
  0.00%   0.00%   0.040s     7.48s   fmpz_bit_unpack (bit_unpack.c:141)
  0.00%   0.00%   0.000s     7.39s   spimpl::details::default_delete<flatsurf::ImplementationOf<flatsurf::Deformation<flatsurf::FlatTriangulation<eantic::renf_elem_class> > > > (li
  0.00%   0.00%   0.060s     7.39s   flatsurf::CompositeDeformationRelation<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::~CompositeDeformationRelation (libflatsurf.so.7.
  0.00%   0.00%   0.390s     7.27s   spimpl::details::default_delete<flatsurf::ImplementationOf<flatsurf::Path<flatsurf::FlatTriangulation<eantic::renf_elem_class> > > > (libflatsu
  0.00%   0.00%   0.050s     7.20s   flatsurf::ShiftDeformationRelation<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::~ShiftDeformationRelation (libflatsurf.so.7.10.0)
  0.00%   0.00%   0.010s     7.16s   spimpl::details::default_delete<flatsurf::ImplementationOf<flatsurf::SaddleConnection<flatsurf::FlatTriangulation<eantic::renf_elem_class> > >
  0.00%   0.00%   0.000s     7.09s   flatsurf::ImplementationOf<flatsurf::Chain<flatsurf::FlatTriangulation<eantic::renf_elem_class> > >::~ImplementationOf (libflatsurf.so.7.10.0)
  0.00%   0.00%   0.000s     7.09s   spimpl::details::default_delete<flatsurf::ImplementationOf<flatsurf::Chain<flatsurf::FlatTriangulation<eantic::renf_elem_class> > > > (libflats
  0.00%   0.00%    7.08s     7.08s   __gmpn_add_n_x86_64 (libgmp.so.10.4.1)
  1.00%   1.00%    3.65s     6.88s   __gmpz_add (libgmp.so.10.4.1)
  0.00%   2.00%   0.000s     6.87s   flint_calloc (memory_manager.c:147)
  0.00%   0.00%   0.210s     6.75s   _fmpz_poly_mul_KS (mul_KS.c:81)

Something weird is going on with the worker since on pascaline the worker is still fast for this test case. It's something on my local machine that makes it slow. (No, it's fast on pascaline because it's looking up the result in a cache. It's slow without caching.)