Open saraedum opened 2 years ago
@videlec we have two strange regressions here.
The scheduler now reports that it does a lot of minpoly computations in subfield_from_elements (this is for a+b+c odd and ≥ 113):
0.00% 19.00% 0.000s 293.8s <module> (flatsurvey/__main__.py:399)
0.00% 19.00% 0.000s 293.8s invoke (click/core.py:760)
0.00% 19.00% 0.000s 293.8s run (asyncio/runners.py:44)
0.00% 19.00% 0.000s 293.8s __call__ (click/core.py:1130)
0.00% 19.00% 0.000s 293.8s _process_result (click/core.py:1626)
0.00% 19.00% 0.000s 293.8s main (click/core.py:1055)
0.00% 19.00% 0.000s 293.8s _run_module_as_main (runpy.py:197)
0.00% 19.00% 0.000s 293.8s invoke (click/core.py:1689)
0.00% 19.00% 0.000s 293.8s _run_code (runpy.py:87)
0.00% 19.00% 0.000s 293.8s run_forever (asyncio/base_events.py:601)
0.00% 19.00% 0.000s 293.8s run_until_complete (asyncio/base_events.py:634)
0.00% 19.00% 0.000s 293.8s process (flatsurvey/__main__.py:160)
0.00% 19.00% 0.000s 293.8s _run_once (asyncio/base_events.py:1905)
0.00% 19.00% 0.020s 293.8s _run (asyncio/events.py:80)
0.00% 19.00% 0.000s 293.0s start (flatsurvey/__main__.py:224)
0.00% 18.00% 0.190s 292.4s command (flatsurvey/surfaces/surface.py:152)
0.00% 18.00% 0.000s 292.4s _render_command (flatsurvey/__main__.py:309)
0.00% 18.00% 0.000s 292.2s __reduce__ (flatsurvey/surfaces/ngons.py:575)
0.00% 18.00% 0.050s 277.7s _lengths (flatsurvey/surfaces/ngons.py:484)
0.00% 0.00% 0.000s 173.3s __init__ (flatsurf/geometry/polygon.py:2072)
0.00% 0.00% 0.010s 173.2s subfield_from_elements (flatsurf/geometry/subfield.py:153) <-- HERE
0.00% 0.00% 173.0s 173.2s <genexpr> (flatsurf/geometry/subfield.py:153)
0.00% 18.00% 0.030s 105.1s NumberField (sage/rings/number_field/number_field.py:569)
0.00% 18.00% 0.000s 102.9s __init__ (flatsurf/geometry/polygon.py:2052)
0.00% 11.00% 0.000s 98.83s _richcmp_ (sage/rings/qqbar.py:5458)
0.00% 11.00% 0.000s 84.91s sign (sage/rings/qqbar.py:5825)
0.00% 11.00% 0.000s 84.90s minpoly (sage/rings/qqbar.py:4496)
11.00% 11.00% 84.77s 84.90s minpoly (sage/rings/qqbar.py:7769)
0.00% 0.00% 2.83s 14.05s cmp_elements_with_same_minpoly (sage/rings/qqbar.py:2863)
0.00% 0.00% 0.040s 13.92s sign (sage/rings/qqbar.py:5828)
0.00% 5.00% 0.420s 12.02s refine_interval (sage/rings/qqbar.py:7047)
0.00% 0.00% 0.000s 10.68s polynomial_root (sage/rings/qqbar.py:1397)
0.00% 0.00% 0.000s 10.59s __init__ (sage/rings/qqbar.py:6883)
I see lots of kernel activity, as if some thrashing were going on. Is minpoly somehow multithreaded and something going terribly wrong here? Maybe: with MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 SAGE_NUM_THREADS=1 set, it still spends a lot of time in minpoly, but it's much faster overall.
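To rule thread oversubscription in or out, the scheduler or worker can be launched with the thread counts pinned. A minimal sketch, assuming the three variables above are the relevant ones; `pinned_env` and `run_module_pinned` are hypothetical helpers, not part of flatsurvey:

```python
import os
import subprocess
import sys

# Variables named in this issue; whether each one actually affects minpoly
# is an assumption, not something verified here.
THREAD_VARS = ("MKL_NUM_THREADS", "OMP_NUM_THREADS", "SAGE_NUM_THREADS")


def pinned_env(threads=1, base=None):
    """Return a copy of the environment with all thread counts pinned."""
    env = dict(os.environ if base is None else base)
    for name in THREAD_VARS:
        env[name] = str(threads)
    return env


def run_module_pinned(module, args, threads=1):
    """Run `python -m module args...` in a child process with pinned threads.

    Returns the exit code of the child process.
    """
    argv = [sys.executable, "-m", module, *args]
    return subprocess.run(argv, env=pinned_env(threads)).returncode
```

For example, `run_module_pinned("flatsurvey.worker", ["ngon", "-a", "9", "-a", "37", "-a", "227", "boshernitzan-conjecture"])` would run the worker single-threaded. Comparing wall time against an unpinned run shows whether the kernel activity goes away.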
At the same time, all the worker does is erase marked points (1, 13, 63):
0.00% 100.00% 0.000s 350.4s resolve (flatsurvey/pipeline/consumer.py:170)
0.00% 100.00% 0.000s 350.4s process (flatsurvey/worker/__main__.py:137)
0.00% 100.00% 0.000s 350.4s runctx (profile.py:62)
0.00% 100.00% 0.000s 350.4s start (flatsurvey/worker/__main__.py:208)
0.00% 100.00% 0.000s 350.4s _run (asyncio/events.py:80)
0.00% 100.00% 0.000s 350.4s run_module (runpy.py:213)
0.00% 100.00% 0.000s 350.4s _run_once (asyncio/base_events.py:1905)
0.00% 100.00% 0.000s 350.4s main (click/core.py:1055)
0.00% 100.00% 0.000s 350.4s _run_module_as_main (runpy.py:197)
0.00% 100.00% 0.000s 350.4s invoke (click/core.py:760)
0.00% 100.00% 0.000s 350.4s consume (flatsurvey/pipeline/consumer.py:130)
0.00% 100.00% 0.000s 350.4s runctx (cProfile.py:100)
0.00% 100.00% 0.000s 350.4s <module> (cProfile.py:190)
0.00% 100.00% 0.000s 350.4s produce (flatsurvey/pipeline/producer.py:84)
0.00% 100.00% 0.000s 350.4s orbit_closure (flatsurvey/surfaces/surface.py:74)
0.00% 100.00% 0.000s 350.4s run_until_complete (asyncio/base_events.py:634)
0.00% 100.00% 0.000s 350.4s _consume (flatsurvey/jobs/flow_decomposition.py:120)
0.00% 100.00% 0.000s 350.4s surface (flatsurvey/surfaces/surface.py:119)
0.00% 100.00% 0.000s 350.4s _process_result (click/core.py:1626)
0.00% 100.00% 0.000s 350.4s _run_code (runpy.py:87)
0.00% 100.00% 0.000s 350.4s runctx (cProfile.py:19)
0.00% 100.00% 0.000s 350.4s _notify_consumers (flatsurvey/pipeline/producer.py:121)
0.00% 100.00% 0.000s 350.4s main (cProfile.py:179)
0.00% 100.00% 0.000s 350.4s run_forever (asyncio/base_events.py:601)
0.00% 100.00% 0.000s 350.4s __call__ (click/core.py:1130)
0.00% 100.00% 0.000s 350.4s run (asyncio/runners.py:44)
0.00% 100.00% 0.000s 350.4s invoke (click/core.py:1689)
0.00% 100.00% 0.000s 350.4s <module> (<string>:1)
0.00% 100.00% 0.000s 350.4s <module> (flatsurvey/worker/__main__.py:217)
0.00% 100.00% 0.000s 350.4s produce (flatsurvey/pipeline/processor.py:95)
0.00% 0.00% 0.000s 337.5s erase_marked_points (flatsurf/geometry/similarity_surface.py:2417) <-- HERE
0.00% 0.00% 0.000s 271.8s 0x7f80c0d3401b (?)
0.00% 0.00% 0.000s 271.8s flatsurf::FlatTriangulation<eantic::renf_elem_class>::eliminateMarkedPoints (libflatsurf.so.7.10.0)
0.00% 0.00% 0.030s 270.8s flatsurf::FlatTriangulation<eantic::renf_elem_class>::operator+ (libflatsurf.so.7.10.0)
0.00% 0.00% 0.160s 182.7s eantic::(anonymous namespace)::binop<&renf_elem_mul, &renf_elem_mul_fmpz, &renf_elem_mul_fmpq> (libeanticxx.so.1.0.2)
0.00% 0.00% 0.050s 182.0s renf_elem_mul (libeantic.so.1.0.2)
0.00% 0.00% 0.000s 175.6s nf_elem_mul_red (mul.c:169)
0.00% 0.00% 0.000s 173.8s nf_elem_mul_red (mul.c:185)
0.00% 0.00% 0.060s 149.6s flatsurf::detail::VectorExact<flatsurf::Vector<eantic::renf_elem_class>, eantic::renf_elem_class>::ccw (libflatsurf.so.7.10.0)
0.00% 0.00% 0.010s 127.1s _nf_elem_mul_red (mul.c:124)
0.00% 0.00% 0.060s 127.1s flatsurf::ImplementationOf<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::check (libflatsurf.so.7.10.0) <-- Since this shows up, maybe we're creating lots of intermediate surfaces.
0.00% 0.00% 0.010s 126.9s _fmpz_poly_divrem_divconquer (divrem_divconquer.c:95)
0.00% 0.00% 0.000s 120.4s __fmpz_poly_divrem_divconquer (divrem_divconquer.c:23)
0.00% 0.00% 0.010s 109.7s __fmpz_poly_divrem_divconquer (divrem_divconquer.c:45)
0.00% 0.00% 0.000s 85.80s flatsurf::FlatTriangulation<eantic::renf_elem_class>::FlatTriangulation (libflatsurf.so.7.10.0)
0.00% 0.00% 0.000s 46.06s _fmpz_poly_divrem_divconquer_recursive (divrem_divconquer_recursive.c:105)
0.00% 0.00% 0.000s 44.82s flatsurf::ImplementationOf<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::flip (libflatsurf.so.7.10.0)
0.00% 0.00% 8.08s 43.23s __tls_get_addr (ld-linux-x86-64.so.2)
0.00% 0.00% 0.060s 42.89s _fmpz_poly_mul_KS (mul_KS.c:107)
0.00% 0.00% 0.000s 42.38s _fmpz_poly_divrem_divconquer_recursive (divrem_divconquer_recursive.c:65)
0.00% 0.00% 0.000s 40.58s flatsurf::FlatTriangulation<eantic::renf_elem_class>::clone (libflatsurf.so.7.10.0)
0.00% 0.00% 0.050s 39.68s _fmpz_poly_bit_unpack (bit_unpack.c:34)
0.00% 0.00% 0.090s 34.13s _fmpz_poly_divrem_divconquer_recursive (divrem_divconquer_recursive.c:79)
0.00% 0.00% 0.000s 33.72s _nf_elem_mul_red (mul.c:96)
0.00% 0.00% 0.010s 33.55s _fmpz_poly_divrem_divconquer_recursive (divrem_divconquer_recursive.c:119)
0.00% 0.00% 15.93s 30.94s update_get_addr (ld-linux-x86-64.so.2)
0.00% 0.00% 0.000s 28.50s _fmpz_promote (fmpz.c:208)
0.00% 0.00% 0.240s 28.50s _fmpz_promote (fmpz.c:213)
0.00% 0.00% 0.280s 27.31s fmpq_poly_evaluate_arb (libeantic.so.1.0.2)
0.00% 0.00% 1.84s 26.59s __gmpz_realloc (libgmp.so.10.4.1)
0.00% 0.00% 0.300s 25.34s _fmpz_poly_evaluate_arb (libeantic.so.1.0.2)
0.00% 0.00% 8.50s 24.75s memory_sage_sig_realloc (sage/ext/memory.cpython-39-x86_64-linux-gnu.so)
0.00% 0.00% 6.23s 24.62s _fmpz_clear_mpz (fmpz.c:161)
0.00% 0.00% 24.17s 24.17s __gmpn_addmul_1_x86_64 (libgmp.so.10.4.1)
0.00% 0.00% 0.140s 23.83s flatsurf::Approximation<eantic::renf_elem_class>::arb (libflatsurf.so.7.10.0)
0.00% 0.00% 0.300s 23.69s exactreal::Arb::Arb (libexactreal.so.6.2.0)
0.00% 0.00% 0.100s 20.91s flatsurf::detail::VectorBase<flatsurf::Vector<eantic::renf_elem_class> >::operator flatsurf::Vector<exactreal::Arb> (libflatsurf.so.7.10.0)
0.00% 0.00% 0.020s 20.88s _fmpz_poly_divrem_divconquer_recursive (divrem_divconquer_recursive.c:32)
0.00% 0.00% 0.020s 20.27s _fmpz_poly_mullow_SS (mullow_SS.c:116)
0.00% 0.00% 0.050s 18.10s fmpz_bit_unpack (bit_unpack.c:82)
0.00% 0.00% 0.750s 17.58s eantic::renf_elem_class::renf_elem_class (libeanticxx.so.1.0.2)
0.00% 0.00% 0.020s 17.46s eantic::operator< (libeanticxx.so.1.0.2)
0.00% 0.00% 0.250s 17.32s renf_elem_cmp (libeantic.so.1.0.2)
0.00% 0.00% 12.70s 16.26s realloc (libc.so.6)
0.00% 0.00% 0.000s 15.44s _fmpz_poly_divrem_basecase (divrem_basecase.c:34)
0.00% 0.00% 3.67s 15.38s _fmpz_new_mpz (fmpz.c:123)
0.00% 0.00% 15.01s 15.01s _dl_update_slotinfo (ld-linux-x86-64.so.2)
0.00% 0.00% 0.430s 14.03s _fmpz_poly_relative_condition_number_2exp (libeantic.so.1.0.2)
0.00% 0.00% 0.130s 13.99s eantic::renf_elem_class::~renf_elem_class (libeanticxx.so.1.0.2)
0.00% 0.00% 0.120s 13.76s renf_elem_set (libeantic.so.1.0.2)
0.00% 0.00% 0.240s 13.68s _fmpz_vec_sub (sub.c:21)
0.00% 0.00% 0.110s 13.41s renf_elem_clear (libeantic.so.1.0.2)
0.00% 90.00% 0.000s 13.12s __Pyx_PyFunction_FastCallDict.constprop.0 (sage/misc/cachefunc.cpython-39-x86_64-linux-gnu.so)
0.00% 90.00% 0.000s 13.12s __call__ (sage/misc/cachefunc.cpython-39-x86_64-linux-gnu.so)
0.00% 90.00% 0.000s 13.12s task_step_impl (_asynciomodule.c:2669)
0.00% 90.00% 0.000s 13.12s task_step (_asynciomodule.c:2969)
0.00% 90.00% 0.000s 13.12s __Pyx_PyFunction_FastCallNoKw (sage/misc/cachefunc.cpython-39-x86_64-linux-gnu.so)
0.00% 90.00% 0.000s 13.12s __libc_start_main_impl (libc.so.6)
0.00% 90.00% 0.000s 13.12s __libc_start_call_main (libc.so.6)
0.00% 0.00% 0.100s 13.05s _fmpz_vec_clear (clear.c:23)
0.00% 0.00% 5.29s 12.86s __gmpn_toom22_mul (libgmp.so.10.4.1)
0.00% 0.00% 1.18s 12.76s __gmpn_mul (libgmp.so.10.4.1)
0.00% 0.00% 0.310s 12.11s fmpz_set (set.c:31)
0.00% 0.00% 0.000s 11.82s flatsurf::ImplementationOf<flatsurf::Chain<flatsurf::FlatTriangulation<eantic::renf_elem_class> > >::ImplementationOf (libflatsurf.so.7.10.0)
0.00% 0.00% 0.000s 11.79s _fmpz_poly_mul_KS (mul_KS.c:95)
0.00% 0.00% 0.020s 11.78s spimpl::details::default_copy<flatsurf::ImplementationOf<flatsurf::SaddleConnection<flatsurf::FlatTriangulation<eantic::renf_elem_class> > > >
0.00% 0.00% 0.000s 11.64s spimpl::details::default_copy<flatsurf::ImplementationOf<flatsurf::Chain<flatsurf::FlatTriangulation<eantic::renf_elem_class> > > > (libflatsur
0.00% 0.00% 0.070s 11.55s _fmpz_vec_scalar_submul_fmpz (scalar_submul_fmpz.c:38)
0.00% 0.00% 0.000s 11.41s flatsurf::Deformation<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::operator* (libflatsurf.so.7.10.0)
0.00% 0.00% 0.200s 11.38s fmpz_clear (fmpz.h:171)
0.00% 0.00% 0.000s 11.30s flatsurf::CompositeDeformationRelation<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::CompositeDeformationRelation (libflatsurf.so.7.1
0.00% 0.00% 0.390s 11.14s spimpl::details::default_copy<flatsurf::ImplementationOf<flatsurf::Path<flatsurf::FlatTriangulation<eantic::renf_elem_class> > > > (libflatsurf
0.00% 0.00% 0.010s 11.14s flatsurf::ShiftDeformationRelation<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::clone (libflatsurf.so.7.10.0)
0.00% 0.00% 0.960s 11.02s fmpq_poly_clear (clear.c:24)
0.00% 0.00% 0.030s 10.92s _fmpz_demote_val (fmpz.c:252)
0.00% 0.00% 0.000s 10.41s flatsurf::CompositeDeformationRelation<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::clone (libflatsurf.so.7.10.0)
6.00% 6.00% 7.75s 10.34s _int_malloc (libc.so.6)
0.00% 0.00% 0.150s 10.24s fmpz_bit_unpack (bit_unpack.c:74)
0.00% 0.00% 0.010s 9.89s flatsurf::QuadraticPolynomial<eantic::renf_elem_class>::positive (libflatsurf.so.7.10.0)
0.00% 0.00% 2.27s 9.66s _fmpz_new_mpz (fmpz.c:68)
0.00% 0.00% 0.580s 9.46s _fmpz_add2_fast (mag.h:82)
0.00% 0.00% 4.67s 9.39s __gmpn_mul_basecase_fat (libgmp.so.10.4.1)
0.00% 0.00% 0.130s 9.10s _fmpz_vec_add (add.c:21)
0.00% 0.00% 0.000s 9.07s spimpl::details::default_copy<flatsurf::ImplementationOf<flatsurf::Vector<eantic::renf_elem_class> > > (libflatsurf.so.7.10.0)
0.00% 5.00% 0.000s 8.19s CPyCppyy::CPPMethod::ExecuteFast (CPPMethod.cxx:182)
0.00% 5.00% 0.000s 8.19s CPyCppyy::CPPMethod::Execute (CPPMethod.cxx:853)
0.00% 0.00% 0.330s 8.10s _fmpz_vec_set (set.c:23)
0.00% 0.00% 0.050s 7.91s renf_refine_embedding (libeantic.so.1.0.2)
0.00% 0.00% 2.47s 7.76s _fmpz_clear_mpz (fmpz.c:167)
0.00% 0.00% 0.070s 7.67s flatsurf::ChainVector<flatsurf::FlatTriangulation<eantic::renf_elem_class>, eantic::renf_elem_class>::ChainVector (libflatsurf.so.7.10.0)
0.00% 0.00% 0.000s 7.64s arb_mul (mul.c:89)
0.00% 5.00% 0.000s 7.64s CPyCppyy::CPPMethod::Call (CPPMethod.cxx:913)
0.00% 0.00% 0.320s 7.48s fmpz_add (add.c:55)
0.00% 0.00% 0.040s 7.48s fmpz_bit_unpack (bit_unpack.c:141)
0.00% 0.00% 0.000s 7.39s spimpl::details::default_delete<flatsurf::ImplementationOf<flatsurf::Deformation<flatsurf::FlatTriangulation<eantic::renf_elem_class> > > > (li
0.00% 0.00% 0.060s 7.39s flatsurf::CompositeDeformationRelation<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::~CompositeDeformationRelation (libflatsurf.so.7.
0.00% 0.00% 0.390s 7.27s spimpl::details::default_delete<flatsurf::ImplementationOf<flatsurf::Path<flatsurf::FlatTriangulation<eantic::renf_elem_class> > > > (libflatsu
0.00% 0.00% 0.050s 7.20s flatsurf::ShiftDeformationRelation<flatsurf::FlatTriangulation<eantic::renf_elem_class> >::~ShiftDeformationRelation (libflatsurf.so.7.10.0)
0.00% 0.00% 0.010s 7.16s spimpl::details::default_delete<flatsurf::ImplementationOf<flatsurf::SaddleConnection<flatsurf::FlatTriangulation<eantic::renf_elem_class> > >
0.00% 0.00% 0.000s 7.09s flatsurf::ImplementationOf<flatsurf::Chain<flatsurf::FlatTriangulation<eantic::renf_elem_class> > >::~ImplementationOf (libflatsurf.so.7.10.0)
0.00% 0.00% 0.000s 7.09s spimpl::details::default_delete<flatsurf::ImplementationOf<flatsurf::Chain<flatsurf::FlatTriangulation<eantic::renf_elem_class> > > > (libflats
0.00% 0.00% 7.08s 7.08s __gmpn_add_n_x86_64 (libgmp.so.10.4.1)
1.00% 1.00% 3.65s 6.88s __gmpz_add (libgmp.so.10.4.1)
0.00% 2.00% 0.000s 6.87s flint_calloc (memory_manager.c:147)
0.00% 0.00% 0.210s 6.75s _fmpz_poly_mul_KS (mul_KS.c:81)
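Since these dumps are ordered by total time, it can help to re-sort the frames by their own time to see where the CPU is actually spent. A rough sketch, assuming the four leading columns are py-spy's %Own, %Total, OwnTime and TotalTime; `parse_frame` and `hottest` are hypothetical helpers, and the sample lines are copied from the worker dump above:

```python
import re

# Assumed layout: %Own %Total OwnTime TotalTime, then the frame description.
LINE = re.compile(
    r"^\s*(?P<own_pct>[\d.]+)%\s+(?P<total_pct>[\d.]+)%\s+"
    r"(?P<own>[\d.]+)s\s+(?P<total>[\d.]+)s\s+(?P<frame>.+?)\s*$"
)


def parse_frame(line):
    """Parse one profile line; return None if it does not match the layout."""
    m = LINE.match(line)
    if m is None:
        return None
    return {"own": float(m["own"]), "total": float(m["total"]), "frame": m["frame"]}


def hottest(lines, n=3):
    """Return the n frames with the largest own time, largest first."""
    frames = [f for f in map(parse_frame, lines) if f is not None]
    return sorted(frames, key=lambda f: f["own"], reverse=True)[:n]


SAMPLE = """\
0.00% 0.00% 0.050s 182.0s renf_elem_mul (libeantic.so.1.0.2)
0.00% 0.00% 24.17s 24.17s __gmpn_addmul_1_x86_64 (libgmp.so.10.4.1)
0.00% 0.00% 15.93s 30.94s update_get_addr (ld-linux-x86-64.so.2)
0.00% 0.00% 12.70s 16.26s realloc (libc.so.6)
""".splitlines()

for f in hottest(SAMPLE):
    print(f"{f['own']:8.2f}s  {f['frame']}")
```

Frames with a large total time but near-zero own time (such as `renf_elem_mul` here) are just callers; the own-time ordering points at the leaf functions doing the work.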
Something weird is going on with the worker: on pascaline it is still fast for this test case, so it seemed to be something on my local machine that makes it slow. (No, it's fast on pascaline because it's looking up the result in a cache. It's slow without caching.)
We would like to go much further in the check of Boshernitzan's conjecture (b). Currently, there are several things that are limiting the runtime here. Here's the output of
time py-spy top --native --rate 10 -- python -m flatsurvey.worker ngon -a 9 -a 37 -a 227 boshernitzan-conjecture-orientations boshernitzan-conjecture
We'll track progress on improving the above in this issue.
Improvements for the worker:
Improvements for the scheduler: