oscar-system / Oscar.jl

A comprehensive open source computer algebra system for computations in algebra, geometry, and number theory.
https://www.oscar-system.org

Oscar seems to spend a lot of time comparing monomial orderings for lookups in dictionaries #2498

Closed. HechtiDerLachs closed this issue 6 months ago.

HechtiDerLachs commented 1 year ago

I had already mentioned this several times, but could never really reproduce the problem. Let me make another attempt here with the following test file:

using ProfileView
X = projective_space(QQ, 3)
E = direct_sum([twisting_sheaf(X, 0), twisting_sheaf(X, -2)])
Y = covered_scheme(projectivization(E))
C = Oscar.simplified_covering(Y)
U = patches(C)
GG = glueings(C)
# trigger the computation once to reduce compilation overhead
underlying_glueing(GG[(U[1],U[end])])
for i in 1:length(U)
  for j in 1:length(U)
    @show i, j
    @profview underlying_glueing(GG[(U[i],U[j])])
  end
end

This produces some (actually a lot of) flame graphs. Most of them exhibit the following problem: for every ideal membership test in polynomial rings, a cached Groebner basis is looked up in a dictionary. The keys of this dictionary are MonomialOrderings, and these are compared using the == operator. To this end, a canonical_matrix is computed for each ordering and the equality check is then performed on these matrices.
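To illustrate why this pattern is costly, here is a minimal sketch with toy types (not Oscar's actual MonomialOrdering): a Dict whose keys have a weak hash and an equality test that rebuilds a "canonical matrix" on every comparison, so each lookup pays the full construction cost again.

```julia
# Toy illustration (not Oscar's actual types): a stand-in "ordering" whose
# equality test rebuilds a canonical matrix on every comparison.
mutable struct ToyOrdering
    perm::Vector{Int}
end

const CANON_CALLS = Ref(0)  # counts how often the expensive step runs

function toy_canonical_matrix(o::ToyOrdering)
    CANON_CALLS[] += 1
    n = length(o.perm)
    return [Int(i == o.perm[j]) for i in 1:n, j in 1:n]
end

Base.:(==)(a::ToyOrdering, b::ToyOrdering) =
    toy_canonical_matrix(a) == toy_canonical_matrix(b)
# A deliberately weak hash: equal-length orderings land in the same bucket,
# so every Dict lookup falls through to the expensive ==.
Base.hash(o::ToyOrdering, h::UInt) = hash(length(o.perm), h)

gb_cache = Dict{ToyOrdering,Symbol}()
gb_cache[ToyOrdering([2, 1, 3])] = :cached_gb
for _ in 1:100
    get(gb_cache, ToyOrdering([2, 1, 3]), nothing)
end
# Each lookup recomputed two canonical matrices (probe key and stored key).
```

The same structure, with the actual canonical_matrix computation as the expensive step, is what shows up in the flame graphs.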

For some of my samples this uses up more than 50% of the actual computing time. Since I don't know how to send the interactive flame graphs, I have made a series of screenshots to illustrate what I mean. I will send this as a .zip on the oscar slack.

HechtiDerLachs commented 1 year ago

I tried a dirty hack by modifying the ideal_membership routine in src/Rings/mpoly-ideals.jl to the following:

function ideal_membership(f::T, I::MPolyIdeal{T}; ordering::MonomialOrdering = default_ordering(base_ring(I))) where T
  # If any Groebner basis is already cached, use it regardless of its
  # ordering: this skips the dictionary lookup and hence the expensive
  # comparison of MonomialOrderings.
  if !isempty(keys(I.gb))
    GI = first(values(I.gb))
    singular_assure(GI)
    Sx = base_ring(GI.S)
    return Singular.iszero(Singular.reduce(Sx(f), GI.S))
  end
  # Otherwise compute a standard basis for the requested ordering.
  GI = standard_basis(I, ordering=ordering, complete_reduction=false)
  singular_assure(GI)
  Sx = base_ring(GI.S)
  return Singular.iszero(Singular.reduce(Sx(f), GI.S))
end
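The shortcut in the hack above can be sketched generically (toy cache, not Oscar's API). It is sound for ideal membership because f lies in I if and only if f reduces to zero with respect to any Groebner basis of I, regardless of the ordering it was computed for.

```julia
# Generic sketch of the "any cached value will do" shortcut: prefer a value
# cached under the requested key, otherwise fall back to any cached value.
function cached_or_any(cache::Dict{K,V}, wanted::K) where {K,V}
    haskey(cache, wanted) && return cache[wanted]  # exact hit: pays one key comparison
    isempty(cache) && return nothing               # nothing cached: caller must compute
    return first(values(cache))                    # any cached basis is good enough
end
```

Note that this trades precision for speed: the returned basis may belong to a different ordering than the one requested, which is only acceptable for ordering-independent queries like membership.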

With the modified script from above:
X = projective_space(QQ, 3)
E = direct_sum([twisting_sheaf(X, 0), twisting_sheaf(X, -2)])
Y = covered_scheme(projectivization(E))
C = Oscar.simplified_covering(Y)
U = patches(C)
GG = glueings(C)
underlying_glueing(GG[(U[1],U[end])])
@time begin
for i in 1:length(U)
  for j in 1:length(U)
    #@show i, j
    #@profview underlying_glueing(GG[(U[i],U[j])])
    underlying_glueing(GG[(U[i],U[j])])
  end
end
end

I get the following timings:

julia> include("/tmp/test.jl")
  1.622100 seconds (5.68 M allocations: 271.927 MiB, 10.56% gc time, 13.77% compilation time)

julia> include("/tmp/test.jl")
  1.690695 seconds (5.56 M allocations: 264.566 MiB, 14.82% gc time)

Compare this to the same script without the hack:

julia> include("/tmp/test.jl")
  5.883227 seconds (20.68 M allocations: 731.047 MiB, 12.67% gc time)

julia> include("/tmp/test.jl")
  4.881775 seconds (20.66 M allocations: 730.684 MiB, 8.76% gc time)

julia> include("/tmp/test.jl")
  7.201743 seconds (20.70 M allocations: 731.302 MiB, 13.69% gc time)

fieker commented 1 year ago

On Thu, Jun 29, 2023 at 02:02:47AM -0700, Matthias Zach wrote:


Didn't we discuss this some short time ago? The solution is to cache the matrices somewhere - which I thought was the resolution we came to the last time?



HechtiDerLachs commented 1 year ago

Yes, I also think this has been discussed somewhere. But it seems it has not been resolved yet. Sorry if this turns out to be a duplicate! The benchmarks might nevertheless be helpful.

HechtiDerLachs commented 1 year ago

[Screenshot from 2023-06-29 10-40-13] This is one screenshot. As I said, these frames make up more than 50% of the tips of the graph.

lgoettgens commented 1 year ago

Just caching the canonical_matrix should be fairly easy to do, and it has nearly the same effect as @HechtiDerLachs's change. With the example script from above, the timings go from

julia> @time include("/tmp/test.jl")
  3.443020 seconds (23.87 M allocations: 840.954 MiB, 22.27% gc time)

julia> @time include("/tmp/test.jl")
  4.127304 seconds (23.87 M allocations: 840.932 MiB, 22.27% gc time)

to

julia> @time include("/tmp/test.jl")
  1.502957 seconds (9.74 M allocations: 410.034 MiB, 20.06% gc time, 0.99% compilation time)

julia> @time include("/tmp/test.jl")
  1.486317 seconds (9.74 M allocations: 410.014 MiB, 18.78% gc time, 0.65% compilation time)

on my machine.
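The caching idea can be sketched as follows (toy types, not Oscar's internals): store the canonical matrix on the ordering object the first time it is built, so repeated Dict lookups compare cached matrices instead of rebuilding them.

```julia
# Sketch of lazily caching the canonical matrix on the ordering object.
mutable struct CachedOrdering
    perm::Vector{Int}
    canonical::Union{Matrix{Int},Nothing}  # lazily filled cache slot
end
CachedOrdering(perm::Vector{Int}) = CachedOrdering(perm, nothing)

function cached_canonical_matrix(o::CachedOrdering)
    if o.canonical === nothing
        n = length(o.perm)
        # the expensive construction now runs at most once per object
        o.canonical = [Int(i == o.perm[j]) for i in 1:n, j in 1:n]
    end
    return o.canonical
end

# == and hash stay consistent: both go through the cached matrix.
Base.:(==)(a::CachedOrdering, b::CachedOrdering) =
    cached_canonical_matrix(a) == cached_canonical_matrix(b)
Base.hash(o::CachedOrdering, h::UInt) = hash(cached_canonical_matrix(o), h)
```

With this, the per-lookup cost drops to a matrix comparison, and the construction cost is paid once per ordering object rather than once per comparison.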

fingolfin commented 1 year ago

The discussion was in #2099, which was unfortunately closed because the initial issue it was about was resolved (I believe? I didn't verify). But the additional observations and suggestions for how to improve things were not addressed.

Of course it is not enough to suggest a way to fix things, we also need someone to do it. Since this affects the commutative algebra code, the natural people to work on it from my POV would be you, @HechtiDerLachs or @hannes14 or @ederc or @JohnAAbbott ... We could discuss it on our next Wednesday meeting (if you can't come physically, I'd be happy to try and use a camera+projector to add you in remotely, @HechtiDerLachs ).

CC @wdecker @afkafkafk13

HechtiDerLachs commented 1 year ago

Thanks for reminding us where this discussion already took place. Personally, I was unaware of #2133.

fingolfin commented 1 year ago

So there is a meeting by the geometry group tomorrow (Thursday) where Janko will bring this up to figure out who should work on this.

jankoboehm commented 1 year ago

As per the discussion at the Singular coding sprint, this should be resolved to a sufficient level by PR #2535 and PR #2499.

HechtiDerLachs commented 7 months ago

I have experienced some regression on this issue today, but I don't yet have a minimal example presentable here. I will probably reopen this issue once I have one.

fingolfin commented 6 months ago

As far as I know, the original issue is resolved, and so is the newer one @HechtiDerLachs mentioned. But even if not: please open a new issue.

Closing this one for now.