sympy / sympy_benchmarks

Some benchmarks of SymPy

Add benchmarks for subresultants PRS method #94

Closed 1e9abhi1e10 closed 1 year ago

1e9abhi1e10 commented 1 year ago

This PR adds the benchmarks for the subresultants PRS method. Related Issue: https://github.com/sympy/sympy_benchmarks/issues/88

See https://github.com/sympy/sympy/pull/25371
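For context, a minimal illustration (not the benchmark code itself) of the operation being benchmarked, using SymPy's public `subresultants` function:

```python
from sympy import subresultants, symbols

x = symbols('x')
f = x**4 + x**2 + 1
g = x**3 - x

# The subresultant PRS is built by repeated pseudo-remainders with
# coefficient corrections to control coefficient growth.
prs = subresultants(f, g, x)

# The first two entries of the sequence are f and g themselves.
assert (prs[0] - f).expand() == 0
assert (prs[1] - g).expand() == 0
```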

1e9abhi1e10 commented 1 year ago

The benchmarks for subresultants are failing due to the teardown function. I am working on it!

1e9abhi1e10 commented 1 year ago

This is what the current output looks like:

[ 42.32%] ··· ...UBRESULTANTS_LinearDenseQuadraticGCD.time_op                 ok
[ 42.32%] ··· ====== ============= ============= =============
              --                        impl                  
              ------ -----------------------------------------
               size       expr         dense         sparse   
              ====== ============= ============= =============
                1     2.07±0.01ms     541±2μs       1.42±0ms  
                2     4.62±0.01ms   3.17±0.01ms     7.01±0ms  
                3       9.93±0ms    18.1±0.04ms   31.2±0.04ms 
              ====== ============= ============= =============

[ 42.48%] ··· ...meSUBRESULTANTS_QuadraticNonMonicGCD.time_op                 ok
[ 42.48%] ··· ====== ============= ============= =============
              --                        impl                  
              ------ -----------------------------------------
               size       expr         dense         sparse   
              ====== ============= ============= =============
                1     1.24±0.01ms     1.21±0ms    7.82±0.01ms 
                2       3.78±0ms    7.28±0.01ms   18.0±0.04ms 
                3     18.8±0.05ms    157±0.2ms     248±0.3ms  
              ====== ============= ============= =============

[ 42.65%] ··· ...imeSUBRESULTANTS_SparseGCDHighDegree.time_op                 ok
[ 42.65%] ··· ====== ========== ============= =============
              --                      impl                 
              ------ --------------------------------------
               size     expr        dense         sparse   
              ====== ========== ============= =============
                1     970±4μs     253±0.4μs     497±0.9μs  
                2     1.32±0ms     1.42±0ms    2.03±0.01ms 
                3     1.84±0ms   6.59±0.02ms   7.81±0.01ms 
                5     3.20±0ms   30.8±0.02ms    34.1±0.1ms 
              ====== ========== ============= =============

[ 42.81%] ··· ...UBRESULTANTS_SparseNonMonicQuadratic.time_op                 ok
[ 42.81%] ··· ====== ========= ============= =============
              --                      impl                
              ------ -------------------------------------
               size     expr       dense         sparse   
              ====== ========= ============= =============
                1     632±1μs     255±1μs      673±0.6μs  
                2     701±2μs    598±0.9μs      1.11±0ms  
                3     759±1μs   7.33±0.02ms   8.05±0.02ms 
                5     885±2μs   20.3±0.05ms   21.2±0.02ms 
              ====== ========= ============= =============
oscarbenjamin commented 1 year ago

I think that this looks good but I think that we should reduce the total number of benchmarks that are reported for both PREM and subresultants.

You can see what it currently looks like when benchmark results are reported to a PR here: https://github.com/sympy/sympy/pull/25371#issuecomment-1636901713

This PR would double that. Now that we can see these examples and how the benchmarks compare the question is: which of these different benchmarks is potentially reporting something interesting?

For GCD probably all cases are interesting (they were designed for GCD) but for PREM it does not seem that e.g. the TimePREM_SparseNonMonicQuadratic timings demonstrate something that is particularly informative compared to what the TimePREM_SparseGCDHighDegree timings show.

oscarbenjamin commented 1 year ago

As an aside the timings here comparing expr, dense and sparse demonstrate a common problem with the sympy benchmark suite. The expr timings suggest that it is faster than both dense and sparse in most cases (up to 20x faster for larger problems). This is almost certainly not true though: under the hood the expr version of e.g. prem uses either the dense or sparse implementation so it is necessarily slower.

The reason for this is that many SymPy operations (with expr) use a cache and the benchmarks report timings from repeating an operation many times while the cache is in operation. It means that the timings reported for expr are often only really representative of the performance of the cache rather than the actual time that it takes to compute something that is not already cached.
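A rough sketch (not part of the benchmark suite) of the cache effect described above, using `sympy.core.cache.clear_cache`; whether and how much the warm call benefits depends on which cached sub-operations `prem` hits internally:

```python
from timeit import default_timer as timer

from sympy import prem, symbols
from sympy.core.cache import clear_cache

x, y = symbols('x y')
f = (x + y + 1)**8
g = (x - y + 2)**5

clear_cache()
t0 = timer()
r_cold = prem(f, g, x)  # cold call: the work is actually done
cold = timer() - t0

t0 = timer()
r_warm = prem(f, g, x)  # warm call: parts may be served from the cache
warm = timer() - t0

# Both calls give the same result; a benchmark that loops over prem
# without clearing the cache mostly measures the warm path.
assert r_cold == r_warm
```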

1e9abhi1e10 commented 1 year ago

I think that this looks good but I think that we should reduce the total number of benchmarks that are reported for both PREM and subresultants.

subresultants already uses prem internally, so I don't think that we should keep the prem benchmarks. It's good for us to keep the benchmarks for subresultants and gcd.

1e9abhi1e10 commented 1 year ago

As an aside the timings here comparing expr, dense and sparse demonstrate a common problem with the sympy benchmark suite. The expr timings suggest that it is faster than both dense and sparse in most cases (up to 20x faster for larger problems). This is almost certainly not true though: under the hood the expr version of e.g. prem uses either the dense or sparse implementation so it is necessarily slower.

The reason for this is that many SymPy operations (with expr) use a cache and the benchmarks report timings from repeating an operation many times while the cache is in operation. It means that the timings reported for expr are often only really representative of the performance of the cache rather than the actual time that it takes to compute something that is not already cached.

Should we remove expr, or is there some method by which we can measure the actual timing of expr?

oscarbenjamin commented 1 year ago

Should we remove expr, or is there some method by which we can measure the actual timing of expr?

No, we can leave it there. We just need to recognise that those benchmarks are not always representative of any actual improvement or regression in performance.

oscarbenjamin commented 1 year ago

subresultants already uses prem internally, so I don't think that we should keep the prem benchmarks. It's good for us to keep the benchmarks for subresultants and gcd.

I would like to keep something for PREM because, for example, perhaps in the future it would not be used by subresultants. I just don't think we need so many benchmarks for PREM. Also, if the different cases are not particularly interesting for revealing different aspects of the performance of subresultants, then we don't need so many cases for that either.

1e9abhi1e10 commented 1 year ago

subresultants already uses prem internally, so I don't think that we should keep the prem benchmarks. It's good for us to keep the benchmarks for subresultants and gcd.

I would like to keep something for PREM because, for example, perhaps in the future it would not be used by subresultants. I just don't think we need so many benchmarks for PREM. Also, if the different cases are not particularly interesting for revealing different aspects of the performance of subresultants, then we don't need so many cases for that either.

Should we keep TimePREM_LinearDenseQuadraticGCD and TimePREM_QuadraticNonMonicGCD for PREM, and TimeSUBRESULTANTS_LinearDenseQuadraticGCD, TimeSUBRESULTANTS_QuadraticNonMonicGCD and TimeSUBRESULTANTS_SparseNonMonicQuadratic for subresultants?

oscarbenjamin commented 1 year ago

Should we keep TimePREM_LinearDenseQuadraticGCD and TimePREM_QuadraticNonMonicGCD for PREM, and TimeSUBRESULTANTS_LinearDenseQuadraticGCD, TimeSUBRESULTANTS_QuadraticNonMonicGCD and TimeSUBRESULTANTS_SparseNonMonicQuadratic for subresultants?

Why in particular those benchmarks?

Or is that just a random choice?

Do TimePREM_LinearDenseQuadraticGCD and TimePREM_QuadraticNonMonicGCD demonstrate different aspects of the performance of prem somehow?

Are the other cases less relevant than those two?

1e9abhi1e10 commented 1 year ago

For TimePREM_SparseGCDHighDegree the polynomials have a high degree and a sparse GCD. When the degree is high, sparsity might make the pseudo-remainder method less efficient compared to other approaches such as modular methods.

1e9abhi1e10 commented 1 year ago

We can also reduce the number of values for the size parameter n, e.g. params = [(1, 3, 5, 8), ('expr', 'dense', 'sparse')]; maybe two values would be enough.
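As a hypothetical sketch of what a trimmed asv benchmark class could look like (the class name, polynomials, and single-impl grid are illustrative, not the PR's actual code):

```python
from sympy import subresultants, symbols

class TimeSUBRESULTANTS_Example:
    """Illustrative trimmed benchmark, not the suite's real class."""

    # Two sizes instead of four keeps the reported table small while
    # still showing how the timings scale with the problem size.
    params = [(1, 3), ('expr',)]
    param_names = ['size', 'impl']

    def setup(self, size, impl):
        x, y = symbols('x y')
        self.x = x
        d = (x + y + 1)**size        # shared factor acting as the GCD
        self.f = d*(x - 2)
        self.g = d*(x + 2)

    def time_op(self, size, impl):
        subresultants(self.f, self.g, self.x)
```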

oscarbenjamin commented 1 year ago

For TimePREM_SparseGCDHighDegree the polynomials have a high degree and a sparse GCD. When the degree is high, sparsity might make the pseudo-remainder method less efficient compared to other approaches such as modular methods.

Okay, well that seems like a good reason. Maybe some docstrings should be added to the benchmark cases explaining why they are chosen. Also it would be good to add docstrings to the example types like _LinearDenseQuadraticGCD so that we can have some idea, when looking at the code, what the polynomials they generate will look like.

I put some docstrings in the suggested code at https://github.com/sympy/sympy_benchmarks/pull/89#issuecomment-1632857681 as an illustration of how that might look. Perhaps it would just be good to pick a particular n and show what f, g and d would be for that case explaining how n relates to the generated polynomials.
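A hypothetical sketch of such a docstring, on a generator whose name and polynomials are illustrative only (the suite's real _LinearDenseQuadraticGCD may construct different polynomials):

```python
from sympy import symbols

def _linear_dense_quadratic_gcd(n):
    """Illustrative generator, not the suite's actual code.

    Returns (f, g, d) where d is a dense quadratic GCD of f and g in
    n + 1 variables x, y1, ..., yn.  For n = 2:

        d = (1 + x + y1 + y2)**2
        f = d*(-2 + x - y1 - y2)
        g = d*(2 + x + y1 + y2)

    so n controls the number of y variables and hence the density.
    """
    x, *ys = symbols(f'x y1:{n + 1}')
    d = (1 + x + sum(ys))**2
    f = d*(-2 + x - sum(ys))
    g = d*(2 + x + sum(ys))
    return f, g, d
```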

1e9abhi1e10 commented 1 year ago

@oscarbenjamin Please have a look!

oscarbenjamin commented 1 year ago

Okay, looks good.

1e9abhi1e10 commented 1 year ago

thanks!