wo80 / CSparse.NET

A concise library for solving sparse linear systems with direct methods.
GNU Lesser General Public License v2.1
57 stars 25 forks source link

added progress event #3

Closed epsi1on closed 7 years ago

epsi1on commented 8 years ago

An event added for increment of progress percent. This should not impact performance as event is fired only 100 times.

wo80 commented 8 years ago

Hi Ehsan, adding a progress event is alright. I will have a look at the other factorization classes, once this PR is merged. But you have to be careful when reporting progress:

private static void Test()
{
    int[] values = { 50, 72, 100, 101, 777, 2100, 2107, 5000000, 11119999, 211119999 };

    Console.WriteLine("        Size     Calls 1      Time 1     Calls 2      Time 2");

    foreach (var item in values)
    {
        Progress(item);
    }
}

private static void Progress(int length)
{
    const int TOTAL = 100;

    var s = new System.Diagnostics.Stopwatch();
    int last = 0, j = 0;

    Console.Write("{0,12}", length);

    s.Start();
    for (int i = 0; i < length; i++)
    {
        // WARNING: the multiplication can overflow.
        if ((TOTAL * i) / length != last)
        {
            j++;
            last = (TOTAL * i) / length;
            // ProgressChanged(last, null);
        }
    }
    s.Stop();

    Console.Write("{0,12}", j);
    Console.Write("{0,10}ms", s.ElapsedMilliseconds);

    j = 0;

    // Better way to report progress:
    double step = length / (double)TOTAL;
    double current = 0.0;

    s.Restart();
    for (int i = 0; i < length; i++)
    {
        if (i > current)
        {
            j++;
            current += step;
            // ProgressChanged((int)((current / step) + 0.5), null);
        }
    }
    s.Stop();

    Console.Write("{0,12}", j);
    Console.Write("{0,10}ms", s.ElapsedMilliseconds);
    Console.WriteLine();
}

Please revise your code to use the second method (starting at // Better way to report progress)

epsi1on commented 8 years ago

Hi, Very nice tip. What about using long (int64) instead of int (int32). if we use double, total arithmetic operations would be 3*N times, also some castings. But if we use long, no overflow happens and maybe same performance with double version, but also more clear and brief. I'll edit the code to use long instead of int for the value that can overflow.

epsi1on commented 8 years ago

Hello again, Can you please have a look at this code which uses long instead of int to prevent overflow:

at the line 9 you can change 100L to 100 to see overflow happens


        private static void Progress(int length)
        {
            var sp = System.Diagnostics.Stopwatch.StartNew();

            var percent = 0;

            var step = 1;

            var hundred = 100L;//convert to 100 without L prefix to cause the overflow

            try
            {
                for (var i = 0; i <= length; i += step)
                {
                    if (checked(percent != hundred * i / length))//checked will throw exception if any overflow occurs - only to be sure no overflow occurs
                    {
                        {//clears current line of console - just ignore it :)
                            int currentLineCursor = Console.CursorTop;
                            Console.SetCursorPosition(0, Console.CursorTop);
                            Console.Write(new string(' ', Console.WindowWidth));
                            Console.SetCursorPosition(0, currentLineCursor);
                        }//end console line clear

                        Console.Write(percent = (int)(hundred * i / length));
                        Console.Write("%");
                    }
                }

                sp.Stop();

                Console.WriteLine(", Done: {0}, {1} ms, no overflow occurs", length, sp.ElapsedMilliseconds);
            }
            catch (Exception ex)
            {
                Console.WriteLine(", Failed: {0}, overflow occurs", length);
            }
        }
wo80 commented 8 years ago

For me, the code using double is at least 10x faster (or even 20x when using long). This is because, in the if-check there's only one boolean operation, compared to one boolean and 2 arithmetic operations (and possibly some implicit casts when using long) in your code. Anything that's inside the if-statement doesn't matter, since it's not called very often.

I agree that your code is more clear and brief, but I'll vote for performance!

I've written a short tip on Codeproject: http://www.codeproject.com/Tips/1130088/Reporting-progress-the-right-way

epsi1on commented 8 years ago

mmm, you are right on performance in our example. can you please confirm that this is your suggested method:

        private static void Progress_Double(int length)
        {
            var sp = System.Diagnostics.Stopwatch.StartNew();

            const int Total = 100;

            double step = length/(double) Total;

            double current = 0.0;

            try
            {
                for (var i = 0; i <= length; i++)
                {
                    if (i > current)
                    {
                        current += step;

                        var percent = (int) ((current/step) + 0.5);

                        ClearCurrentConsoleLine();
                        Console.Write(percent);
                        Console.Write("%");
                    }
                }

                sp.Stop();

                Console.WriteLine(", Done: {0}, {1} ms, no overflow occurs", length, sp.ElapsedMilliseconds);
            }
            catch (Exception ex)
            {
                Console.WriteLine(", Failed: {0}, overflow occurs", length);
            }
        }

Also i think performance difference of our code is slower arthimetic operations of long against double in our CPUs. If you add a long division to the code, that happens a 1e9 times, it will cause different from 10/20x to 3x.

Thanks

wo80 commented 8 years ago

Yes. In the end, I think that you won't see a big difference anyways, since the factorization will dominate the runtime.

Once the changes are merged, are you planning to use the Nuget package for BriefFiniteElement.NET?

epsi1on commented 8 years ago

I've made an hybrid solution (regarding both my own solution and your solution) here: https://gist.github.com/epsi1on/35bcdc3c72ac1f0a574ef5051a7cb5e9

We can call it C&E or E&C (Christian And Ehsan) method for processing the progress increment :) That's speed is near your solution and also does covers some misses. can you please have a look? About nuget, not for now but i plan to use CSParse.NET nuget package. Maybe several weeks from now...

Thanks

wo80 commented 8 years ago

Yes, I like the hybrid method.

I was thinking about changing the way the factorizations are created. So there wouldn't be a ProgressChanged event, but the user could supply a custom progress monitor. What do you think?

public class SparseCholesky
{
    public static void SparseCholesky Create(CompressedColumnStorage<double> A, ColumnOrdering order)
    {
    }

    public static void SparseCholesky Create(CompressedColumnStorage<double> A, ColumnOrdering order,
        IProgressMonitor monitor)
    {
    }

    // Make constructor private.
    private SparseCholesky(CompressedColumnStorage<double> A, ColumnOrdering order)
    {
    }
}

public interface IProgressMonitor
{
    void Report(int percent);
    // void ThrowIfCancellationRequested();
}

If .NET 4.5 is used, the IProgressMonitor could simply be a wrapper around Progress<int>. It might also support cancellation. And in the factorization method, instead of

if (ProgressChanged != null)
{
   ProgressChanged(this, new ProgressChangedEventArgs(percent, null));
}

you would just write

if (monitor != null)
{
   monitor.Report(percent);
}
epsi1on commented 8 years ago

I think it is a good pattern. How cancellation is intended to be?

wo80 commented 8 years ago

I think it's best to use a CancellationToken. But that's up to the user's implementation. The library will only provide the interface method, so in the factorization method we'd have

if (monitor != null)
{
   monitor.Report(percent);
   monitor.ThrowIfCancellationRequested();
}

Do you still plan to update the PR or should I commit my changes and we go on from there?

EDIT: actually, I don't think that cancellation should be part of the interface. If the user wants to support cancellation, the Report method could be (mis)used for this. Like

public class Monitor : IProgressMonitor
{
    IProgress<int> progress; // requires .NET 4.5
    CancellationToken ct;

    public Monitor(IProgress<int> progress, CancellationToken ct)
    {
        this.progress = progress;
        this.ct = ct;
    }

    public void Report(int percent)
    {
        ct.ThrowIfCancellationRequested();
        progress.Report(percent);
    }
}
epsi1on commented 8 years ago

I was thinking maybe cancellation token and progress monitor should pass to method instead of being a class-wide field?

wo80 commented 8 years ago

I've committed my latest changes:

We can keep this pull request open for discussion. I'll test the progress reporting when I find some time.

epsi1on commented 8 years ago

I thought we are still in discussion about how to add this feature! You abandoned our E&C method?! I spent time on it :)

wo80 commented 8 years ago

No, E&C is definitly not abandoned! :)

But if you look at the IProgress interface, I decided to take a different approach. The method

void Report(double value);

shall always report values between 0.0 and 1.0, a value of 1.0 meaning that the factorization finished (100%). This way, we are free to change the actual implementation at any time (for example, if we want to report more than 100 progress events), without changing the interface semantics.

wo80 commented 7 years ago

Here's how to use the progress reporting:

using CSparse;
using CSparse.Double;
using CSparse.Double.Factorization;
using CSparse.IO;
using CSparse.Storage;
using System;

static class TestCholeskyProgress
{
    public static void Run()
    {
        Console.WriteLine("Loading matrix ...");

        // http://www.cise.ufl.edu/research/sparse/matrices/Boeing/bcsstk36.html
        var A = MatrixMarketReader.ReadMatrix<double>(@"bcsstk36.mtx");

        Console.WriteLine("Starting test ...");

        if (Test(A)) Console.WriteLine("OK");
    }

    public static bool Test(CompressedColumnStorage<double> A)
    {
        if (A.RowCount != A.ColumnCount) return false;

        // Create test data.
        var x = Vector.Create(A.ColumnCount, 1.0);
        var b = new double[A.RowCount];

        // Compute right hand side vector b.
        A.Multiply(1.0, x, 0.0, b);

        // Apply column ordering to A to reduce fill-in.
        var order = ColumnOrdering.MinimumDegreeAtPlusA;

        // Create Cholesky factorization and report progress.
        var chol = SparseCholesky.Create(A, order, new Progress(
            (p) => Console.Write("\r{0}%   ", (int)(100 * p))
        ));

        // !!! Should we make sure that 1.0 (100%) always gets
        //     reported from the factorization?
        Console.Write("\r{0}%   ", 100);

        // Solve Ax = b (overwrite x).
        chol.Solve(b, x);

        return true;
    }

    /// <summary>
    /// Reports progress using a delegate.
    /// </summary>
    class Progress : IProgress
    {
        Action<double> progress;

        public Progress(Action<double> progress)
        {
            this.progress = progress;
        }

        public void Report(double value)
        {
            progress(value);
        }
    }
}

When doing the factorization from a UI and on a separate thread, you'll have to take care of the synchronization context.

epsi1on commented 7 years ago

OK, so lets close the pull request...