JanKallman / EPPlus

Create advanced Excel spreadsheets using .NET
3.76k stars 1.18k forks source link

Asynchronous methods #283

Open virzak opened 6 years ago

virzak commented 6 years ago

Any plans to introduce async methods to ExcelPackage such as:

public async Task SaveAsync();

Thanks

swmal commented 6 years ago

Introduced async methods for Calculate, Save, SaveAs and Load.

virzak commented 6 years ago

@swmal

Synchronous calls shouldn't be wrapped with Task.Run for Async implementation. Instead, they should take CancellationToken as optional parameter and pass it down to .net Async methods.

Please reopen the issue.

swmal commented 6 years ago

I understand what you mean. But still feel a bit uncertain what to do with this since there are - as far as I can see - no calls to .net async methods other than reading/writing individual bytes from/to streams in the Load/Save/SaveAs methods. And nothing in the Calculate methods. Of course you could interrupt the execution of the methods with the cancellation token, but that would be a much more extensive functionality to implement and would also leave the workbook in a broken state. I figured that the main purpose here was to allow the caller to do other things in parallell while these functions are executed.

What do you suggest?

virzak commented 6 years ago

If there is nothing in the Calculate method, it should not have an async version. For any I/O read/write you need to almost rewrite the Load/Save/SaveAs.

One example is File.ReadAllBytes(Async)

Synchronous: https://github.com/dotnet/corefx/blob/a59c79f20294c7e6d71308f0314060ca7b8da7be/src/System.IO.FileSystem/src/System/IO/File.cs#L327

Asynchronous: https://github.com/dotnet/corefx/blob/a59c79f20294c7e6d71308f0314060ca7b8da7be/src/System.IO.FileSystem/src/System/IO/File.cs#L756

Here is an article with implementation example: https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/async/using-async-for-file-access

Hopefully this provides some direction.

swmal commented 6 years ago

Thanks. I discussed this with @JanKallman yesterday and we decided to remove the methods I added. We might look deeper this in the future. Will keep it open as a feature request for now.

virzak commented 6 years ago

@swmal , that makes sense. Thanks.

twest820 commented 6 years ago

If this gets a deeper look, I'd like to suggest consideration of exposing operation progress. For example, both initial access to ExcelPackage.Workbook.Worksheets and calls to ExcelPackage.Save() are potentially long running operations, particularly with worksheets having tens to hundreds of thousands of rows. In its current form, callers of EPPlus have no straightforward way of providing user feedback (such as a progress bar) as these operations progress, which makes for a somewhat awkward user experience.

In some cases it's possible to mitigate this by collecting timing data from previous operations on a given machine, estimating the duration of the current operation, and inferring the amount of progress to show at a given status update. Such approaches are, however, prone to fragility.

virzak commented 6 years ago

Totally agree with @twest820 . Good progress reporting mechanism is IProgress. I suggest that IProgress or IProgress is used that runs from 0 to 1. This is to accommodate parent tasks using the reporting of this library to calculate overall progress.

twest820 commented 6 years ago

We're IProgress based, though the nature of C# generics potentially makes use of the interface complex as callers may need to derive whatever base T might be provided with EPPlus. My current progress T for EPPlus calls is multistage and therefore has flags indicating which state on the T to report to the user. This aligns more with a HasA relationship with any progress state EPPlus might provide rather than IsA.

If feasible, it would be helpful to have the progress range run from 1 to the number of rows in the worksheet. From the standpoint of updating percentage on a progress bar there's no significant difference between this and indicating progress on [0, 1]. However, if the user interface can include a message such as "Reading row {n} of {total}..." this helps set user expectations as to the amount of work being done. For example, "Reading row 200 of 10.000..." and "Reading row 20.000 of 1.000.000..." are both 2% complete. But one operation is going to take rather longer than the other.

virzak commented 5 years ago

Any news on this item?

onionhammer commented 5 years ago

This package officially wont work with an ASPNET Core 3.0 application by default;

"Synchronous operations are disallowed. Call WriteAsync or set AllowSynchronousIO to true instead"