dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License
35.06k stars 9.9k forks source link

Blazor component state is not preserved when handling errors #38757

Closed bachratyg closed 2 years ago

bachratyg commented 2 years ago

Describe the bug

Component state is lost when handling an error via ErrorBoundaryBase even though the error is recoverable

To Reproduce

Based on the default Blazor WebAssembly App template add/update/replace the following:

GlobalErrorHandler.razor

@inherits ErrorBoundaryBase
@ChildContent
@code{
    protected override Task OnErrorAsync(Exception exception)
    {
        // Open snackbar/etc
        this.Recover();
        return Task.CompletedTask;
    }
}

App.razor

<GlobalErrorHandler>
    <Router><!--omitted for brevity--></Router>
</GlobalErrorHandler>

Index.razor

@page "/"
@using System.ComponentModel.DataAnnotations
<EditForm Model="@Model" OnValidSubmit="@Submit">
    <DataAnnotationsValidator />
    <InputText @bind-Value="Model.UserName" />
    <InputText @bind-Value="Model.Password" />
    <button type="submit">Submit</button>
</EditForm>
@code {
    public LoginModel Model { get; } = new();
    public class LoginModel
    {
        [Required] public string UserName { get; set; } = default!;
        [Required] public string Password { get; set; } = default!;
    }

    public Task Submit(EditContext context)
    {
        throw new Exception("Some failed http call");
    }
}

Type some values, press submit, observe that all typed values are lost: the component is reset. This means everything has to be fetched from the backend or retyped by the user even if the cause was just a typo in the username. The alternative would be to wrap every such callsite in a try-catch throughout the entire application which is ridiculously inconvenient.

Exceptions (if any)

Only the one thrown on purpose

Further technical details

dotnet --info Output ``` .NET SDK (reflecting any global.json): Version: 6.0.100 Commit: 9e8b04bbff Runtime Environment: OS Name: Windows OS Version: 10.0.19043 OS Platform: Windows RID: win10-x64 Base Path: C:\Program Files\dotnet\sdk\6.0.100\ Host (useful for support): Version: 6.0.0 Commit: 4822e3c3aa .NET SDKs installed: 6.0.100 [C:\Program Files\dotnet\sdk] .NET runtimes installed: Microsoft.AspNetCore.App 5.0.12 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 6.0.0 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App] Microsoft.NETCore.App 5.0.12 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 6.0.0 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App] Microsoft.WindowsDesktop.App 5.0.12 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App] Microsoft.WindowsDesktop.App 6.0.0 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App] ```
mkArtakMSFT commented 2 years ago

Thanks for contacting us. @SteveSandersonMS can you please look into this? Thanks!

ghost commented 2 years ago

Thanks for contacting us. We're moving this issue to the .NET 7 Planning milestone for future evaluation / consideration. Because it's not immediately obvious that this is a bug in our framework, we would like to keep this around to collect more feedback, which can later help us determine the impact of it. We will re-evaluate this issue, during our next planning meeting(s). If we later determine, that the issue has no community involvement, or it's very rare and low-impact issue, we will close it - so that the team can focus on more important and high impact issues. To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

bachratyg commented 2 years ago

Penny for your thoughts? Is there a pattern here that I'm missing? Our adoption of Blazor is blocked on this.

This seems to be the line I believe that causes the component to reset https://github.com/dotnet/aspnetcore/blob/f08285d0b6918fbb2b485d97f4e411dc9ea9a94f/src/Components/Components/src/RenderTree/Renderer.cs#L956 Why is it critical to dispose (and recreate) the subtree? Would it be viable if the error handler could opt-out?

bachratyg commented 2 years ago

@SteveSandersonMS anything I can do to help move this forward? This is one of the older untriaged issues in the .net 7 planning milestone.

SteveSandersonMS commented 2 years ago

When we designed Recover, the intended semantics are a bit different from what I think you're hoping for. It's meant to mean "I want to rebuild the failed part of the UI using some fresh and healthy new instances, rather than leave it in a dead state". It's not meant to mean "the exception that happened can just be ignored, even though the subtree is in an indeterminate state and might be corrupted in arbitrary ways".

This is because unhanded exceptions could come from anything, such as some code during rendering. Continuing to use those instances would be a serious risk to the stability and security of your app, because it's nearly impossible to reason about what state they are now in.

The purpose of ErrorBoundary is to provide a sensible UX around reporting and recovery from unanticipated errors.

The alternative would be to wrap every such callsite in a try-catch throughout the entire application which is ridiculously inconvenient.

I'm afraid that catching exceptions, instead of letting them hit generic global handlers, is the right way to distinguish between problems you can anticipate and handle as non-errors, and other problems you didn't anticipate and therefore can't know for sure whether the original flow is still in a valid state. The alternative is an On Error Resume Next-style pattern.

Hopefully, putting a try/catch around things you know can fail (e.g., HTTP requests), or even using different APIs that don't throw (e.g., HttpClient's SendAsync), isn't too onerous and gives you the opportunity to make reasonable choices about how a given bit of UI should present the error states and what options users should have (e.g., to retry?).

Finally, we could add some overload like Recover(dangerouslyPreserveSubtree: true), but we'd also have to document that you're advised not to use it, because there's no way to say what will happen, and maybe your app/server will just crash depending on how your logic is structured or where the exception originated.

bachratyg commented 2 years ago

@SteveSandersonMS sorry, this got longer than I originally intended to.

When we designed Recover, the intended semantics are a bit different from what I think you're hoping for. It's meant to mean "I want to rebuild the failed part of the UI using some fresh and healthy new instances, rather than leave it in a dead state". It's not meant to mean "the exception that happened can just be ignored, even though the subtree is in an indeterminate state and might be corrupted in arbitrary ways"

TLDR: I believe if it cannot be determined from the exception that the subtree is in a determinate, consistent state then it also cannot be determined that it will be after you rebuild with new supposedly healthy instances and vice versa. Therefore the Recover feature in its current form is either too strict or inherently unsafe.

Comparing with the design document here: https://github.com/dotnet/aspnetcore/issues/30940#issuecomment-799504177

This is because unhanded exceptions could come from anything, such as some code during rendering. Continuing to use those instances would be a serious risk to the stability and security of your app, because it's nearly impossible to reason about what state they are now in.

yet the d-doc states that

In specific places where the framework calls user code, we should add a try/catch around the call into user code. And only around the single line of that call - not around any other framework code. This [...] doesn't risk leaving any framework state in an incomplete place.

The d-doc reasons that framework state should be just fine where the exception is caught (or else user code should not be able to intercept it). I can reason just as well about user state. I do recognize the business exception thrown (e.g. a server validation error with a known structure) - after all it's my code. I do know the circumstances where it is thrown (a network call) - it's still my code. I do know wherever it is thrown the app is supposed to be in a consistent state - after all the user has to stare at something while waiting for the network call. It may be stale data behind a spinner but stale is neither inconsistent nor generally avoidable.

I'm afraid that catching exceptions, instead of letting them hit generic global handlers, is the right way to distinguish between problems you can anticipate and handle as non-errors, and other problems you didn't anticipate and therefore can't know for sure whether the original flow is still in a valid state.

If you are saying the error boundary should only be used for unanticipated errors that means the Recover() call is always and invariably unsafe. If I don't recognize the exception then I cannot determine where it came from: would the initial component state (OnInitialized, OnParametersSet, etc) trigger it or not? It would be just as dangerous to recover with or without subtree reset. E.g. in the following example the failure is triggered on first render:

<!-- Use the blindly-handle-all GlobalErrorHandler.razor and App.razor as in the OP -->
<!-- Modify this line in Counter.razor to throw a division by zero, lame example for sake of simplicity -->
<p role="status">Current count: @(currentCount / 0)</p>

Whenever you navigate to the counter page this will put the app in a tight infinite loop despite the child component being reset over and over. This could start hammering the server or even cause serious business harm at worst if there were also some non-idempotent operation involved.

The d-doc also says

This ensures it's equivalent to the developer putting a try/catch around their own code

That's exactly what I'm looking for (at least for event handler and lifecycle method entry points), yet you're suggesting

Hopefully, putting a try/catch around things you know can fail (e.g., HTTP requests) [...] isn't too onerous

Unfortunately it is. Considering it's a data driven application it's about half of all methods, I'm in the exact position the d-doc describes as

the primary drawback to this pattern is that it's ridiculously inconvenient to put try/catch around every single method

To add salt to the wound it's not every single method and sometimes it's even a contradictory requirement. A method could be both an event handler entry point and part of some other multi-step process. E.g. consider the following flows:

<button @onclick="Reload">Reload all</button>
<button @onclick="Save">Save me</button>
@code {
public async Task Reload()
{
    _loadedItems = await _backend.GetStuff(...);
} // don't let business exceptions escape here
public async Task Save()
{
    var id = await _backend.PostSave(...);
    await this.Reload();
    _selectedItem = _loadedItems.Find(id); // don't let this run if Reload fails, but that needs an exception that stops the flow
}
}

In this case I would have to add additional boilerplate by dissing structured exception handling in favor of return codes or creating an entrypoint-only wrapper over Reload.

The alternative is an On Error Resume Next-style pattern.

As far as I know there is no such thing as OERN in either .net or js land, I have no intention of implementing something similar due to obvious drawbacks, yet...

or even using different APIs that don't throw (e.g., HttpClient's SendAsync), isn't too onerous

...this sounds like exactly implementing something like it. Execute a piece of code, don't forget to check the error code on the next line, then... find a way to signal failure up the call chain without business exceptions in case this is not an event entry point.

Some other approaches I also thought about:

  1. Putting a wrapper around each event handler:
<button @onclick="Global.HandleBusinessErrors(this.MyActualEventHandler)" />

This is marginally better than having to try/catch-wrap everything, but still does not address lifecycle methods (OnInitialized, OnParametersSet).

  1. Creating a shared base component that intercepts events and deriving everything else from that.

This approach is somewhat similar to the Controller.OnActionExecuted method in MVC.

public abstract class ComponentBaseEx : ComponentBase, IHandleEvent
{
    protected virtual Task UseMeInsteadOnInitializedAsync() => base.OnInitializedAsync();
    protected sealed override async Task OnInitializedAsync()
    {
        try
        {
            await this.UseMeInsteadOnInitializedAsync();
        }
        catch...//handle business exceptions
    }
    Task IHandleEvent.HandleEventAsync(EventCallbackWorkItem callback, object arg)
    {
        try
        {
            // No way to call the explicit interface implementation of the base class! Have to reimplement it all!
            // Making this protected virtual would make it somewhat easier.
        }
        catch...//handle business exceptions
    }
}

One line of code per component plus the warm and fuzzy feeling that every developer has to use my api instead of the platform api.

  1. Add a http interceptor that passes the server error to the global handler and hangs the task indefinitely.

After all there's already an interceptor that converts responses to business exceptions. I'm concerned this would lead to memory leaks though. It would also make it impossible to handle some exceptions while propagating some up to the global handler.

Finally, we could add some overload like Recover(dangerouslyPreserveSubtree: true), but we'd also have to document that you're advised not to use it, because there's no way to say what will happen, and maybe your app/server will just crash depending on how your logic is structured or where the exception originated.

This would be highly appreciated. My logic, my exception type, I structure the code, I control the origin. As far as I see Recover(false) is no safer than Recover(true).

If by crash the server you mean break the circuit and drop the client, I'm all for it. If you mean it brings down the entire server process or crosses some security boundary then I'm really curious how that could happen and why not with Recover(false).

ghost commented 2 years ago

This issue has been resolved and has not had any activity for 1 day. It will be closed for housekeeping purposes.

See our Issue Management Policies for more information.