MicrosoftEdge / WebView2Feedback

Feedback and discussions about Microsoft Edge WebView2
https://aka.ms/webview2

[Problem/Bug]: On some windows user accounts, setting parent of webview in some contexts results in - Exception: System.InvalidOperationException: CoreWebView2 members cannot be accessed after the WebView2 control is disposed. ---> System.Runtime.InteropServices.COMException: The group or resource is not in the correct state to perform the requested operation. (Exception from HRESULT: 0x8007139F) #4206

Closed benhbell closed 7 months ago

benhbell commented 11 months ago

What happened?

Hello. We have a Windows PPT Plugin that our customers use. We use the latest Webview2SDK. A subset of our users are encountering an issue where some of our webviews are not displaying.

They all have a consistent error that does not seem to match what we can see in the code. Only one of our dev computers and only one user account can replicate it.

The issue can be replicated on the same machine, with the same software as another working user account and we have tried many non-code things to resolve it.

Users that encounter this issue will always encounter it with one specific webview use case, which happens to be the most critical one. We are not using a framework or UI for this; it is a frameless window. Our other use cases that do require some UI use WPF (but those generally work).

We have customers that have reported this on windows 8, 10, and 11 (both pro and home editions)

I will be attaching some resources for you to review for both a working user account and a non-working user account on a developer machine with mock content. Both users are administrators on the machine, and the user account with the issue was actually the first user created on the machine.

Troubleshooting we have attempted

The code

Our project is large but in general, these are some parts of the code that are being used to both instantiate webview2 and then attempt to adjust the parent for some webview2 controllers.

We initialize webview2 - we use webview2 in more places than this, but we load up webviews for each slide and attempt to preload the content so that switching slides is seamless.

private async void InitializeWebView2()
        {
            CoreWebView2EnvironmentOptions options = new CoreWebView2EnvironmentOptions();
            Util.SetProxyServer(options);

            string dataPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData, Environment.SpecialFolderOption.Create), "folder");
            webView2Environment = await CoreWebView2Environment.CreateAsync(null, dataPath, options).ConfigureAwait(true);

            independentController = await webView2Environment.CreateCoreWebView2ControllerAsync(pptHWND).ConfigureAwait(true);

            controllersByURL = new Dictionary<string, CoreWebView2Controller>();
            controllerQueue = new Queue<CoreWebView2Controller>();

            for (int i = 0; i < CONTROLLER_COUNT; i++)
            {
                CoreWebView2Controller controller = await webView2Environment.CreateCoreWebView2ControllerAsync(pptHWND).ConfigureAwait(true);

                controller.CoreWebView2.Settings.UserAgent = Util.GenerateWebView2UserAgent();
                controller.CoreWebView2.Settings.AreDevToolsEnabled = Util.DevToolsEnabled();

                string listeners = Properties.Resources.VizEventListeners;
                await controller.CoreWebView2.AddScriptToExecuteOnDocumentCreatedAsync(listeners).ConfigureAwait(true);
                controller.CoreWebView2.WebMessageReceived += HandleWebMessageReceivedEvent;

                controller.CoreWebView2.NavigationCompleted += NavDone;
                controllerQueue.Enqueue(controller);
            }
        }

When a user presents a slide we call this, and start to set the parent of our webviews to the presenter window

 public void BeginSlideShow(int slideshowHWND, List<PESlide> slides, Func<bool> checkEOS)
        {
            foreach (CoreWebView2Controller controller in controllerQueue)
            {
                // The debugger stops here the first time we get there, and throws an exception.
                // slideshowHWND is already an int, so no cast is needed.
                controller.ParentWindow = new IntPtr(slideshowHWND);
            }
            UpdateAllSlides(slides);
        }

We get a stacktrace error

Exception: System.InvalidOperationException: CoreWebView2 members cannot be accessed after the WebView2 control is disposed. ---> System.Runtime.InteropServices.COMException: The group or resource is not in the correct state to perform the requested operation. (Exception from HRESULT: 0x8007139F)

Importance

Blocking. My app's basic functions are not working due to this issue.

Runtime Channel

Stable release (WebView2 Runtime)

Runtime Version

119.0.2151.72

SDK Version

stable

Framework

Other

Operating System

Windows 10, Windows 11, Other

OS Version

Microsoft Windows 11 Pro (10.0.22621.0)

Repro steps

Describing the Behavior

A working scenario results in the plugin working:

  1. We can open all UIs that depend on webviews and have them displayed properly.
  2. If we look at Task Manager, we can see many webview processes correctly loading a web page
  3. We see network traffic using Fiddler
  4. We do not see this webview exception in the logs.

A non-working scenario results in the plugin working only in some cases where webviews are used

  1. We can open all but one UI that depends on webviews. When we encounter the error, the webview never even displays to the user. We can still use other webviews before and after through other UI
  2. If we look at Task Manager, we see webview processes that never load a page (about:blank)
  3. We see no network traffic or requests leaving PPT
  4. We see the webview exception in the logs.

Both accounts "SAY" according to the logs that they have the same DPI, zoom level, etc.

Repros in Edge Browser

Not Applicable

Regression

No, this never worked

Last working version (if regression)

No response

Attachments

LiangTheDev commented 11 months ago

WebView2 controller will be auto closed if the parent hwnd is destroyed. Is it possible that when switching slides, the old parent hwnd got destroyed before we reparent to the new hwnd? If that is the case, you might have to ensure that we reparent to a valid hwnd before the old one is destroyed.
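
To make the suggestion above concrete, here is a minimal sketch of a pre-check that both the remembered parent hwnd and the new one still exist before reparenting. `HwndGuard`, `IsLive`, and `TryReparent` are hypothetical names made up for illustration; only the Win32 `IsWindow` function is real. It is a hint rather than a guarantee: handle values can be recycled, and a window can still be destroyed between the check and the call.

```csharp
using System;
using System.Runtime.InteropServices;

public static class HwndGuard
{
    // Win32 IsWindow: true when the handle identifies an existing window.
    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool IsWindow(IntPtr hWnd);

    // Zero handles are rejected without calling into user32.
    public static bool IsLive(IntPtr hwnd)
    {
        return hwnd != IntPtr.Zero && IsWindow(hwnd);
    }

    // Runs setParent only when both the parent the app remembers attaching to
    // and the new parent still exist. Returns false when the call was skipped.
    public static bool TryReparent(IntPtr currentParent, IntPtr newParent, Action<IntPtr> setParent)
    {
        if (!IsLive(currentParent) || !IsLive(newParent))
        {
            return false;
        }
        setParent(newParent);
        return true;
    }
}
```

With the code from the original post this might be called as `HwndGuard.TryReparent(pptHWND, new IntPtr(slideshowHWND), h => controller.ParentWindow = h)`. A `false` result would suggest the old parent is already gone and the controller has probably been closed.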

benhbell commented 11 months ago

At face value this seems unlikely for a few reasons

  1. comparing console logs for both working and non-working user sessions, we logged out the controllers in both cases, they both seem to be "ready" and exist before we attempt to set the parent. Is there a more direct or trustworthy way for us to check if the controller has been destroyed other than logging it before we try to set the parent?

  2. when we call BeginSlideShow after a user clicks on reader view in PPT, everything works fine and we never get the exception. When a user account encounters this, it consistently throws the error when the user attempts to "Present" regardless of computer state.

  3. This works for a % of our users with the same setup as the % of users that consistently encounter this issue. It would be strange to me that we are destroying the parent before starting the slideshow, but only on some user accounts on the same machine.

LiangTheDev commented 11 months ago

WebView could be closed if its browser process crashes (which has events), Close is called, or the parent hwnd is destroyed. There are no events to inform you that the WebView is closed, and I'm not sure whether you could get an event when the parent hwnd is destroyed. If you take an ETW trace of the repro, then we should be able to figure out what is going on. If it were due to the parent hwnd being destroyed, it could be a race condition around when the hwnd is destroyed, which would explain the nondeterministic repro.

benhbell commented 11 months ago

Ok. I see this documentation around ensuring we listen to and handle some events, but this does not explain why it always occurs for some users and never for others.

The user behavior that replicates this is:

Creating a separate user on the same machine, and following the same steps, will never replicate the issue.

I cannot see why we would be closing or destroying the webview parent or process before, so the crashing might still be a possibility, but not the root cause. Why would the same code and software crash consistently for one user, and one browser window, but not the others if the behavior, machine, OS, software version and compatibility are the same?

I have included two ETW traces as attachments (in one zip); we were not able to discern if there was a race condition. To me, a race condition means that sometimes it should not repro, but with the repro user, it always repros. What kind of user settings might affect or influence a race condition if the user behavior is the same?

Additionally, is there a way to get a clearer error to know when we should be attempting to recreate the controller? Or should we be checking for close events after any unexpected error?

LiangTheDev commented 11 months ago

Got it. For browser process failure, there is CoreWebView2.ProcessFailed event.

Looking at the trace, the events in not-working-trace.etl showed that the WebView browser process for powerpnt.exe was still running, and the Microsoft.MSEdgeWebView\WebView2_APICalled events indicate that ParentWindow was successfully updated 4 times. Nothing abnormal. For a closed WebView we reject the call and don't log a WebView2_APICalled event. So, if we encountered an exception when setting ParentWindow, it should be that the webview is closed. However, the trace was not taken from the beginning, so there is not enough info to figure out why the webviews were closed.
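
On the earlier question about getting a clearer signal for when to recreate the controller: a minimal sketch, assuming the closed-controller case always surfaces as the exception reported in this issue (an `InvalidOperationException` wrapping a `COMException` with HRESULT 0x8007139F, which is `HRESULT_FROM_WIN32(ERROR_INVALID_STATE)`). `ClosedControllerDetector` and `IsClosedControllerError` are hypothetical names, not part of the WebView2 SDK.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ClosedControllerDetector
{
    // HRESULT_FROM_WIN32(ERROR_INVALID_STATE), the code in the reported exception.
    public const int InvalidStateHResult = unchecked((int)0x8007139F);

    // Walks the exception chain and reports whether any COMException in it
    // carries the invalid-state HRESULT.
    public static bool IsClosedControllerError(Exception ex)
    {
        for (Exception e = ex; e != null; e = e.InnerException)
        {
            if (e is COMException && e.HResult == InvalidStateHResult)
            {
                return true;
            }
        }
        return false;
    }
}
```

A caller could wrap the `ParentWindow` assignment in a try/catch and treat a `true` result as "this controller is gone, create a new one", while letting other exceptions propagate.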

benhbell commented 11 months ago

I can generate a new trace. I will restart the machine, start the trace before I open any other program, and then re-attach it.


benhbell commented 11 months ago

Hi team, Please find another version of the trace logs.

Steps I followed

Can I get an email to use to send you the updated trace?

LiangTheDev commented 11 months ago

You can contact me at lzhao@microsoft.com.

benhbell commented 11 months ago

From our email convo

The trace indicates that Controller::Close is called at time stamp 75.883671333. If we then try to set parent hwnd on this closed controller, it will throw exception.

Is it possible to find Close() call in the code and add logging/breakpoint to figure out what’s going on?

I reviewed the code and cannot find anywhere where we call a close event

Screenshot 2023-12-07 at 1 24 38 PM
 private async void DpiMonitorSetup(IntPtr pptHWND)
        {
            string dataPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData, Environment.SpecialFolderOption.Create), "pollev_dpi");
            CoreWebView2Environment dpiMonitorEnvironment = await CoreWebView2Environment.CreateAsync(null, dataPath).ConfigureAwait(true);
            dpiMonitor = await dpiMonitorEnvironment.CreateCoreWebView2ControllerAsync(pptHWND).ConfigureAwait(true);
            dpiMonitor.ShouldDetectMonitorScaleChanges = true;
            dpiMonitor.BoundsMode = CoreWebView2BoundsMode.UseRasterizationScale;
            dpiMonitor.RasterizationScaleChanged += HandleRasterizationScaleChanged;
        }

Which is called from

 public void Startup(Microsoft.Office.Interop.PowerPoint.Application app)
        {
            string perfId = Guid.NewGuid().ToString();
            Util.PerfMarkToggle(true, "Startup powerpoint-" + perfId);
            try
            {
                MainThreadDispatcher = Dispatcher.CurrentDispatcher;
                //startup ppt code

                IntPtr pptWindowHandle = (IntPtr)app?.HWND;

                VizEmbedManager = new WebView2VizEmbedManager(pptWindowHandle);
                VizEmbedManager.HandleWebMessageReceivedEvent += HandleVizEmbedMessages;
                VizEmbedManager.HandleVizLogin += HandleVizEmbedLogin;

                WebAppModal = new WebView2WebAppModal(pptWindowHandle);
                WebAppModal.HandleWebMessageReceivedEvent += HandleWebAppModalMessages;

                DpiMonitorSetup(pptWindowHandle);

            }
            //catch (Exception ex)
        }
champnic commented 10 months ago

@benhbell Can you also do a search for "Close"? For any of the results in your screenshot where the CoreWebView2Controller is a variable, it could be "variable.Close()" instead of "CoreWebView2Controller". (If you already did this, please disregard this comment)

sf-mzh commented 10 months ago

Hi,

I'm facing the same problem; however, I can see that we are doing something wrong. What I would like to know: why does it work for more than 99% of our customers while a very small number of users get this failure?

So what we do is:

_hiddenHwndSource = new HwndSource(new HwndSourceParameters("Hidden", 1, 1) { WindowStyle = 0x02C00000 });
_hiddenHwndSource.RootVisual = _webView2Core;

and later on we switch it in to our view with the following code:

_hiddenHwndSource.RootVisual = new FrameworkElement(); // to unlink the webView2 Control from the hidden Source (I guess due to compatibility with older .NET Frameworks)
_hiddenHwndSource.Dispose(); // I guess this is the codeline which disposes the webview2Core (for a very small amount of users)
_webView2PresentationPanel.Children.Add(_webView2Core); // here it throws the exception

My problem is: I want to know how to reproduce the problem, so I can reliably say that moving the Dispose one line down solves it. So under what conditions does disposing the HwndTarget directly dispose the webview2Core? @benhbell: Does the one developer system have touch or pen input enabled?

-Markus

benhbell commented 9 months ago

@benhbell: Does the one developer system have touch or pen input enabled?

-Markus

the issue can be replicated on one user account on one machine, but not another user account on that same machine. In both accounts pen and touch is enabled

benhbell commented 9 months ago

hey @champnic there is no .close() that refers to webviews that are called during the steps that replicate the behavior. I did find a reference though, but commenting out that close and restarting did not help.

 public void Dispose()
        {
            actionWindow.Hide();
            actionWindow.Close();
            wpfWebView2.Dispose();
        }

We still have customers reporting this issue and have not found a resolution yet or explanation why it only affects certain users.

benhbell commented 9 months ago

@sf-mzh The only thing I can think of is that I plugged in an extra monitor that had a touch screen at some point, but I have tried to replicate that issue in the past and have not been able to replicate it.

champnic commented 9 months ago

@benhbell Would you be able to share the full stack trace where it crashes? Do you have a sample app which reproduces this issue we could use to debug further?

I see in the code you shared a "wpfWebView2" referenced, which presumably isn't the problem one because the one having problems is a frameworkless WebView2 right? For frameworkless there isn't a WebView2 to call .Close() on - instead we're looking for anywhere that .Close() is being called on a CoreWebView2Controller. Sounds like that's never done in the code?

benhbell commented 9 months ago

hey @champnic Thank you so much for the clarification.

I have looked for CoreWebView2Controller and cannot find any .Close() called on that. We set it to a variable which I have also looked for mentions of to see if I could find a close.

You can download the app from here. I cannot promise you will be able to reproduce it. About 5% of users can, with no clear pattern. I have a Parallels machine with one user account that can, and another that can't.

I am attaching some logs from the case that reproduces it, the stack trace when the bug is reproduced is at line 107ish stacktrace.txt

champnic commented 9 months ago

For quick reference, here is the failure stack from your provided stacktrace.txt:


Exception: System.InvalidOperationException: CoreWebView2 members cannot be accessed after the WebView2 control is disposed. ---> System.Runtime.InteropServices.COMException: The group or resource is not in the correct state to perform the requested operation. (Exception from HRESULT: 0x8007139F)
   at Microsoft.Web.WebView2.Core.Raw.ICoreWebView2Controller.set_ParentWindow(IntPtr ParentWindow)
   at Microsoft.Web.WebView2.Core.CoreWebView2Controller.set_ParentWindow(IntPtr value)
   --- End of inner exception stack trace ---
   at Microsoft.Web.WebView2.Core.CoreWebView2Controller.set_ParentWindow(IntPtr value)
   at PollEverywhere.PowerPointAdaptee.WebView2VizEmbedManager.BeginSlideShow(Int32 slideshowHWND, List`1 slides, Func`1 checkEOS)
benhbell commented 9 months ago

Is there anything else that I can provide? additional traces, etc.

benhbell commented 9 months ago

I ran the registry command, but for some reason, when I build and debug I still see the latest non-release running

REG ADD HKLM\Software\Policies\Microsoft\Edge\WebView2\ReleaseChannelPreference /v * /t REG_SZ /d "1"

champnic commented 9 months ago

Which runtime is getting loaded, and which were you expecting that's installed on your machine?

sf-mzh commented 9 months ago

In our case: We found that all users reporting this issue were running on a DPI scaling higher than 100%. However, we have many more users also running on a DPI scaling higher than 100% who are not reporting this issue. My current hypothesis is that it could be related to some kind of 3rd party tool (maybe accessibility or a window helper), but I have not started to verify it.

@champnic Could that be related to an old runtime version? Which runtime version could be key for this issue?

Does

champnic commented 9 months ago

Running on a DPI scaling higher than 100% is the norm these days, so unclear if that's related or not. It's possible that it's related to an older runtime, although as far as we know this isn't a new regression. If you have data that showed a spike in hits or a clear beginning of when you started seeing this in relation to runtime version we'd be really interested in that.

sf-mzh commented 9 months ago

For us, the customer reports with this specific crash started in December 2023. I got one dump which shows:

0:000> !vertarget
No export vertarget found
0:000> vertarget
Windows 10 Version 19045 MP (2 procs) Free x86 compatible
Product: WinNt, suite: SingleUserTS Personal
Edition build lab: 19041.1.amd64fre.vb_release.191206-1406
Machine Name:
Debug session time: Thu Jan 11 12:07:08.000 2024 (UTC + 1:00)
System Uptime: 20 days 13:21:43.430
Process Uptime: 0 days 0:01:34.000
  Kernel time: 0 days 0:00:07.000
  User time: 0 days 0:01:07.000
0:000> lmDvmEmbeddedBrowserWebView
Browse full module list
start    end        module name
56770000 56d1b000   EmbeddedBrowserWebView   (deferred)             
    Image path: C:\Program Files (x86)\Microsoft\EdgeWebView\Application\120.0.2210.121\EBWebView\x86\EmbeddedBrowserWebView.dll
    Image name: EmbeddedBrowserWebView.dll
    Browse all global symbols  functions  data
    Timestamp:        Fri Jan  5 01:30:25 2024 (65974DA1)
    CheckSum:         0059CD3C
    ImageSize:        005AB000
    File version:     120.0.2210.121
    Product version:  120.0.2210.121
    File flags:       0 (Mask 3F)
    File OS:          4 Unknown Win32
    File type:        2.0 Dll
    File date:        00000000.00000000
    Translations:     0409.04b0
    Information from resource tables:
        CompanyName:      Microsoft Corporation
        ProductName:      Microsoft Edge Embedded Browser WebView Client
        InternalName:     EmbeddedBrowserWebView.dll
        OriginalFilename: EmbeddedBrowserWebView.dll
        ProductVersion:   120.0.2210.121
        FileVersion:      120.0.2210.121
        FileDescription:  Microsoft Edge Embedded Browser WebView Client
        LegalCopyright:   Copyright Microsoft Corporation. All rights reserved.
0:000> !cpuid
CP  F/M/S  Manufacturer     MHz
 0  6,55,8  GenuineIntel    2159
 1  6,55,8  GenuineIntel    2159

On Jan 11 that user said that it has been crashing for ~3 weeks. So maybe the v120 runtime?

benhbell commented 9 months ago

Ok, Attempted to get the correct runtime/browser versions to test https://www.nuget.org/packages/Microsoft.Web.WebView2/1.0.2357-prerelease

  1. Added the project to nuget image
  2. Adjusted how we check version to resolve build issue in this ticket by changing
  3. return $"PollEvPresenter/{Version(VersionFormat.Clean)} NoRedirect/{Version(VersionFormat.Clean)} Office/{OfficeVersion} PowerPoint/{OfficeVersion} ({OsIdentifier}) WebView2Browser/{CoreWebView2Environment.GetAvailableBrowserVersionString()}";
     to
     return $"PollEvPresenter/{Version(VersionFormat.Clean)} NoRedirect/{Version(VersionFormat.Clean)} Office/{OfficeVersion} PowerPoint/{OfficeVersion} ({OsIdentifier}) WebView2Browser/{CoreWebView2Environment.GetAvailableBrowserVersionString(null)}";
  4. I was seeing options.ChannelSearchKind' threw an exception of type 'System.NullReferenceException' so I adjusted the environments image
  5. Set regkeys to set channel preferences, but i was still seeing those errors, and the non-2357 webview being used Screenshot 2024-02-13 at 6 52 28 PM

    and

    Screenshot 2024-02-13 at 6 52 23 PM
  6. Installed the microsoft edge canary build
  7. added some options to force use the canary browser version when setting up the webview environment -> this finally got rid of not seeing the correct build in the environment CoreWebView2EnvironmentOptions options = new CoreWebView2EnvironmentOptions(default, default, default, default, default, CoreWebView2ReleaseChannels.Canary, CoreWebView2ChannelSearchKind.LeastStable);

I think that now I am loading the correct version of webview2 (pre-release) and the canary edge, but I still don't see that webview2 version. Should I be looking somewhere else to validate that the correct webview2 version AND edge version are loading, to test if the pre-release fixes the issue?

Screenshot 2024-02-13 at 5 03 15 PM

Screenshot 2024-02-13 at 12 32 01 PM

but I still replicate the issue. Additionally, I see a weird RenderAdapterLUID stack trace in the webview2 library

Screenshot 2024-02-13 at 5 04 08 PM

Screenshot 2024-02-13 at 12 32 41 PM

benhbell commented 9 months ago

and I would love to give you a new trace, but I get a strange error when trying to run the Wpr profile.

 wpr -start WebView2_CPU.wprp -filemode

        The request is not supported.

        Profile Id: Edge.WebView2.General.Verbose.File

        Error code: 0x80070032

        The request is not supported.

My version of wpr


Copyright (c) 2022 Microsoft Corporation. All rights reserved.
champnic commented 8 months ago

Hey @benhbell - I'm not familiar with that error, but a quick search indicates that it might be due to the OS not supporting some feature in wpr. What OS are you using? @LiangTheDev any idea?

I've downloaded the app - what are the steps I need to do to potentially reproduce the issue, and how will I know if I've reproduced it or not? A video might help.

(Pardon if already answered) Is there a reason you are using a frameworkless WebView2 and choosing to try and reparent manually? Could you try the scenario using WPF?

The only thing I can think of is that something about the user account is causing a crash in the WebView2, and this manifests as an app crash when the WebView2 is attempted to be changed when reparenting. Liang mentioned the CoreWebView2.ProcessFailed event - are you listening and responding to that event?

LiangTheDev commented 8 months ago

I am not familiar with that wpr error code either, but normally updating wpr to the latest version should make it work.

benhbell commented 8 months ago

(Pardon if already answered) Is there a reason you are using a frameworkless WebView2 and choosing to try and reparent manually? Could you try the scenario using WPF?

Yes, in this case our use-case is to render the webview without a frame to replace a placeholder when the user presents. Even for users where it replicates, that same frameworkless webview will work in reader view mode in PPT.

I am not familiar with that wpr error code either, but normally updating wpr to the latest version should make it work.

This is windows 11. I am going to send you another attempted trace. I was only able to create a trace running the webview2.wprp file. I uninstalled and re-installed the latest ADK, and even attempted it from the wprUI but still get that same error. I also notice.

Is that other file only compatible with some other WPR or ADK version? a few of these screenshots will show the WPR version.

Screenshot 2024-02-28 at 3 02 25 PM Screenshot 2024-02-28 at 3 32 03 PM Screenshot 2024-02-28 at 3 38 26 PM
ChadPE commented 7 months ago

Hey WebView2 folks,

Though @benhbell is no longer my Project Manager, I felt it was worth following-up on this thread to let you know what our team has learned. Late Friday we found the setting that causes this bug in our code: PowerPoint's Optimize for Compatibility setting:

optimize

@LiangTheDev said early in this thread "WebView2 controller will be auto closed if the parent hwnd is destroyed." We now realize that seems to be what is happening with this PowerPoint setting. Something in the code path that PowerPoint uses to load in compatibility mode closes the window handle that we attach to before building a new window handle, but we are currently exploring ways we can work around this problem.
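
One possible shape for such a workaround, sketched under assumptions and not confirmed by anyone in this thread: attempt the reparent, and if the controller turns out to be closed, build a replacement attached to the new window. `ControllerRecovery` and `ReparentOrRecreateAsync` are hypothetical names. The helper is generic so that it stands alone; in the plugin, `T` would be `CoreWebView2Controller`.

```csharp
using System;
using System.Threading.Tasks;

public static class ControllerRecovery
{
    // Tries to reparent the existing controller. If that throws
    // InvalidOperationException (the exception type reported in this issue
    // for a closed controller), awaits recreate() and returns its result.
    // Note: this catches every InvalidOperationException, which is broader
    // than the closed-controller case alone.
    public static async Task<T> ReparentOrRecreateAsync<T>(
        T controller, Action<T> reparent, Func<Task<T>> recreate)
    {
        try
        {
            reparent(controller);
            return controller;
        }
        catch (InvalidOperationException)
        {
            return await recreate().ConfigureAwait(true);
        }
    }
}
```

Usage might look like `controller = await ControllerRecovery.ReparentOrRecreateAsync(controller, c => c.ParentWindow = hwnd, () => webView2Environment.CreateCoreWebView2ControllerAsync(hwnd));`. A recreated controller starts fresh, so settings, injected scripts, event handlers, and the preloaded page would all need to be applied again.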

We appreciate the help of everyone in this thread, we will follow-up if we find more relevant information or a solution.

champnic commented 7 months ago

@ChadPE Glad to hear you're getting closer to a solution. Thanks for the update! I'm closing this issue as it seems to be outside of WebView2, but let us know if you learn more and need us to take another look.

benhbell commented 1 week ago

Yea, I came back to this thread (I am no longer with that company) to check if it was ever fixed. Thanks for carrying the torch @ChadPE