AngleSharp / AngleSharp.Js

:angel: Extends AngleSharp with a .NET-based JavaScript engine.
https://anglesharp.github.io
MIT License

Implicit wait for content to load #57

Closed s0thl closed 4 years ago

s0thl commented 5 years ago

I have just discovered AngleSharp and I'm trying to learn about it.

I want to use it for testing my website, where content is dynamically generated.

Currently, AngleSharp returns the document while it still shows the loading spinner, not the fully loaded, ready document.

How can I implicitly wait, during the OpenAsync task, for an element in the document to be loaded and visible?

FlorianRappl commented 5 years ago

AngleSharp is a pure engine and does not contain extra stuff such as a JavaScript (JS) engine, which would be required for your case.

There is an experimental JS engine, but I guess it will not work unless the used JS is very simple.

In the future, please provide at least some background info, such as what configuration you are using (e.g., show the code you apply).

Hope that helps!

s0thl commented 5 years ago

I understand that, and I'm actually using the AngleSharp.Js library. I just need further instructions on how to solve the issue, or at least how to find out whether it will work for me. All of the Google results lead to your replies saying "it will probably not work", without any context or details on where to start to give it an actual try, or how to contribute if some code needs to be improved.

This is my actual code:

var config = Configuration.Default
    .WithDefaultLoader(new LoaderOptions
    {
        IsResourceLoadingEnabled = true,
        IsNavigationDisabled = false
    })
    .WithJs()
    .WithCss()
    .WithRenderDevice()
    .WithCookies();

var context = BrowsingContext.New(config);
var document = await context.OpenAsync("https://mywebsite.com").WhenStable();

I can have my website expose a JS variable, such as a boolean window.__isContentLoaded; in AngleSharp I would then spin until that variable is set and true. This should be simple enough, I guess.

Could you elaborate on this topic?

FlorianRappl commented 5 years ago

It will not work because the JS will most likely crash. There can be various reasons for that:

Due to Jint's nature the exact source is not easy to debug. Usually, you need a fairly complex JS to trigger the issue. Then you need to boil it down to the least code required to still see the issue. And as a final step the code in AngleSharp / AngleSharp.Js needs to be adjusted (without breaking existing tests) to solve this particular issue. At the moment this is unfortunately quite tedious ...

You can - of course - give it a try (without going down the rabbit hole as described). The __isContentLoaded can work, but you could also spin until a certain element is there in the DOM (the latter would be more general and be equivalent to frameworks such as Selenium - they do the same in their "element visible" kind of APIs).

Since you use WhenStable I assume you are on the preview version of AngleSharp (and AngleSharp.Js respectively)?

s0thl commented 5 years ago

Thank you for the detailed reply.

> you could also spin until a certain element is there in the DOM

Can you please show an example of how such a spin should be implemented? I've been reading the docs, but couldn't find appropriate methods for it.

> Since you use WhenStable I assume you are on the preview version of AngleSharp (and AngleSharp.Js respectively)?

Not sure, since I installed all of the packages using NuGet (there were two packages implementing a JavaScript engine and I used AngleSharp.Js instead of AngleSharp.Scripting.JavaScript).

s0thl commented 5 years ago

@FlorianRappl

I'm out of ideas - the documentation doesn't include anything on this topic.

This one spins forever, as if the document is never updated:

var document = await context.OpenAsync("https://mywebsite.com").WhenStable();

await document.WaitForReadyAsync();

while (document.QuerySelector("form") == null)
{
    await Task.Delay(1000);
}

FlorianRappl commented 5 years ago

As I wrote - the document is most likely never updated as the JS crashes. I guess there is a chance you could find a log entry in the debug log about a JavaScriptException. Again, the documentation mentions that AngleSharp.Js is not ready yet for such tasks - it could work, but my feeling here is that it does not (also I haven't received any detailed info on the script, e.g., is it based on some framework? which libraries does it use? how was it produced, e.g., ES target? ...).

The spinning here is not really AngleSharp specific (after all, it's just a simple polling mechanism) and is thus not in the documentation. I guess we could add a helper method (WaitUntilAvailable) and mention it in the docs.

Since you read the docs I assume you also know why there are AngleSharp.Scripting.Js and AngleSharp.Js (https://github.com/AngleSharp/AngleSharp/blob/master/doc/Migration.md#scripting).

Thus this is an infinite loop (and you should set a max. time - this is what frameworks like Selenium do; e.g., 10 seconds - you can easily achieve this with a CancellationToken which can be automatically fired after some time).
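A minimal sketch of such a bounded poll, for illustration only. The `Polling` class and `WaitUntilAsync` method are hypothetical names and not part of AngleSharp; the helper uses only the .NET standard library and relies on a `CancellationTokenSource` that cancels itself after the given timeout.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Polling
{
    // Hypothetical helper (not an AngleSharp API): calls `probe` repeatedly until it
    // returns a non-null value, or throws TimeoutException once `timeout` has elapsed.
    public static async Task<T> WaitUntilAsync<T>(
        Func<T> probe, TimeSpan timeout, TimeSpan pollInterval) where T : class
    {
        // The token source cancels itself automatically after `timeout`.
        using (var cts = new CancellationTokenSource(timeout))
        {
            while (true)
            {
                var result = probe();
                if (result != null)
                {
                    return result;
                }

                try
                {
                    // Waits between probes; throws once the token has fired.
                    await Task.Delay(pollInterval, cts.Token);
                }
                catch (OperationCanceledException)
                {
                    throw new TimeoutException(
                        $"Condition not met within {timeout.TotalSeconds} seconds.");
                }
            }
        }
    }
}
```

With an AngleSharp document this could be called as `await Polling.WaitUntilAsync(() => document.QuerySelector("form"), TimeSpan.FromSeconds(10), TimeSpan.FromMilliseconds(250))`. It only bounds the wait: if the page's scripts crash and the DOM is never updated, the call ends with a `TimeoutException` instead of spinning forever.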

HTH!

s0thl commented 5 years ago

Hi,

> I haven't received any detailed info on the script, e.g., is it based on some framework? which libraries does it use? how was it produced, e.g., ES target? ...

Those are mostly ES6 scripts bundled with webpack. Nothing fancy.

> Since you read the docs I assume you also know why there are AngleSharp.Scripting.Js and AngleSharp.Js

Well, since AngleSharp.Scripting.Js is not compatible with v0.10 and I'm using the latest available NuGet version of AngleSharp (that is, 0.12.1), I guess using AngleSharp.Js is the way to go?

> Thus this is an infinite loop (and you should set a max. time - this is what frameworks like Selenium do; e.g., 10 seconds - you can easily achieve this with a CancellationToken which can be automatically fired after some time).

The infinite loop was just for testing purposes. If I were able to make it work, I would script it properly and perhaps create a WaitUntilAvailable method.

In this case, if I have done everything properly and am still unable to make it work, and you don't have more solutions, then I think that maybe AngleSharp isn't the right tool for me at this time. That's unfortunate, because I really hoped it would be a brilliant lightweight solution for automated testing. I still think I might dig deeper into the JS package to understand the issue and help improve it. I will do that in my spare time. 👍

Thank you for your engagement!

FlorianRappl commented 5 years ago

Yes I think this is the right way forward.

We have that use case on our roadmap, but unfortunately our resources are limited and the undertaking is massive (to say the least...).

I know you wanted to avoid a larger / more bloated solution, but at this point in time our recommendation for such a use case is definitely using browser automation. Note: This does not mean you need to use Selenium, but rather using the web driver spec (https://w3c.github.io/webdriver/).

Any contribution to AngleSharp.Js (e.g., just providing a case that does not work with a MWE to reproduce and fix it) would be much appreciated :beers:!

FlorianRappl commented 5 years ago

Reopen for the WaitUntil... helper methods and additional documentation.

Also for potential enhancements / bug reports on that / similar topics.

FlorianRappl commented 4 years ago

Landed in devel.