zzzprojects / html-agility-pack

Html Agility Pack (HAP) is a free and open-source HTML parser written in C# to read/write DOM and supports plain XPATH or XSLT. It is a .NET code library that allows you to parse "out of the web" HTML files.
https://html-agility-pack.net
MIT License

Method not found: '!!0 HtmlAgilityPack.HtmlNode.GetEncapsulatedData()'. in UWP #517

Closed hippieZhou closed 11 months ago

hippieZhou commented 11 months ago

1. Description

When I use the newest package (HtmlAgilityPack 1.11.53) in my UWP project, I get the exception shown below.

When I reference the source code in my project instead of the package, it works fine.

2. Exception

Exception message: Method not found: '!!0 HtmlAgilityPack.HtmlNode.GetEncapsulatedData()'.
Stack trace: null

3. Fiddle or Project

You can reproduce this exception with this sample: Mvp.zip

elgonzo commented 11 months ago

Note that HtmlAgilityPack has a platform flavour for UAP 10.0 made specifically for UWP projects. Your UWP app, however, pulls in the .NET Standard 2.0 flavour of HtmlAgilityPack through its project dependency on your class library project. Certain HtmlAgilityPack functions implemented in the .NET Standard 2.0 flavour might not work in a UWP app.

The UAP platform flavour of HtmlAgilityPack does not offer the GetEncapsulatedData API, strongly indicating that the GetEncapsulatedData implementation is not usable with UWP apps.

It is important to note that adherence to the .NET Standard specification does NOT ensure platform compatibility. Quoting https://learn.microsoft.com/en-us/dotnet/standard/net-standard#net-standard-problems:

".NET Standard exposes platform-specific APIs. Your code might compile without errors and appear to be portable to any platform even if it isn't portable. When it runs on a platform that doesn't have an implementation for a given API, you get run-time errors."


Preferred and clean solution

The preferred solution to your problem is to multi-target your class library, creating separate build targets for .NET Standard and for UAP (https://learn.microsoft.com/en-us/nuget/create-packages/multiple-target-frameworks-project-file). The UAP flavour of your class library would then have to use the UAP flavour of HtmlAgilityPack, while the .NET Standard 2.0 flavour of your class library would keep using the .NET Standard 2.0 flavour of HtmlAgilityPack.

As a consequence, the UAP flavour of your library would be unable to use GetEncapsulatedData, and you would need to implement equivalent functionality in your class library that works within UWP applications.


Hacking: Tinker, tinker, waste some time and it perhaps still won't work...

If you nevertheless insist on your UWP app using the .NET Standard 2.0 class library that employs HtmlAgilityPack's GetEncapsulatedData method, you might try disabling .NET Native in your UWP app and checking whether the exception still occurs. The implementation of GetEncapsulatedData relies heavily on reflection, and the .NET Native compiler might not handle this reflection-heavy implementation well (https://learn.microsoft.com/en-us/windows/uwp/dotnet-native/reflection-and-net-native), missing certain types, members or method invocations that are only accessed or invoked through reflection. Note that even if disabling .NET Native eliminates the exception you observed, it might not be sufficient to make the .NET Standard 2.0 flavour of HtmlAgilityPack work properly in every respect.
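For reference, a minimal sketch of how that experiment is usually set up: in a typical Visual Studio UWP project, .NET Native compilation is switched per build configuration through the UseDotNetNativeToolchain MSBuild property in the app's .csproj. The Condition value below is an assumed example, not taken from the sample project in this issue; match it to the configuration you actually build.

```xml
<!-- In the UWP app's .csproj. The Condition value is an assumed example. -->
<PropertyGroup Condition="'$(Configuration)|$(Platform)' == 'Release|x64'">
  <!-- false = build this configuration without .NET Native compilation -->
  <UseDotNetNativeToolchain>false</UseDotNetNativeToolchain>
</PropertyGroup>
```

If the exception disappears with this setting, that points towards .NET Native's handling of reflection as the cause. If it persists, the cause lies elsewhere.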

If disabling .NET Native in your UWP app rectifies the issue but you still want to use the .NET Native runtime, you might try guiding the .NET Native compiler by writing a "runtime directives file". Sadly, I have no experience with UWP or runtime directives files, so I can't give specific advice on exactly what to put into the runtime directives file or how. But the documentation here might help or serve as a starting point for further research into the issue: https://learn.microsoft.com/en-us/windows/uwp/dotnet-native/runtime-directives-rd-xml-configuration-file-reference. I also can't guarantee that this will make GetEncapsulatedData work in a UAP/UWP environment rather than just being an exercise in futility...
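As a starting point only, a runtime directives file of the kind described in the linked reference might look like the untested sketch below. The first Assembly entry is the default that the UWP project template generates. The HtmlAgilityPack entry is an assumed addition, and MyClassLibrary is a hypothetical name standing in for the class library that declares the data classes. Whether these directives are enough to make GetEncapsulatedData resolve under .NET Native is not confirmed anywhere in this thread.

```xml
<!-- Properties/Default.rd.xml in the UWP app project (untested sketch) -->
<Directives xmlns="http://schemas.microsoft.com/netfx/2013/01/metadata">
  <Application>
    <!-- Default entry from the UWP project template -->
    <Assembly Name="*Application*" Dynamic="Required All" />
    <!-- Assumed addition: keep HtmlAgilityPack's metadata available for reflection -->
    <Assembly Name="HtmlAgilityPack" Dynamic="Required All" />
    <!-- Hypothetical assembly name: the library holding the encapsulated data classes -->
    <Assembly Name="MyClassLibrary" Dynamic="Required All" />
  </Application>
</Directives>
```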

P.S.: I am not the author/maintainer of the library, just a user.

hippieZhou commented 11 months ago

Thank you for your reply, I think you are right. By the way, is there a better way for this scenario? I'm confused about why, when I add the source code (not the package) to my solution, I don't get this exception.

elgonzo commented 11 months ago

"is there a better way for this scenario?"

I don't know what the scenario is. All I know is that you want to get some data from some HTML (by virtue of calling GetEncapsulatedData), but I don't know what the data is or how complex the data and the source HTML are. Perhaps your code could just process the HTML structure directly to obtain the data you want and create the desired data object instances "manually". Whether this is simple or complicated I don't know, because judging this would require knowing what and how much data you need to obtain from where in the source HTML.
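To illustrate what such "manual" processing could look like, here is a minimal sketch. The LinkInfo class, the ManualExtractor helper and the HTML shape (anchors with class "item") are all hypothetical, since the actual data and markup are not known from this issue. It sticks to basic HtmlAgilityPack calls (HtmlDocument.LoadHtml, Descendants, GetAttributeValue, InnerText, HtmlEntity.DeEntitize) and deliberately avoids GetEncapsulatedData. That these calls are available in the UAP flavour is an assumption worth verifying in your project.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical data class; replace with whatever your app actually needs.
public sealed class LinkInfo
{
    public string Title { get; set; }
    public string Url { get; set; }
}

public static class ManualExtractor
{
    // Walks the parsed DOM and builds the data objects by hand,
    // instead of relying on the reflection-based GetEncapsulatedData.
    public static List<LinkInfo> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
            .Descendants("a")
            // Assumed markup: only anchors with class="item" carry data.
            .Where(a => a.GetAttributeValue("class", "") == "item")
            .Select(a => new LinkInfo
            {
                // Decode HTML entities such as &amp; in the link text.
                Title = HtmlEntity.DeEntitize(a.InnerText).Trim(),
                Url = a.GetAttributeValue("href", "")
            })
            .ToList();
    }
}
```

Calling ManualExtractor.ExtractLinks(html) with the page source returns one LinkInfo per matching anchor. More complex data would need more of this hand-written mapping, which is the trade-off against the attribute-driven approach.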

"I'm confused about why, when I add the source code (not the package) to my solution, I don't get this exception."

I can't say without knowing in detail how exactly you included the HAP source code in your solution. There are too many unknowns regarding this for me to make a reasonable guess. A wild speculation might be that, for some reason, when using the NuGet package, your UWP project (or some other dependency of that project) pulls in the UAP flavour of HtmlAgilityPack, effectively taking priority over the .NET Standard 2.0 flavour that the dependency on the class library project wants to pull in. The class library itself is still compiled against the .NET Standard 2.0 flavour of HAP, but at runtime the UAP 10.0 flavour of HAP is present. And when the code in the class library calls GetEncapsulatedData, the runtime doesn't find that method in the loaded HAP assembly, because that assembly is the UAP flavour of HAP.

Keep in mind that this is just blind speculation on my part. There might very well be other reasons at play here, relating to peculiarities of the UAP/UWP build target and platform that I am entirely oblivious to due to my glaring lack of UAP/UWP experience... ;-P

hippieZhou commented 11 months ago

With your explanation, I realized that I need to create my web data manually.