pnp / pnpcore

The PnP Core SDK is a modern .NET SDK designed to work with Microsoft 365. It provides a unified object model for working with SharePoint Online and Teams that is agnostic to the underlying APIs being called.
https://aka.ms/pnp/coresdk/docs
MIT License

GetPagesAsync results in a System.Text.Json error #1410

Closed by estruyf 6 months ago

estruyf commented 7 months ago

Describe the bug

Using the following code:

var pages = await context.Web.GetPagesAsync();

Throws the following exception on some of our sites:

System.Text.Json: The given key was not present in the dictionary.

Steps to reproduce

Following the steps described in the documentation to retrieve all pages from the site: https://pnp.github.io/pnpcore/using-the-sdk/pages-intro.html#load-all-the-pages-on-a-site

Expected behavior

Retrieving all pages.

jansenbe commented 7 months ago

@estruyf : anything special in the pages library? Can you share a stack trace? Also, if there are not too many pages in the pages library, try to isolate the one that is failing.

estruyf commented 7 months ago

Happening on a couple of sites, one where there are just 4 pages with only OOTB web parts:

   at System.Text.Json.JsonElement.GetProperty(String propertyName)
   at PnP.Core.Model.SharePoint.PageWebPart.FromHtml(IElement element)
   at PnP.Core.Model.SharePoint.Page.LoadFromHtml(String html, String pageHeaderHtml)
   at PnP.Core.Model.SharePoint.Page.LoadPageAsync(IList pagesLibrary, IListItem item)
   at PnP.Core.Model.SharePoint.Page.LoadPagesAsync(PnPContext context, String pageName)
   at PnP.Core.Model.SharePoint.Web.GetPagesAsync(String pageName)
   at Involv.Scan.Func.Helpers.SiteProcessor.ProcessPages(PnPContext context) in /SiteProcessor.cs:line 98

jansenbe commented 7 months ago

@estruyf : weird, this code has not changed for a long time, so this might be related to how the page content is stored as an HTML blob. Were these pages created using code or manually via the UI?

estruyf commented 7 months ago

They are a bunch of test pages with all sorts of web parts, as it is currently only running against a test environment.

jansenbe commented 7 months ago

I just tested with some of my sites and so far things just work. You'll have to load the pages one by one (providing the page name as input to the method) to figure out which page is breaking; I need more input to be able to reproduce this. Alternatively, you can debug the binaries, as we support Source Link: see https://pnp.github.io/pnpcore/using-the-sdk/basics-debug.html for how to configure that.

estruyf commented 7 months ago

@jansenbe understand, getting used to C# again 😄.

What I just did was get the Site Pages library items and retrieve all pages one by one. Two pages return an error:

[image]

Retrieving the page with CLI for Microsoft 365 works fine.

I removed the web part from the second page that gave me an issue and added it again, and the issue went away.
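
The one-by-one approach described above can be sketched roughly as follows. This is a minimal illustration, not code from the thread: `FindFailingPagesAsync` and `FakeLoad` are hypothetical names, and the loader is a stand-in so the sample runs without a tenant. Against a real site, the loader would presumably be something like `name => context.Web.GetPagesAsync(name)`, with the page names taken from the Site Pages library.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: tries to load each page on its own and records
// the ones that throw, instead of letting the first failure abort the run.
static async Task<List<(string Page, Exception Error)>> FindFailingPagesAsync(
    IEnumerable<string> pageNames, Func<string, Task> loadPage)
{
    var failures = new List<(string Page, Exception Error)>();
    foreach (var name in pageNames)
    {
        try
        {
            await loadPage(name);
        }
        catch (Exception ex)
        {
            failures.Add((name, ex));
        }
    }
    return failures;
}

// Stand-in loader (assumption): simulates one page whose content fails to parse.
static Task FakeLoad(string name) =>
    name == "Broken.aspx"
        ? Task.FromException(new KeyNotFoundException("The given key was not present in the dictionary."))
        : Task.CompletedTask;

var failing = await FindFailingPagesAsync(
    new[] { "Home.aspx", "Broken.aspx", "News.aspx" }, FakeLoad);

foreach (var (page, error) in failing)
{
    Console.WriteLine($"{page}: {error.GetType().Name}");
}
```

Collecting the failures rather than swallowing them keeps the broken pages visible, which is what makes it possible to narrow the problem down to a specific page and web part.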

jansenbe commented 7 months ago

@estruyf : if the page itself is corrupt, it can be the case that the API fails to load the page. I'm not sure what you want to be fixed here. Without a clear repro there's nothing I can do to fix things.

estruyf commented 7 months ago

Not sure what is possible, but I would expect that when trying to retrieve all Site Pages, it wouldn't fail when there is just one broken page.

What do you mean with a clear repo?

jansenbe commented 7 months ago

@estruyf : I need to be able to reproduce the issue from my end in order to understand what goes wrong. But as long as the page you're trying to load gives an error in the UI, I'm fine with it not loading correctly in PnP Core SDK: we parse the underlying HTML structure, and if that's messed up then the page will not load. The only place where I've seen this kind of issue in the past was when the page was created or manipulated using code, resulting in a broken experience.

Just eating the exceptions for the failing pages and continuing is not a good approach in my opinion, as failing pages are very exceptional.

estruyf commented 7 months ago

@jansenbe I can confirm it is on a manually created page with a broken web part.

What I did is:

protected onInit(): Promise<void> {
  throw new Error("Method not implemented.");
}

image

That should be enough to reproduce the issue, and I can also share the WP if you want. Feel free to reach out via Teams if you want to have a chat about it.

jansenbe commented 6 months ago

@estruyf : The 'broken' web part was wrongly identified as a header control. I've updated the logic to distinguish between header controls and web parts and included a 'fallback' scenario to grab the web part data. This prevents errors when loading your demo "broken" web part. The fix will be included in the next nightly; closing this issue now.
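
For readers wondering where the original exception comes from: `JsonElement.GetProperty` throws a `KeyNotFoundException` when the requested property is missing, while `JsonElement.TryGetProperty` reports the miss without throwing. The sketch below illustrates a fallback-style lookup built on that difference. It is not the SDK's actual fix; `ReadWebPartData` and the property names `webPartData` and `fallbackData` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical lookup: prefer one property, fall back to another,
// and return null instead of throwing when neither is present.
static string? ReadWebPartData(JsonElement control)
{
    if (control.TryGetProperty("webPartData", out var data))
    {
        return data.GetRawText();
    }
    if (control.TryGetProperty("fallbackData", out var fallback))
    {
        return fallback.GetRawText();
    }
    return null;
}

// Sample control JSON (assumption) that lacks the preferred property.
using var doc = JsonDocument.Parse("{\"id\":\"abc\",\"fallbackData\":{\"title\":\"demo\"}}");
JsonElement control = doc.RootElement;

try
{
    // Direct access throws because "webPartData" is not in the object.
    control.GetProperty("webPartData");
}
catch (KeyNotFoundException ex)
{
    Console.WriteLine($"GetProperty failed: {ex.Message}");
}

// The fallback lookup succeeds on the same element.
Console.WriteLine(ReadWebPartData(control));
```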