SDKits / ExamineX

Issue tracker for ExamineX
https://examinex.online

Umbraco content not being indexed, only media #93

Status: Closed (chrden closed this issue 9 months ago)

chrden commented 9 months ago

Version: Umbraco 12.1.2, ExamineX.AzureSearch 5.0.1, ExamineX.AzureSearch.Umbraco 5.0.1

Summary: I cannot get my Umbraco content nodes indexed at all. Only media is indexed.

Detail: I keep getting the error below when rebuilding the External index:

Azure.RequestFailedException: The request is invalid. Details: An unexpected 'StartArray' node was found when reading from the JSON reader. A 'PrimitiveValue' node was expected.
Status: 400 (Bad Request)

My ConfigureServices in Startup.cs looks like this:

public void ConfigureServices(IServiceCollection services)
{
    var config = services
        .AddUmbraco(_env, _config)
        .AddBackOffice()
        .AddWebsite()
        .AddComposers()
        .AddCustomConfigureOptions()
        .AddCustomComponents()
        .AddCustomContentFinders()
        .AddCustomNotificationHandlers()
        .AddCustomServices();

    ...
}

public static IUmbracoBuilder AddCustomComponents(this IUmbracoBuilder builder)
{
    builder.Components().Append<ExternalIndexComponent>();

    return builder;
}

public static IUmbracoBuilder AddCustomConfigureOptions(this IUmbracoBuilder builder)
{
    builder.Services.PostConfigure<AzureSearchIndexOptions>(UmbracoIndexes.ExternalIndexName,
        options =>
        {
            options.FieldDefinitions.AddOrUpdate(new FieldDefinition(ExamineConstants.Fields.NodeNameSortable, AzureSearchFieldDefinitionTypes.FullTextSortable));
            options.FieldDefinitions.AddOrUpdate(new FieldDefinition(ExamineConstants.Fields.Schema.PageTitle, AzureSearchFieldDefinitionTypes.Raw));
            options.FieldDefinitions.AddOrUpdate(new FieldDefinition(Koben.Umbraco.Constants.Examine.Fields.HomeId, AzureSearchFieldDefinitionTypes.Integer));
            options.FieldDefinitions.AddOrUpdate(new FieldDefinition(ExamineConstants.Fields.DisplayDate, AzureSearchFieldDefinitionTypes.DateTime));
            options.FieldDefinitions.AddOrUpdate(new FieldDefinition(Koben.Umbraco.Constants.Examine.Fields.Umbraco.SortOrder, AzureSearchFieldDefinitionTypes.Integer));
        }
    );

    return builder;
}

I also have the following Logging setup in my Development environment:

"Serilog": {
    "MinimumLevel": {
        "Default": "Information",
        "Override": {
            "ExamineX.AzureSearch.AzureSearchIndex": "Debug"
        }
    },

The above logging produces the attached screenshots (1 and 2) on each rebuild of the External index.

The Error entries in those logs are where the Azure.RequestFailedException is logged.

Testing: I have tried rebuilding with and without the custom FieldDefinitions (by commenting out the code shown above), and the same logs occur each time.

Only the Media section appears to be indexed successfully each time.

Let me know if any further information is needed.

Thanks

chrden commented 9 months ago

@Shazwazza From the error, I believe a field on one of my document types may be causing the issue by passing an array where a single value (an int, for example) is expected. However, I cannot determine which field or value is responsible, because the logs do not contain that information.

chrden commented 9 months ago

@Shazwazza After a lot of digging, I found the issue.

It turns out that two of my doc types had a field called icon. Its value was being appended to the existing icon field, which is used for the Document type icon. This meant two values were found for icon (hence StartArray) when the index was trying to set it to a single string (hence PrimitiveValue).
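To make the StartArray / PrimitiveValue wording concrete, here is a minimal sketch of how a field with one value and a field with two values look once serialized to JSON. It uses System.Text.Json only for illustration; how ExamineX actually builds its request payload is an assumption, and the value "icon-star" is a hypothetical second icon value.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// A document whose "icon" field holds exactly one value.
var single = new Dictionary<string, object>
{
    ["icon"] = "icon-document"
};

// The same document after a second value was appended to "icon"
// ("icon-star" is a made-up value for illustration).
var multi = new Dictionary<string, object>
{
    ["icon"] = new[] { "icon-document", "icon-star" }
};

string singleJson = JsonSerializer.Serialize(single);
string multiJson = JsonSerializer.Serialize(multi);

// A single value serializes as a JSON primitive (a string).
Console.WriteLine(singleJson); // {"icon":"icon-document"}

// Two values serialize as a JSON array, which is what a reader
// expecting a primitive would report as an unexpected StartArray.
Console.WriteLine(multiJson);  // {"icon":["icon-document","icon-star"]}
```

If the index field is defined as a single string, a payload shaped like the second line would plausibly be rejected with the 400 error quoted at the top of this issue.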

I found this by adding logging to the TransformingIndexValues event and outputting the node ID for each node until I found the ones that were failing.
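For anyone hitting the same error, a rough sketch of that kind of diagnostic logging is below. It is not the exact code I used: the component name IndexDiagnosticsComponent is hypothetical, and it assumes the Examine TransformingIndexValues event and ValueSet API as shipped with Umbraco 12. Note that a field with several values is not always a problem (some fields are legitimately multi-valued), so the output is only a list of candidates to check against the index field definitions.

```csharp
using System.Linq;
using Examine;
using Microsoft.Extensions.Logging;
using Umbraco.Cms.Core;
using Umbraco.Cms.Core.Composing;

// Hypothetical diagnostic component: logs every node that carries more than
// one value for a field while the External index is being built.
public class IndexDiagnosticsComponent : IComponent
{
    private readonly IExamineManager _examineManager;
    private readonly ILogger<IndexDiagnosticsComponent> _logger;

    public IndexDiagnosticsComponent(
        IExamineManager examineManager,
        ILogger<IndexDiagnosticsComponent> logger)
    {
        _examineManager = examineManager;
        _logger = logger;
    }

    public void Initialize()
    {
        // Subscribe only if the External index is registered.
        if (_examineManager.TryGetIndex(Constants.UmbracoIndexes.ExternalIndexName, out IIndex index))
        {
            index.TransformingIndexValues += OnTransformingIndexValues;
        }
    }

    public void Terminate()
    {
        if (_examineManager.TryGetIndex(Constants.UmbracoIndexes.ExternalIndexName, out IIndex index))
        {
            index.TransformingIndexValues -= OnTransformingIndexValues;
        }
    }

    private void OnTransformingIndexValues(object? sender, IndexingItemEventArgs e)
    {
        // Each field maps to a list of values; more than one entry means the
        // field would be sent as an array rather than a single primitive.
        foreach (var field in e.ValueSet.Values.Where(f => f.Value.Count > 1))
        {
            _logger.LogWarning(
                "Node {NodeId} has {Count} values for field {Field}",
                e.ValueSet.Id,
                field.Value.Count,
                field.Key);
        }
    }
}
```

It can be registered the same way as the ExternalIndexComponent above, for example with builder.Components().Append<IndexDiagnosticsComponent>(), and removed again once the offending nodes are found.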

Do you think you could add this type of logging, or something similar, so that it makes it clearer in the logs which nodes are causing the error?

Thanks

Shazwazza commented 9 months ago

Hi @chrden, thanks for the report and the research. I do wish that Azure Search errors were a lot less vague! Yeah, the StartArray type of errors are typically due to a mismatch between what a field is configured for and what it receives: a single value is expected but multiple values are sent, or the other way around. I totally support adding additional logging one way or another to make debugging this easier.

I need to make sure we don't have any performance penalties in doing such things. I'll try to look into the error logging that is currently done to see if there's an easy way to add more information there.