umbraco / Umbraco-CMS

Umbraco is a free and open source .NET content management system helping you deliver delightful digital experiences.
https://umbraco.com
MIT License
4.46k stars 2.68k forks source link

First member update after rebuilding the index and application start cuts off all custom indexed fields from custom member index #8766

Closed deekoulos closed 4 years ago

deekoulos commented 4 years ago

Umbraco 8.6.4

Steps to reproduce (locally):

Add some property types to the standard member type: "Member" image

Create a test member of that member type and add a value to the created property type image

Create a custom member index

public class MemberIndexCreator : LuceneIndexCreator, IUmbracoIndexesCreator
    {
        private readonly IProfilingLogger _profilingLogger;
        private readonly ILocalizationService _localizationService;
        private readonly IPublicAccessService _publicAccessService;

        // Since Umbraco 8 has dependency injection out of the box, we can use it to inject
        // the different services that we need.
        public MemberIndexCreator(IProfilingLogger profilingLogger,
            ILocalizationService localizationService,
            IPublicAccessService publicAccessService
        )
        {
            _profilingLogger = profilingLogger;
            _localizationService = localizationService;
            _publicAccessService = publicAccessService;
        }

        // Noticed that we return a collection of indexes? Technically you
        // can create multiple indexes in an indexCreator :) You can have a look at
        // UmbracoIndexesCreator.cs in the CMS core and see how the CMS does that.
        public override IEnumerable<IIndex> Create()
        {
            var index = new UmbracoMemberIndex("FeedbaxMembersIndex",
                new UmbracoFieldDefinitionCollection(),
                CreateFileSystemLuceneDirectory("FeedbaxMembers"),
                new CultureInvariantWhitespaceAnalyzer(),
                //new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30),
                _profilingLogger,
                new MemberValueSetValidator(includeItemTypes: new string[] { "Member" }, null, null, null));

            return new[] { index };
        }
    }

    public class MemberComponent : IComponent
    {
        private readonly IExamineManager _examineManager;
        private readonly MemberIndexCreator _memberIndexCreator;

        public MemberComponent(IExamineManager examineManager, MemberIndexCreator memberIndexCreator)
        {
            _examineManager = examineManager;
            _memberIndexCreator = memberIndexCreator;
        }

        public void Initialize()
        {
            // Because the Create method returns a collection of indexes,
            // we have to loop through them.
            foreach (var index in _memberIndexCreator.Create())
            {
                index.FieldDefinitionCollection.AddOrUpdate(new FieldDefinition("minimumProjectSize", FieldDefinitionTypes.Integer));

                _examineManager.AddIndex(index);
            }
        }

        public void Terminate() { }
    }

    public class MemberComposer : IUserComposer
    {
        public void Compose(Composition composition)
        {
            composition.Components().Append<MemberComponent>();
            composition.RegisterUnique<MemberIndexCreator>();
        }
    }

Run the application, navigate to BO > Examine Management > Click on the created index > Rebuild the index > search for the created test member image

Now you should see that the member is indexed properly, including the additional property types with the correct values for the indexed fields image

Stop the application, and start it again, so the custom index should get created and inserted again

Navigate to BO > Examine Management > Click on the created index > search for the created test member It should show exactly the same state of the index for the member, all fields are indexed properly

Navigate to BO > Members > All Members > Select the created test member > Change the value of the created property type > Save the member image

Navigate to BO > Examine Management > Click on the created index > search for the created test member Now it shows only 9 indexed fields for the member image

Click on the fields and make a note of the fields image

Go back to the members section, select the test member , change the value for the added property type again, save the member image

Navigate to BO > Examine Management > Click on the created index > search for the created test member Now it shows ALL indexed fields for the member which is the correct state, it should have shown after the first update image

After many attempts, I recognized this pattern over and over again. It doesn't matter if you update the member through BO or UI. So after rebuilding the index and restarting the application, the first member update cuts off all custom property types, only 9 remain. The 2nd member update and all updates after that index all property types properly.

Why is this pattern happening? Within the states where only 9 fields are indexed, the member is not findable through the index by all other fields, which are needed in the application. Rebuilding the index get all fields indexed, but these should be indexed right after the first member update, without rebuilding the index, since EnableDefaultEventHandler is set to true.

Best Dee

deekoulos commented 4 years ago

I finally found a workaround for the index update issue: Rebuilding the custom member index, right after the creation and adding to the umbraco examine indexes.

In the documentation of creating a custom index, there is no mentioning of the need for rebuilding the index. So I am not sure, if this is the solution and should be updated also in the documentation or just a workaround, since this behaviour is not normal?

Maybe this is just the case, when we create a custom member index and not a custom content index. The documentation is based on custom content index and document types: https://our.umbraco.com/Documentation/Reference/Searching/Examine/indexing/#index-based-on-document-types

Inspired by that, I managed to create a custom member index based on a member type, which is a bit different approach.

public class MemberComponent : IComponent
    {
        private readonly IExamineManager _examineManager;
        private readonly MemberIndexCreator _memberIndexCreator;
        private readonly IndexRebuilder _indexRebuilder;

        public MemberComponent(IExamineManager examineManager, MemberIndexCreator memberIndexCreator, IndexRebuilder indexRebuilder)
        {
            _examineManager = examineManager;
            _memberIndexCreator = memberIndexCreator;
            _indexRebuilder = indexRebuilder;
        }

        public void Initialize()
        {
            // Because the Create method returns a collection of indexes,
            // we have to loop through them.
            foreach (var index in _memberIndexCreator.Create())
            {
                index.FieldDefinitionCollection.AddOrUpdate(new FieldDefinition("minimumProjectSize", FieldDefinitionTypes.Integer));

                _examineManager.AddIndex(index);
            }

            _indexRebuilder.RebuildIndex("FeedbaxMembersIndex");
        }

        public void Terminate() { }
    }
deekoulos commented 4 years ago

Update:

RebuildIndex causes sometimes a Lucene Directory is Locked issue. So I need to unlock the DIR before its added to the examineManager:

        public void Initialize()
        {
            // Because the Create method returns a collection of indexes,
            // we have to loop through them.
            foreach (var index in _memberIndexCreator.Create())
            {
                index.FieldDefinitionCollection.AddOrUpdate(new FieldDefinition("minimumProjectSize", FieldDefinitionTypes.Integer));
                index.FieldDefinitionCollection.AddOrUpdate(new FieldDefinition("memberGroups", FieldDefinitionTypes.FullTextSortable));

                var luceneIndex = index as LuceneIndex;
                var dir = luceneIndex.GetLuceneDirectory();
                if (IndexWriter.IsLocked(dir))
                    IndexWriter.Unlock(dir);

                _examineManager.AddIndex(index);
            }

            _indexRebuilder.RebuildIndex("FeedbaxMembersIndex");
        }
nul800sebastiaan commented 4 years ago

Hi there @deekoulos Indeed, I can reproduce this, what an odd problem.

Just for reference if someone wants to try this out by dropping the code in App_Code:

using Umbraco.Core;
using Umbraco.Core.Composing;
using Umbraco.Core.Services;
using System.Collections.Generic;
using Examine;
using Examine.LuceneEngine;
using Umbraco.Core.Logging;
using Umbraco.Examine;
using Umbraco.Web.Search;

namespace TestNameSpace
{
    public class MemberIndexCreator : LuceneIndexCreator, IUmbracoIndexesCreator
    {
        private readonly IProfilingLogger _profilingLogger;
        private readonly ILocalizationService _localizationService;
        private readonly IPublicAccessService _publicAccessService;

        // Since Umbraco 8 has dependency injection out of the box, we can use it to inject
        // the different services that we need.
        public MemberIndexCreator(IProfilingLogger profilingLogger,
            ILocalizationService localizationService,
            IPublicAccessService publicAccessService
        )
        {
            _profilingLogger = profilingLogger;
            _localizationService = localizationService;
            _publicAccessService = publicAccessService;
        }

        // Noticed that we return a collection of indexes? Technically you
        // can create multiple indexes in an indexCreator :) You can have a look at
        // UmbracoIndexesCreator.cs in the CMS core and see how the CMS does that.
        public override IEnumerable<IIndex> Create()
        {
            var index = new UmbracoMemberIndex("FeedbaxMembersIndex",
                new UmbracoFieldDefinitionCollection(),
                CreateFileSystemLuceneDirectory("FeedbaxMembers"),
                new CultureInvariantWhitespaceAnalyzer(),
                _profilingLogger,
                new MemberValueSetValidator(new[] { "Member" }, null, null, null));

            return new[] { index };
        }
    }

    public class MemberComponent : IComponent
    {
        private readonly IExamineManager _examineManager;
        private readonly MemberIndexCreator _memberIndexCreator;

        public MemberComponent(IExamineManager examineManager, MemberIndexCreator memberIndexCreator)
        {
            _examineManager = examineManager;
            _memberIndexCreator = memberIndexCreator;
        }

        public void Initialize()
        {
            // Because the Create method returns a collection of indexes,
            // we have to loop through them.
            foreach (var index in _memberIndexCreator.Create())
            {
                index.FieldDefinitionCollection.AddOrUpdate(new FieldDefinition("minimumProjectSize", FieldDefinitionTypes.Integer));

                _examineManager.AddIndex(index);
            }
        }

        public void Terminate() { }
    }

    public class MemberComposer : IUserComposer
    {
        public void Compose(Composition composition)
        {
            composition.Components().Append<MemberComponent>();
            composition.RegisterUnique<MemberIndexCreator>();
        }
    }
}

Here's what the repro looks like, I even managed to get a weird number of 17 fields. It helps to do a recycle before changing the member data, but that was not always needed, like in this case.

8766

deekoulos commented 4 years ago

Hey @nul800sebastiaan,

thank god you can reproduce this... yeah, indeed a very weird issue. Fortunately I have a workaround for this so far, but still very odd...

deekoulos commented 4 years ago

Hi @nul800sebastiaan ,

I have an update for this. Although I thought that I have a working workaround, It's not working all the time. Here is the case:

By rebuilding the index as mentioned in my workaround, the first update is not cutting off the indexed fields anymore, BUT if I update the member several times, for instance by setting the "Is Approved" property to "true" and save, set it back to false and save, and again back to true and so on, and during these updates, after every member save, I checked the current state of the member in the index and was able to observe, that some fields are still getting cut off and come back by the next save.

What I could observe: fields like "umbracoMemberLockedOut" "createDate", "creatorId", "level" etc. (all system fields) are those, which are disappearing from the index, without any reasons. When these fields are getting out of the index, the index is not able to update itself and raises "Error indexing queue items" Exceptions:

Exception
System.ArgumentNullException: Value cannot be null.
Parametername: key
   bei System.Collections.Concurrent.ConcurrentDictionary`2.TryGetValue(TKey key, TValue& value)
   bei Umbraco.Examine.UmbracoFieldDefinitionCollection.TryGetValue(String fieldName, FieldDefinition& fieldDefinition) in D:\a\1\s\src\Umbraco.Examine\UmbracoFieldDefinitionCollection.cs:Zeile 72.
   bei Examine.LuceneEngine.Providers.LuceneIndex.AddDocument(Document doc, ValueSet valueSet, IndexWriter writer) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndex.cs:Zeile 676.
   bei Umbraco.Examine.UmbracoExamineIndex.AddDocument(Document doc, ValueSet valueSet, IndexWriter writer) in D:\a\1\s\src\Umbraco.Examine\UmbracoExamineIndex.cs:Zeile 168.
   bei Examine.LuceneEngine.Providers.LuceneIndex.ProcessIndexQueueItem(IndexOperation op, IndexWriter writer) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndex.cs:Zeile 1216.
   bei Examine.LuceneEngine.Providers.LuceneIndex.ProcessQueueItem(IndexOperation item, IndexWriter writer) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndex.cs:Zeile 1035.
   bei Examine.LuceneEngine.Providers.LuceneIndex.ForceProcessQueueItems(Boolean block) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndex.cs:Zeile 894.

I have to admit that, this is really driving me mad, since the GoLive deadline is in a few weeks, and the whole business case is based on profiling large amount of members (> 15k members). The member service has extreme performance issues handling many members, which was also costing a lot of dev and test time. Using the index is theoretically the right approach, but not when it's so instable with these kinds of errors.

So what are the next steps? Is anyone already working on this issue? Can any of the examine developers provide a quick and reliable workaround? I only see the status "needs investigation" which is not really clear to me.

Please support on this to achieve a better Umbraco experience.

Thanks Dee

nul800sebastiaan commented 4 years ago

@deekoulos I understand that you're getting worried and if it's critical I'm going to send you to our sales people who will happily sell you a support contract 😅

Let's start with this - I got some feedback from a colleague:

And one more comment which I'm going to quote verbatim as I'm not sure what it means, I'm no expert:

Perhaps related, but each index has a ValueSetValidator which is synonymous with the old Examine config to filter fields. We ship with MemberValueSetValidator which can be changed but have a look at that class since it limits the fields that go into the member index. Perhaps there's an issue with sometimes letting additional fields in (which it shouldn't do). In their code, they are adding additional member fields but not changing the validator.

Maybe that's something you can look into, I'm not sure what that entails.

deekoulos commented 4 years ago

@nul800sebastiaan thanks a lot for sharing the feedbacks from your colleague. If you could keep on the communication to find a solution/workaround, you would save my a** really!!!

Regarding the additional index comment: Extending the built in member index was my first approach. But according to the documentation, the mentioned approach for extending the member index is not working. see my related Umbraco doc issue:

https://github.com/umbraco/UmbracoDocs/issues/2683

So maybe someone takes a look into that issue, so I don't need a custom member index. The mentioned point from your colleague regarding having multiple indexes is absolutely correct, which I also pointed out in the doc issue above.

Regarding the rebuilding approach: I don't rebuild the index after each update or at any other times. The only rebuilding is on application start, after the index is created and added to the examine indexes. So I guess, in this case it should be fine so far.

Regarding the ValueSetValidator: I couldn't find any documentation on how to build a custom member index. So what I did is, based on the documentation here: https://our.umbraco.com/Documentation/Reference/Searching/Examine/indexing/#index-based-on-document-types

I managed to create a custom member index, where I pass the member type "Member" as a parameter, and the MemberValueSetValidator creates automatically all the member fields. The MemberValuesetValidator only allows following parameters: IncludeItemTypes, ExcludeItemTypes, IncludeFields, ExcludeFields. Nothing else. So I don't know, what your colleague means by "In their code, they are adding additional member fields but not changing the validator."

Include Fields is for my understanding the same approach like includeItemTypes, but there you need to define every field manually and add it to the include fields parameter. I am not sure, if I understood correctly the point, that the memberValueSetValidator limits the fields that go into the member index. Since after a rebuild, all fields are indexed properly, and as I mentioned in my last comment, the fields that are getting out of the index are all system fields not application fields.

deekoulos commented 4 years ago

Update:

I changed the code a bit, by replacing all the parameters with null:

public override IEnumerable<IIndex> Create()
        {
            var index = new UmbracoMemberIndex("FeedbaxMembersIndex",
                new UmbracoFieldDefinitionCollection(),
                CreateFileSystemLuceneDirectory("FeedbaxMembers"),
                new CultureInvariantWhitespaceAnalyzer(),
                //new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30),
                _profilingLogger,
                new MemberValueSetValidator(null, null, null, null));

            return new[] { index };
        }

It's indeed indexing all member fields, without mentioning the member type. The core issue that fields get out of the index still remains. Unfortunately there is no documentation to this approach, neither on OUR or in the code itself, saying what the parameters are for.

It would be great, to include code documentation or at least code comments into the Definition of Done for umbraco developments.

nul800sebastiaan commented 4 years ago

This is all completely new to me, and I've never tried this before so let me know if I'm missing something. I've tested some things with a few members and adding a single extra field (minimumProjectSize).

Since I've been told: don't add new indexes, I didn't. I took this documentation: https://our.umbraco.com/documentation/Reference/Searching/Examine/indexing/ and following the first two steps I got this code:

using Examine;
using Umbraco.Core;
using Umbraco.Core.Composing;
using Umbraco.Core.Services;
using Umbraco.Examine;

namespace Testing
{
    public class CustomizeIndexComposer : ComponentComposer<CustomizeIndexComponent>
    {
    }

    public class CustomizeIndexComponent : IComponent
    {
        private readonly IExamineManager _examineManager;

        public CustomizeIndexComponent(IExamineManager examineManager)
        {
            _examineManager = examineManager;
        }

        public void Initialize()
        {
            // get the member index
            if (!_examineManager.TryGetIndex(Constants.UmbracoIndexes.MembersIndexName, out IIndex index))
                return;

            // add a custom field type
            var projectSize = new FieldDefinition("minimumProjectSize", FieldDefinitionTypes.Integer);
            index.FieldDefinitionCollection.TryAdd(projectSize);
        }

        public void Terminate()
        {
        }
    }

    public class ConfigureIndexComposer : IUserComposer
    {
        public void Compose(Composition composition)
        {
            // replace the default IUmbracoIndexConfig definition
            composition.RegisterUnique<IUmbracoIndexConfig, CustomIndexConfig>();
        }
    }

    public class CustomIndexConfig : UmbracoIndexConfig, IUmbracoIndexConfig
    {
        public CustomIndexConfig(IPublicAccessService publicAccessService) : base(publicAccessService)
        {
        }

        IValueSetValidator IUmbracoIndexConfig.GetMemberValueSetValidator()
        {
            var excludeFields = Constants.Conventions.Member.GetStandardPropertyTypeStubs().Keys;

            return new MemberValueSetValidator(null, null, null, excludeFields);
        }
    }
}

From what I understand:

image

image

This now all works and I don't "lose" fields between restarts and updates.

I think this is also completely on spec with your requirements, no need to manually add fields if the member type ever updates, etc.

So I would encourage you to riff of off this, remove (really, really, really!!) remove all that index rebuilding code. Remove the custom index code and start clean with this and see how it goes for you. I only have 3 members but in my previous tests I did as well, so now that I can't repro any more, I have good hope you wont be able to repro any more either.

deekoulos commented 4 years ago

@nul800sebastiaan you are my HERO! This is it, it's working exactly like i needed. A monster high 5er!

And now because of your solution, I also found out why my code for extending the member index was not executing. Like Dave here https://our.umbraco.com/forum/using-umbraco-and-getting-started/102514-search-member-index-by-custom-fields I did the same mistake. The method composition.RegisterUnique<IUmbracoIndexConfig, CustomIndexConfig>(); is not defined within the Umbraco.Core.Composing namespace. This namespace defines composition.RegisterUniqueFor<IUmbracoIndexConfig, CustomIndexConfig>(); which is not triggering the needed method GetMemberValueSetValidator. So I also thought this is a mistake within the documentation, which I raised in the doc issue. But it is not a mistake.

We need to additionally import the Umbraco.Core namespace, which then includes the extension methods for the compositions, which on the other hand implement the method composition.RegisterUnique<IUmbracoIndexConfig, CustomIndexConfig>(); without "for".

Visual Studio without the Umbraco.Core namespace imported suggests to use the RegisterUniqueFor method instead, which is wrong: image

Many thanks!!

nul800sebastiaan commented 4 years ago

@nul800sebastiaan you are my HERO! This is it, it's working exactly like i needed. A monster high 5er!

Thanks! Hint, I like craft beer 😉

The docs are correct, but as you noted, we should add usings. Anyway, I'll close this issue, good luck with the launch!

mcl-sz commented 4 years ago

Hi @nul800sebastiaan,

Since I've been told: don't add new indexes

Do you mean it is best practice to not create a custom memberindex at all? We use a customer index for extra fields. It is better to use only the buildin Memberindex?

Shazwazza commented 3 years ago

The less indexes the better. There is typically no reason to create extra indexes unless you are doing something extremely custom. Why doesn't the built in one suit your needs?

stefankip commented 3 years ago

With the default members index I was running into this issue: https://our.umbraco.com/forum/using-umbraco-and-getting-started/105368-customize-examine-index-not-being-used-on-boot