umbraco / Umbraco-CMS

Umbraco is a free and open source .NET content management system helping you deliver delightful digital experiences.
https://umbraco.com
MIT License
4.49k stars 2.69k forks source link

v8 Indexing issue causing AppDomain to crash #3683

Closed Shazwazza closed 5 years ago

Shazwazza commented 6 years ago

This exception occurs on a background thread which takes down the whole app domain.

We need to debug into why this is happening.

The error in the logs is:

2018-09-10 15:11:43,841 [P1948/D3/T29] ERROR  Umbraco.Core.UmbracoApplicationBase - Unhandled exception in AppDomain (terminating).
Lucene.Net.Index.MergePolicy+MergeException: Exception of type 'Lucene.Net.Index.MergePolicy+MergeException' was thrown. ---> System.IO.IOException: Cannot overwrite: C:\Users\Shannon\Documents\_Projects\Umbraco\Umbraco_8.0\src\Umbraco.Web.UI\App_Data\TEMP\ExamineIndexes\Internal\Index\_11l.tis
   at Lucene.Net.Index.IndexWriter.HandleMergeException(Exception t, OneMerge merge) in d:\Lucene.Net\FullRepo\trunk\src\core\Index\IndexWriter.cs:line 4641
   at Lucene.Net.Index.IndexWriter.Merge(OneMerge merge) in d:\Lucene.Net\FullRepo\trunk\src\core\Index\IndexWriter.cs:line 4674
   at Lucene.Net.Index.ConcurrentMergeScheduler.MergeThread.Run() in d:\Lucene.Net\FullRepo\trunk\src\core\Index\ConcurrentMergeScheduler.cs:line 370
   --- End of inner exception stack trace ---
   at Lucene.Net.Index.ConcurrentMergeScheduler.HandleMergeException(Exception exc) in d:\Lucene.Net\FullRepo\trunk\src\core\Index\ConcurrentMergeScheduler.cs:line 430
   at Lucene.Net.Index.ConcurrentMergeScheduler.MergeThread.Run() in d:\Lucene.Net\FullRepo\trunk\src\core\Index\ConcurrentMergeScheduler.cs:line 394
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()

not sure how to reproduce but i get this once in a while when i restart the app pool by bumping the web.config

clausjensen commented 5 years ago

I haven't seen this one and the issue was created 5 months ago - I'll close for now and we can reopen if it still happens?

mszwi commented 5 years ago

Looks like I'm also getting this error. The site has crashed multiple times in the last few months, I've upgraded to latest version. Anything I can do to help debug this? Can't manage to reproduce either. But last time i happened after replacing a bunch of files on the server after changes. (including web.config)

{"@t":"2019-09-26T13:04:33.2641294Z","@mt":"Unhandled exception in AppDomain (terminating).","@l":"Error","@x":"Lucene.Net.Index.MergePolicy+MergeException: Exception of type 'Lucene.Net.Index.MergePolicy+MergeException' was thrown. ---> Lucene.Net.Index.CorruptIndexException: doc counts differ for segment _l2: fieldsReader shows 244 but segmentInfo shows 242\r\n at Lucene.Net.Index.IndexWriter.HandleMergeException(Exception t, OneMerge merge) in d:\\Lucene.Net\\FullRepo\\trunk\\src\\core\\Index\\IndexWriter.cs:line 4641\r\n at Lucene.Net.Index.IndexWriter.Merge(OneMerge merge) in d:\\Lucene.Net\\FullRepo\\trunk\\src\\core\\Index\\IndexWriter.cs:line 4674\r\n at Lucene.Net.Index.ConcurrentMergeScheduler.MergeThread.Run() in d:\\Lucene.Net\\FullRepo\\trunk\\src\\core\\Index\\ConcurrentMergeScheduler.cs:line 370\r\n --- End of inner exception stack trace ---\r\n at Lucene.Net.Index.ConcurrentMergeScheduler.HandleMergeException(Exception exc) in d:\\Lucene.Net\\FullRepo\\trunk\\src\\core\\Index\\ConcurrentMergeScheduler.cs:line 430\r\n at Lucene.Net.Index.ConcurrentMergeScheduler.MergeThread.Run() in d:\\Lucene.Net\\FullRepo\\trunk\\src\\core\\Index\\ConcurrentMergeScheduler.cs:line 394\r\n at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)\r\n at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)\r\n at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)\r\n at System.Threading.ThreadHelper.ThreadStart()","SourceContext":"Umbraco.Core.Runtime.CoreRuntime","ProcessId":25816,"ProcessName":"w3wp","ThreadId":18,"AppDomainId":2,"AppDomainAppId":"LMW3SVC49ROOT","MachineName":"WIN-RFGB09RCI22","Log4NetLevel":"ERROR"}

nickornotto commented 4 years ago

We've been getting the same issue for a few months now. I have described the issue here: https://stackoverflow.com/questions/59725408/unhandled-exception-in-appdomain-read-past-eof-error-in-lucene-indexing

It seems to happen when performing save on a content - either node or a media. I've just got it when tried to save an image content and lots of our client get this when trying to do the save of their products on our website.

We've got v 7.15.1 on Windows 2016 which has been recently updated.

Shazwazza commented 4 years ago

@nickornotto I've never seen this issue before. Are you load balancing? running the site of a network share? sharing the same file system between sites? etc... ? This is not a behavior I've ever seen so I can only guess that it's a site configuration type of thing.

nickornotto commented 4 years ago

No, we are not load balancing, we just have a dedidated single server. What kind of configuration? You @Shazwazza and @mszwi have reported it above so have you found the fix for yourselves?

I have quoted 2 type of erros in my stackoverflow post and both are happening under the same circumstances - when trying to save a content (no matter via CMS or the front site). One of them only is the same you mentioned above.

nickornotto commented 4 years ago

Hi, Is anything happening with this one? We investigated more and we first noticed this issue almost 1 year ago - back in April 2019. So we were still on version 7.6.0 back then.

However now it is happening much more often than before and it virtually stops us from doing most of content updates whether it's via CMS or the API.

As it is affecting all functionality of our site on frequent basis we consider it to be a very serious matter.

I will be grateful for any information if a progress has been made or is planned on this. Thank you.

Shazwazza commented 4 years ago

@nickornotto Do you have steps to replicate this issue from a vanilla Umbraco install? I guess I had seen this error before since I reported it 🤦‍♂ but i have never seen it since so I'm unsure how to make it happen and without replication steps it's pretty hard to identify an issue. I also believe I was reporting this against the v8 development which has changed dramatically since then.

nickornotto commented 4 years ago

@Shazwazza I have no idea. We didn't try it with another umbraco instance. We've got one more site on the same server but it's in Umbraco 6 and rarely used. I never saw this problem there though. We've had this problem in version 7.6.0 before we upgraded to 7.15.1 - if it persists in v8 for you then it's bad. We changed the hosting in Sep 2018 from shared (Win 2012) to dedicated (Win 2016) and I don't recall the problem happenning on the old server although I can't be 100% as the issue seems to be happening more and more often so we might have spot it late.

We are just experiencing the error now - we are trying and are unable to publish one of our blog posts. It looks published in the CMS while on the public site every macro containing it crashes.

The error pruduced:

 2020-01-28 08:28:34,217 [P46372/D2/T13] INFO  Umbraco.Core.Publishing.PublishingStrategy - Content '6 Religious Festivals in India that Bring us Together' with Id '20855' has been published.
 2020-01-28 08:28:37,248 [P46372/D2/T11] ERROR Umbraco.Core.UmbracoApplicationBase - Unhandled exception in AppDomain (terminating)
System.IndexOutOfRangeException: Index of bound 6
   at Lucene.Net.Util.BitVector.GetAndSet(Int32 bit)
   at Lucene.Net.Index.SegmentReader.DoDelete(Int32 docNum)
   at Lucene.Net.Index.IndexReader.DeleteDocument(Int32 docNum)
   at Lucene.Net.Index.DocumentsWriter.ApplyDeletes(IndexReader reader, Int32 docIDStart)
   at Lucene.Net.Index.DocumentsWriter.ApplyDeletes(SegmentInfos infos)
   at Lucene.Net.Index.IndexWriter.ApplyDeletes()
   at Lucene.Net.Index.IndexWriter.DoFlushInternal(Boolean flushDocStores, Boolean flushDeletes)
   at Lucene.Net.Index.IndexWriter.DoFlush(Boolean flushDocStores, Boolean flushDeletes)
   at Lucene.Net.Index.IndexWriter.PrepareCommit(IDictionary`2 commitUserData, Boolean internal_Renamed)
   at Lucene.Net.Index.IndexWriter.Commit(IDictionary`2 commitUserData)
   at Examine.LuceneEngine.Providers.LuceneIndexer.IndexCommiter.TimerRelease() in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndexer.cs:line 1711
   at Examine.LuceneEngine.Providers.LuceneIndexer.IndexCommiter.<ScheduleCommit>b__7_0(Object _) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndexer.cs:line 1663
   at System.Threading.TimerQueueTimer.CallCallbackInContext(Object state)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.TimerQueueTimer.CallCallback()
   at System.Threading.TimerQueueTimer.Fire()
   at System.Threading.TimerQueue.FireNextTimers()
   at System.Threading.TimerQueue.AppDomainTimerCallback(Int32 id)
 2020-01-28 08:28:40,983 [P3644/D2/T1] INFO  Umbraco.Core.CoreBootManager - Umbraco 7.15.1 application starting on WIN-M6...2S0HP3

It can crash the site upon publishing any other type of content though. I am more than happy to share with you our log file(s) to show the sequence of events.

We are experiecing some other lucene and app errors which may be related, eg. it has trouble to get a media file (eg. logo) - this kind of error started appearing exactly after midnight 1/1/2020 and is not the problem with the media as the media file works on one page but it can't be get on another.

nickornotto commented 4 years ago

I noticed another issue that looks exactly the same: https://github.com/umbraco/Umbraco-CMS/issues/3823 The author is not responding though.

Shazwazza commented 4 years ago

@nickornotto v8 development and examine/lucene is very different from v7 so when i saw the error I think it was part of that development process.

Are there other issues that you are experiencing on your site? Do you have constant restarts in your log?

Also can you please double check that you are running with the latest Examine version for v7 https://www.nuget.org/packages/Examine/0.1.90

To confirm though - you are saying that this problem actually started when you changed the hosting?

Is the file system on your new server on a network share?

That new stack trace is different from the others but at least that shows where an issue may arrise from but to me this sounds like an issue with overlapping appdomains (i.e. concurrent restarts). This is also an error I've definitely never seen. Also note that warnings in the log about media and examine is because your indexes our out of date and it will be querying the DB directly for media which is slow. Please rebuild your indexes.

Shazwazza commented 4 years ago

There's a few search results found with that new stack trace/error you posted, one on Our: https://our.umbraco.com/forum/using-umbraco-and-getting-started/94839-entire-site-has-a-503-error#comment-299964 which you can try, and another on the lucene.net mail lists which has questions which are essentially the same types of questions i would ask http://mail-archives.apache.org/mod_mbox/lucenenet-user/201406.mbox/%3C53A11E96.5090801@devhost.se%3E

nickornotto commented 4 years ago

@Shazwazza

In case of get_Media error - you say we should rebuild the index - does the indes is not being rebuild automatically when you update a content in the CMS? We also do get similar get_Media error when a user is trying to update his images on the website (via API). Shouldn't the API rebuild the index on publish?

Shazwazza commented 4 years ago

In case of get_Media error - you say we should rebuild the index - does the indes is not being rebuild automatically when you update a content in the CMS? We also do get similar get_Media error when a user is trying to update his images on the website (via API). Shouldn't the API rebuild the index on publish?

Indexes don't ever rebuild unless they are empty. Indexes stay up to date with the data in Umbraco automatically. Pretty much the only reason why they will get out of sync is if you have constant restart issues - you didn't get back to me on that, are you seeing constant restarts in your logs?

Most important question - can you replicate this problem on your local machine since local replication is the only way we'll ever figure this out. (i.e. steps to replicate)

nickornotto commented 4 years ago

@Shazwazza No, we do not have unexpected server restarts. Sorry if I missed that before.

These are just logs from today:

2020-02-06 19:06:58.283 | 1 | ERROR | Umbraco.Web.ExamineStartup | The index PublishedProductsIndexer is rebuilding due to being unreadable/corrupt
-- | -- | -- | -- | --
2020-02-06 19:06:58.267 | 1 | ERROR | Umbraco.Web.ExamineStartup | The index PublishedContentIndexer is rebuilding due to being unreadable/corrupt
2020-02-06 19:06:58.250 | 1 | ERROR | Umbraco.Web.ExamineStartup | The index ExternalIndexer is rebuilding due to being unreadable/corrupt
2020-02-06 19:06:58.157 | 1 | ERROR | Umbraco.Web.ExamineStartup | The index SupplierProductsIndexer is rebuilding due to being unreadable/corrupt
2020-02-06 18:18:27.307 | 120 | ERROR | Umbraco.Web.WebApi.Filters.AngularAntiForgeryHelper | Could not validate XSRF token
2020-02-06 14:20:31.750 | 96 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=SupplierBookingsIndexer, NodeId=-1
2020-02-06 14:20:31.750 | 130 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=InternalIndexer, NodeId=-1
2020-02-06 14:20:31.750 | 136 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=SupplierProductsIndexer, NodeId=-1
2020-02-06 14:20:21.543 | 100 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=SupplierBookingsIndexer, NodeId=-1
2020-02-06 14:20:21.543 | 10 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=InternalIndexer, NodeId=-1
2020-02-06 14:20:21.543 | 90 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=SupplierProductsIndexer, NodeId=-1
2020-02-06 10:56:26.447 | 19 | ERROR | Umbraco.Core.UmbracoApplicationBase | An unhandled exception occurred
2020-02-06 09:50:58.930 | 16 | ERROR | Umbraco.Core.UmbracoApplicationBase | An unhandled exception occurred
2020-02-06 08:42:44.907 | 30 | ERROR | umbraco.content | Failed to save Xml to file.
2020-02-06 08:42:39.547 | 28 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=PublishedProductsIndexer, NodeId=-1
2020-02-06 08:42:39.547 | 33 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=ExternalIndexer, NodeId=-1
2020-02-06 08:42:39.547 | 24 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=PublishedContentIndexer, NodeId=-1
2020-02-06 08:42:39.547 | 12 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=LocationsIndexer, NodeId=-1
2020-02-06 08:42:39.560 | 31 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=SupplierIndexer, NodeId=-1
2020-02-06 08:42:39.547 | 32 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=ReviewsIndexer, NodeId=-1
2020-02-06 08:42:39.187 | 35 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=SupplierProductsIndexer, NodeId=-1
2020-02-06 08:42:39.187 | 22 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=SupplierBookingsIndexer, NodeId=-1
2020-02-06 08:42:39.187 | 30 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=InternalIndexer, NodeId=-1
2020-02-06 06:49:55.003 | 1 | ERROR | Umbraco.Web.ExamineStartup | The index InternalIndexer is rebuilding due to being unreadable/corrupt
2020-02-06 06:44:04.623 | 152 | ERROR | Umbraco.Core.UmbracoApplicationBase | An unhandled exception occurred
2020-02-06 05:35:17.840 | 5 | ERROR | Umbraco.Core.UmbracoApplicationBase | An unhandled exception occurred
2020-02-06 05:35:12.043 | 13 | ERROR | Umbraco.Core.UmbracoApplicationBase | An unhandled exception occurred
2020-02-06 06:44:17.437 | 1 | ERROR | Umbraco.Web.ExamineStartup | The index InternalIndexer is rebuilding due to being unreadable/corrupt
2020-02-06 05:37:58.473 | 22 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=SupplierIndexer, NodeId=-1
2020-02-06 05:37:58.347 | 12 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=ReviewsIndexer, NodeId=-1
2020-02-06 05:37:58.347 | 14 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=PublishedContentIndexer, NodeId=-1
2020-02-06 05:37:58.347 | 13 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=PublishedProductsIndexer, NodeId=-1
2020-02-06 05:37:58.347 | 21 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=ExternalIndexer, NodeId=-1
2020-02-06 05:37:58.347 | 24 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=SupplierBookingsIndexer, NodeId=-1
2020-02-06 05:37:58.347 | 19 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=LocationsIndexer, NodeId=-1
2020-02-06 05:37:58.347 | 22 | ERROR | UmbracoExamine.DataServices.UmbracoLogService | Provider=SupplierProductsIndexer, NodeId=-1
2020-02-06 05:33:26.277 | 1 | ERROR | Umbraco.Web.ExamineStartup | The index InternalIndexer is rebuilding due to being unreadable/corrupt
2020-02-06 05:23:19.620 | 1 | ERROR | Umbraco.Web.ExamineStartup | The index InternalIndexer is rebuilding due to being unreadable/corrupt
2020-02-06 04:46:21.673 | 1 | ERROR | Umbraco.Web.ExamineStartup | The index InternalIndexer is rebuilding due to being unreadable/corrupt
2020-02-06 03:58:58.033 | 1 | ERROR | Umbraco.Web.ExamineStartup | The index InternalIndexer is rebuilding due to being unreadable/corrupt

We checked test site (on the same server) and there are no such error there but this may be because it is rarely used. So replicating it on a site that is not used would be very difficult. We did not experience it on localhost ever.

nickornotto commented 4 years ago

Sorry, update to the above: The above seem to be the result of the AppDomain crash, the sequence from 19:06 is like this:

The first list (in the previous comment) was from the db log, these here are from the event log and umbraco log file. Not sure why the db does not log every error.

From the server log: 19:06:37

Application: w3wp.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.IO.IOException
   at Lucene.Net.Store.FSDirectory.InitOutput(System.String)
   at Lucene.Net.Store.SimpleFSDirectory.CreateOutput(System.String)
   at Lucene.Net.Index.CompoundFileWriter.Close()
   at Lucene.Net.Index.DocumentsWriter.CreateCompoundFile(System.String)
   at Lucene.Net.Index.IndexWriter.DoFlushInternal(Boolean, Boolean)
   at Lucene.Net.Index.IndexWriter.DoFlush(Boolean, Boolean)
   at Lucene.Net.Index.IndexWriter.PrepareCommit(System.Collections.Generic.IDictionary`2<System.String,System.String>, Boolean)
   at Lucene.Net.Index.IndexWriter.Commit(System.Collections.Generic.IDictionary`2<System.String,System.String>)
   at Examine.LuceneEngine.Providers.LuceneIndexer+IndexCommiter.TimerRelease()
   at Examine.LuceneEngine.Providers.LuceneIndexer+IndexCommiter.<ScheduleCommit>b__7_0(System.Object)
   at System.Threading.TimerQueueTimer.CallCallbackInContext(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.TimerQueueTimer.CallCallback()
   at System.Threading.TimerQueueTimer.Fire()
   at System.Threading.TimerQueue.FireQueuedTimerCompletion(System.Object)
   at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()

19:06:37

An unhandled exception occurred and the process was terminated.

Application ID: /LM/W3SVC/8/ROOT

Process ID: 31188

Exception: System.IO.FileNotFoundException

Message: Could not find file 'D:\Inetpub\ghyhost\mydomain.com\httpdocs\App_Data\TEMP\ExamineIndexes\SupplierProducts\Index\_9o_1.del'.

StackTrace:    at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share)
   at Lucene.Net.Store.SimpleFSDirectory.OpenInput(String name, Int32 bufferSize)
   at Lucene.Net.Store.FSDirectory.OpenInput(String name)
   at Lucene.Net.Util.BitVector..ctor(Directory d, String name)
   at Lucene.Net.Index.SegmentReader.LoadDeletedDocs()
   at Lucene.Net.Index.SegmentReader.Get(Boolean readOnly, Directory dir, SegmentInfo si, Int32 readBufferSize, Boolean doOpenStores, Int32 termInfosIndexDivisor)
   at Lucene.Net.Index.IndexWriter.ReaderPool.Get(SegmentInfo info, Boolean doOpenStores, Int32 readBufferSize, Int32 termsIndexDivisor)
   at Lucene.Net.Index.IndexWriter.ReaderPool.Get(SegmentInfo info, Boolean doOpenStores)
   at Lucene.Net.Index.DocumentsWriter.ApplyDeletes(SegmentInfos infos)
   at Lucene.Net.Index.IndexWriter.ApplyDeletes()
   at Lucene.Net.Index.IndexWriter.DoFlushInternal(Boolean flushDocStores, Boolean flushDeletes)
   at Lucene.Net.Index.IndexWriter.DoFlush(Boolean flushDocStores, Boolean flushDeletes)
   at Lucene.Net.Index.IndexWriter.PrepareCommit(IDictionary`2 commitUserData, Boolean internal_Renamed)
   at Lucene.Net.Index.IndexWriter.Commit(IDictionary`2 commitUserData)
   at Examine.LuceneEngine.Providers.LuceneIndexer.IndexCommiter.TimerRelease() in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndexer.cs:line 1711
   at Examine.LuceneEngine.Providers.LuceneIndexer.IndexCommiter.<ScheduleCommit>b__7_0(Object _) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndexer.cs:line 1663
   at System.Threading.TimerQueueTimer.CallCallbackInContext(Object state)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.TimerQueueTimer.CallCallback()
   at System.Threading.TimerQueueTimer.Fire()
   at System.Threading.TimerQueue.FireQueuedTimerCompletion(Object state)
   at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()

Then: 19:06:38

An unhandled exception occurred and the process was terminated.

Application ID: /LM/W3SVC/14/ROOT

Process ID: 46988

Exception: Lucene.Net.Index.MergePolicy+MergeException

Message: Exception of type 'Lucene.Net.Index.MergePolicy+MergeException' was thrown.

StackTrace:    at Lucene.Net.Index.ConcurrentMergeScheduler.HandleMergeException(Exception exc)
   at Lucene.Net.Index.ConcurrentMergeScheduler.MergeThread.Run()
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()

InnerException: Lucene.Net.Index.CorruptIndexException

Message: doc counts differ for segment _9q: fieldsReader shows 1 but segmentInfo shows 4

StackTrace:    at Lucene.Net.Index.IndexWriter.HandleMergeException(Exception t, OneMerge merge)
   at Lucene.Net.Index.IndexWriter.Merge(OneMerge merge)
   at Lucene.Net.Index.ConcurrentMergeScheduler.DoMerge(OneMerge merge)
   at Lucene.Net.Index.ConcurrentMergeScheduler.MergeThread.Run()

The umbraco log file:

 2020-02-06 19:06:38,856 [P46988/D2/TLucene Merge Thread #0] ERROR Umbraco.Core.UmbracoApplicationBase - Unhandled exception in AppDomain (terminating)
Lucene.Net.Index.MergePolicy+MergeException: Exception of type 'Lucene.Net.Index.MergePolicy+MergeException' was thrown. ---> Lucene.Net.Index.CorruptIndexException: doc counts differ for segment _9q: fieldsReader shows 1 but segmentInfo shows 4
   at Lucene.Net.Index.IndexWriter.HandleMergeException(Exception t, OneMerge merge)
   at Lucene.Net.Index.IndexWriter.Merge(OneMerge merge)
   at Lucene.Net.Index.ConcurrentMergeScheduler.DoMerge(OneMerge merge)
   at Lucene.Net.Index.ConcurrentMergeScheduler.MergeThread.Run()
   --- End of inner exception stack trace ---
   at Lucene.Net.Index.ConcurrentMergeScheduler.HandleMergeException(Exception exc)
   at Lucene.Net.Index.ConcurrentMergeScheduler.MergeThread.Run()
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()
 2020-02-06 19:06:53,157 [P53964/D2/T1] INFO  Umbraco.Core.CoreBootManager - Umbraco 7.15.1 application starting on WIN-M6SHD2S0HP3
 2020-02-06 19:06:53,344 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Determining hash of code files on disk
 2020-02-06 19:06:53,360 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Hash determined (took 8ms)
 2020-02-06 19:06:53,422 [P53964/D2/T1] INFO  Umbraco.Core.MainDom - Acquiring MainDom...
 2020-02-06 19:06:53,422 [P53964/D2/T1] INFO  Umbraco.Core.MainDom - Acquired MainDom.
 2020-02-06 19:06:53,422 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving umbraco.interfaces.IDiscoverable
 2020-02-06 19:06:53,704 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved umbraco.interfaces.IDiscoverable (took 277ms)
 2020-02-06 19:06:53,704 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving umbraco.interfaces.IApplicationStartupHandler
 2020-02-06 19:06:53,704 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved umbraco.interfaces.IApplicationStartupHandler (took 3ms)
 2020-02-06 19:06:53,782 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving umbraco.interfaces.IDiscoverable
 2020-02-06 19:06:53,782 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved umbraco.interfaces.IDiscoverable (took 1ms)
 2020-02-06 19:06:53,782 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving Umbraco.Core.PropertyEditors.IPropertyEditorValueConverter
 2020-02-06 19:06:53,782 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved Umbraco.Core.PropertyEditors.IPropertyEditorValueConverter (took 1ms)
 2020-02-06 19:06:53,782 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving umbraco.interfaces.IDiscoverable
 2020-02-06 19:06:53,782 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved umbraco.interfaces.IDiscoverable (took 0ms)
 2020-02-06 19:06:53,782 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving Umbraco.Core.PropertyEditors.IPropertyValueConverter
 2020-02-06 19:06:53,797 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved Umbraco.Core.PropertyEditors.IPropertyValueConverter (took 4ms)
 2020-02-06 19:06:53,813 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving umbraco.interfaces.IDiscoverable
 2020-02-06 19:06:53,813 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved umbraco.interfaces.IDiscoverable (took 1ms)
 2020-02-06 19:06:53,813 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving Umbraco.Web.Mvc.SurfaceController
 2020-02-06 19:06:53,813 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved Umbraco.Web.Mvc.SurfaceController (took 1ms)
 2020-02-06 19:06:53,813 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving umbraco.interfaces.IDiscoverable
 2020-02-06 19:06:53,813 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved umbraco.interfaces.IDiscoverable (took 0ms)
 2020-02-06 19:06:53,813 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving Umbraco.Web.WebApi.UmbracoApiController
 2020-02-06 19:06:53,813 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved Umbraco.Web.WebApi.UmbracoApiController (took 3ms)
 2020-02-06 19:06:57,470 [P53964/D2/T1] INFO  Umbraco.Core.DatabaseContext - CanConnect = True
 2020-02-06 19:06:57,720 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving Umbraco.Core.Models.PublishedContent.PublishedContentModel
 2020-02-06 19:06:57,751 [P53964/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved Umbraco.Core.Models.PublishedContent.PublishedContentModel (took 31ms)
 2020-02-06 19:06:57,907 [P53964/D2/T1] INFO  Umbraco.Web.Cache.CacheRefresherEventHandler - Initializing Umbraco internal event handlers for cache refreshing
 2020-02-06 19:06:57,954 [P53964/D2/T1] INFO  Umbraco.Forms.Core.Cache.CacheEventHandler - Initializing Umbraco Forms Events to trigger Cache Refreshers
 2020-02-06 19:06:58,001 [P53964/D2/T1] INFO  Umbraco.Web.Search.ExamineEvents - Initializing Examine and binding to business logic events
 2020-02-06 19:06:58,157 [P53964/D2/T1] ERROR Umbraco.Web.ExamineStartup - The index SupplierProductsIndexer is rebuilding due to being unreadable/corrupt
Lucene.Net.Index.CorruptIndexException: doc counts differ for segment _9q: fieldsReader shows 1 but segmentInfo shows 4
   at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(IndexCommit commit)
   at Lucene.Net.Index.DirectoryReader.Open(Directory directory, IndexDeletionPolicy deletionPolicy, IndexCommit commit, Boolean readOnly, Int32 termInfosIndexDivisor)
   at Lucene.Net.Index.IndexReader.Open(Directory directory, Boolean readOnly)
   at Examine.LuceneEngine.Providers.LuceneIndexer.IsReadable(Exception& ex) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndexer.cs:line 962
 2020-02-06 19:06:58,251 [P53964/D2/T1] ERROR Umbraco.Web.ExamineStartup - The index ExternalIndexer is rebuilding due to being unreadable/corrupt
Lucene.Net.Index.CorruptIndexException: doc counts differ for segment _6v: fieldsReader shows 1 but segmentInfo shows 8
   at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(IndexCommit commit)
   at Lucene.Net.Index.DirectoryReader.Open(Directory directory, IndexDeletionPolicy deletionPolicy, IndexCommit commit, Boolean readOnly, Int32 termInfosIndexDivisor)
   at Lucene.Net.Index.IndexReader.Open(Directory directory, Boolean readOnly)
   at Examine.LuceneEngine.Providers.LuceneIndexer.IsReadable(Exception& ex) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndexer.cs:line 962
 2020-02-06 19:06:58,267 [P53964/D2/T1] ERROR Umbraco.Web.ExamineStartup - The index PublishedContentIndexer is rebuilding due to being unreadable/corrupt
Lucene.Net.Index.CorruptIndexException: doc counts differ for segment _1h: fieldsReader shows 1 but segmentInfo shows 5
   at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(IndexCommit commit)
   at Lucene.Net.Index.DirectoryReader.Open(Directory directory, IndexDeletionPolicy deletionPolicy, IndexCommit commit, Boolean readOnly, Int32 termInfosIndexDivisor)
   at Lucene.Net.Index.IndexReader.Open(Directory directory, Boolean readOnly)
   at Examine.LuceneEngine.Providers.LuceneIndexer.IsReadable(Exception& ex) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndexer.cs:line 962
 2020-02-06 19:06:58,282 [P53964/D2/T1] ERROR Umbraco.Web.ExamineStartup - The index PublishedProductsIndexer is rebuilding due to being unreadable/corrupt
Lucene.Net.Index.CorruptIndexException: doc counts differ for segment _o: fieldsReader shows 1 but segmentInfo shows 3
   at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(IndexCommit commit)
   at Lucene.Net.Index.DirectoryReader.Open(Directory directory, IndexDeletionPolicy deletionPolicy, IndexCommit commit, Boolean readOnly, Int32 termInfosIndexDivisor)
   at Lucene.Net.Index.IndexReader.Open(Directory directory, Boolean readOnly)
   at Examine.LuceneEngine.Providers.LuceneIndexer.IsReadable(Exception& ex) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndexer.cs:line 962
 2020-02-06 19:06:58,282 [P53964/D2/T1] INFO  Umbraco.Web.Search.ExamineEvents - Adding examine event handlers for index providers: 10

Also on the live site (where the problem occurs) it happens only every some time, not all the time. We sometimes need to spend some time to replicate it. But it's still too frequent so it prevents normal functionning.

Shazwazza commented 4 years ago

To me this still all indicates an environment issue or custom code. I notice you have SupplierProductsIndexer... what custom Examine and/or lucene code do you have running? You haven't mentioned any customizations that you have done which could lead to an issue. Without being able to replicate the problem it's near impossible to tell you what the problem could be especially since this problem is not seen on any other sites. Does your hosting environment run real time backups (or any backups at all)? Can you tell them to ignore lucene indexes? Do they run anti-virus software? Can you tell them to ignore Lucene indexes?

nickornotto commented 4 years ago

Oh sorry, I said somewhere that the same problem is with built in and our own indexes - not sure it was in this thread here though. We don't have any anti-virus software just using Windows Defender and also we excluded the App_Data files from the backup but it had no effect unfortunately.

nickornotto commented 4 years ago

The upgrade to 7.15.3 did not change anything.

nickornotto commented 4 years ago

We are still experiencing the issue in various forms: Lucene index erros keep appearing as MergeException or missing index file or even not finding the Lucene.Net assembly.

We are now suspecting that this is due to our configuration of the site and hostnames as per my post on the Umbraco forum: https://our.umbraco.com/forum/using-umbraco-and-getting-started/102097-whats-the-best-umbraco-websites-subdomains-and-indexes-configuration-to-avoid-lucene-errors

Can anyone (@Shazwazza ?) review the configuration and see if there is anything which could potentially prevent indexes functionning normally? Thank you