OrchardCMS / OrchardCore

Orchard Core is an open-source modular and multi-tenant application framework built with ASP.NET Core, and a content management system (CMS) built on top of that framework.
https://orchardcore.net
BSD 3-Clause "New" or "Revised" License
7.45k stars 2.41k forks source link

Lucene: lock obtain timed out #12786

Open PiemP opened 2 years ago

PiemP commented 2 years ago

Describe the bug

This happen randomly on our production server when users save a content type that are indexed by News index. Our production server is a little busy (abuot 19 of 20 GB of RAM are busy): I suppose this could be related to the issue but I hope to have some confirm about it.

This cause miss updating fo the index.

Here the stack trace:

2022-11-09 11:03:30.1740-ERROR|Vione|00-03150558224f3a55ca1ceaaf6a1d6a89-ae1974f89cb00c36-00||OrchardCore.Environment.Shell.Scope.ShellScope|Error while processing deferred task 'System.Func`2[[OrchardCore.Environment.Shell.Scope.ShellScope, OrchardCore.Abstractions, Version=1.4.0.0, Culture=neutral, PublicKeyToken=null],[System.Threading.Tasks.Task, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]' on tenant '...'. Lucene.Net.Store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@.................\Lucene\News\write.lock: System.IO.IOException: The process cannot access the file because another process has locked a portion of the file. : '.............\Lucene\News\write.lock'
   at System.IO.Strategies.FileStreamHelpers.Lock(SafeFileHandle handle, Boolean canWrite, Int64 position, Int64 length)
   at Lucene.Net.Store.NativeFSLock.Obtain()
 ---> System.IO.IOException: The process cannot access the file because another process has locked a portion of the file. : '.......\Lucene\News\write.lock'
   at System.IO.Strategies.FileStreamHelpers.Lock(SafeFileHandle handle, Boolean canWrite, Int64 position, Int64 length)
   at Lucene.Net.Store.NativeFSLock.Obtain()
   --- End of inner exception stack trace ---
   at Lucene.Net.Store.Lock.Obtain(Int64 lockWaitTimeout)
   at Lucene.Net.Index.IndexWriter..ctor(Directory d, IndexWriterConfig conf)
   at OrchardCore.Lucene.LuceneIndexManager.WriteAsync(String indexName, Action`1 action, Boolean close)
   at OrchardCore.Lucene.LuceneIndexManager.DeleteDocumentsAsync(String indexName, IEnumerable`1 contentItemIds)
   at OrchardCore.Lucene.Handlers.LuceneIndexingContentHandler.IndexingAsync(ShellScope scope, IEnumerable`1 contexts)
   at OrchardCore.Environment.Shell.Scope.ShellScope.<>c__DisplayClass53_0.<<BeforeDisposeAsync>b__0>d.MoveNext()    at Lucene.Net.Store.Lock.Obtain(Int64 lockWaitTimeout)
   at Lucene.Net.Index.IndexWriter..ctor(Directory d, IndexWriterConfig conf)
   at OrchardCore.Lucene.LuceneIndexManager.WriteAsync(String indexName, Action`1 action, Boolean close)
   at OrchardCore.Lucene.LuceneIndexManager.DeleteDocumentsAsync(String indexName, IEnumerable`1 contentItemIds)
   at OrchardCore.Lucene.Handlers.LuceneIndexingContentHandler.IndexingAsync(ShellScope scope, IEnumerable`1 contexts)
   at OrchardCore.Environment.Shell.Scope.ShellScope.<>c__DisplayClass53_0.<<BeforeDisposeAsync>b__0>d.MoveNext()

To Reproduce

  1. have a busy server (I suppose)
  2. create a content type
  3. create a lucene index
  4. add the content type to the new index and to the index used in "Default search index"

Expected behavior

no lock

Thank you for every support.

Skrypt commented 2 years ago

How long does it take to index all your content? And also, how is the background task set?

PiemP commented 2 years ago

How long does it take to index all your content? And also, how is the background task set?

After kill the handle on the write.lock file I have rebuild the index on the site and it doesn't take much time...about five seconds...I don't have a lot of content items on this content type but I have a lot of tenants on the same OrchardCore installation that do more or less the same things.

I don't know the kind of the background task that are called: on some tenants I have a workflow that is triggered when I publish an item of this content type but on other tenants I don't have this workflow and the issue happen same. Could be the task that add a new published item to the News index? I have an YesSql index related to this content type but I don't believe this could be related to the issue.

Generally the problem happen at the same time on the News index and on the Search index: the Search index is the one used on the OrchardCore search page.

Skrypt commented 2 years ago

I've had this issue when I had Luke.NET locking the index files while looking at them but I would need to be able to repro. Need to understand the conditions that made this result happen.

Skrypt commented 2 years ago

Normally, even if the server ram usage is near its top it should work by swapping it to drives. I don't think it is related to the server itself. There is an external process that locks these files. It could be another service from a different app that uses the same Lucene folders at the same time. But that's not something related to what we do in OC.

It could be an Antivirus locking the files too.

douwinga commented 2 years ago

We are seeing the same issue in production on Azure App Services. We can easily reproduce the issue by publishing a content item that is indexed by Lucene, reloading the tenant, and then publishing the same record again. We cannot reproduce when running locally, only when running in Azure. This is in OrchardCore 1.0, but we have this app updated to 1.4 in QA and it still happens. Hopefully this info helps. We have taken a few stabs at figuring the issue out over the months with no success.

We assumed it was our code for Azure Service Bus subscriptions at first, but the code does a tenant reload when settings change, and that is the actual cause of the problem. We can repro by saving those settings or going to the tenant list and hitting the "Reload" button to trigger the reload.

Edit: Just reproduced it locally too. Published Content Item -> Reload Tenant button clicked -> Published Content Item again

jtkech commented 2 years ago

Maybe because your Azure plan is scaled out to use multiple instances sharing the same file system.

As I remember, to save resources we don't release lucene writers, assuming that only one instance is running or that multiple instances don't share the same file system, maybe we could change this, so that the write.lock file would be freed.

douwinga commented 2 years ago

We are running on a single instance in Azure. I edited my comment above to state that I was actually able to reproduce locally (debugging via vs) as well. Curious to see if @PiemP can determine if this is the same cause for his locking issues.

jtkech commented 2 years ago

Ah okay good to know, thanks

With the tenant-removal branch, after releasing a tenant I was able to delete lucene index files.

So your use case is interesting to debug, I will try to repro when I will have time.

PiemP commented 2 years ago

Edit: Just reproduced it locally too. Published Content Item -> Reload Tenant button clicked -> Published Content Item again

I have try to reproduce this issue in local and in my pre release environment but nothing: I can't repro.

In this moment seems the issue is gone: I have added some virtual CPUs to machine in the cloud where OrchardCore running, but I think to have do some progress by better configuring IIS application pools related to the OrchardCore installation, by checking and applying some of this configuration:

Advanced configuration of the ASP.NET Core Module and IIS

In my production environment I have 85 tenants and the RAM consumption grows by 700 MB every 24 hours. This situation goes better than before: the ISS tweaking seems have do some improvement about this behavior, seem more similar to the situations that I have with the 1.0 version and more easy to manage. I try to monitoring the situation to add some other tweaking (like add back the application pools recycling in a specific moment to free up the RAM without restart the server).

write.lock handle

With ProcessEplorer I can see the handlers over write.lock file. This still remains after the tweaking, but I don't have the Lucene.Net.Store.LockObtainFailedException anymore.

I'm not an expert, I try to do my best but I'm here for every testing purpose.

PS. this morning I haved another strange issue with IIS but is not related to OrchardCore: the sites connected with one application pool returned 502 error code. Restart the application pool have solved the issue. No errors on the OrchardCore logs.

douwinga commented 2 years ago

We have now tried a new pull of OC 1.4 and cannot repro, so the locks for us are probably specific to our code. Still interesting that PiemP is seeing the same error as us for likely different causes.

Skrypt commented 2 years ago

At this point, it looks more like multi-tenant initialization issues. Disposal of the Lucene services and lock. You guys should start considering upgrading up to 1.5 and see from there. Else, 1.5 also adds the ElasticSearch option.

sebastienros commented 2 years ago

Since it's happening in 1.4 it can't be due to the recent changes in the module. It could be because of Azure App Service filesystem (we have found issues by the past), or from a bug in Lucene.NET that might be fixed in 1.5 (the library was updated). Or finally an existing bug in OC but we haven't seen any report in a long time.

PiemP commented 1 year ago

Is happened again. Definitely is related to the RAM consumption: when the server RAM goes near the machine limit this error come out again.

2022-12-06 09:16:42.8767-ERROR|Tenant_Name|00-15166c34206440994406a58044fde98a-edaf19fe50493e06-00||OrchardCore.Environment.Shell.Scope.ShellScope|Error while processing deferred task 'System.Func`2[[OrchardCore.Environment.Shell.Scope.ShellScope, OrchardCore.Abstractions, Version=1.4.0.0, Culture=neutral, PublicKeyToken=null],[System.Threading.Tasks.Task, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]' on tenant 'Tenant_Name'. Lucene.Net.Store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@...\App_Data\Sites\Tenant_Name\Lucene\News\write.lock: System.IO.IOException: The process cannot access the file because another process has locked a portion of the file. : '...\App_Data\Sites\Tenant_Name\Lucene\News\write.lock'
   at System.IO.Strategies.FileStreamHelpers.Lock(SafeFileHandle handle, Boolean canWrite, Int64 position, Int64 length)
   at System.IO.Strategies.OSFileStreamStrategy.Lock(Int64 position, Int64 length)
   at System.IO.Strategies.BufferedFileStreamStrategy.Lock(Int64 position, Int64 length)
   at System.IO.FileStream.Lock(Int64 position, Int64 length)
   at Lucene.Net.Store.NativeFSLock.Obtain()
 ---> System.IO.IOException: The process cannot access the file because another process has locked a portion of the file. : '...\App_Data\Sites\Tenant_Name\Lucene\News\write.lock'
   at System.IO.Strategies.FileStreamHelpers.Lock(SafeFileHandle handle, Boolean canWrite, Int64 position, Int64 length)
   at System.IO.Strategies.OSFileStreamStrategy.Lock(Int64 position, Int64 length)
   at System.IO.Strategies.BufferedFileStreamStrategy.Lock(Int64 position, Int64 length)
   at System.IO.FileStream.Lock(Int64 position, Int64 length)
   at Lucene.Net.Store.NativeFSLock.Obtain()
   --- End of inner exception stack trace ---
   at Lucene.Net.Store.Lock.Obtain(Int64 lockWaitTimeout)
   at Lucene.Net.Index.IndexWriter..ctor(Directory d, IndexWriterConfig conf)
   at OrchardCore.Lucene.IndexWriterWrapper..ctor(Directory directory, IndexWriterConfig config)
   at OrchardCore.Lucene.LuceneIndexManager.WriteAsync(String indexName, Action`1 action, Boolean close)
   at OrchardCore.Lucene.LuceneIndexManager.DeleteDocumentsAsync(String indexName, IEnumerable`1 contentItemIds)
   at OrchardCore.Lucene.Handlers.LuceneIndexingContentHandler.IndexingAsync(ShellScope scope, IEnumerable`1 contexts)
   at OrchardCore.Environment.Shell.Scope.ShellScope.<>c__DisplayClass53_0.<<BeforeDisposeAsync>b__0>d.MoveNext()    at Lucene.Net.Store.Lock.Obtain(Int64 lockWaitTimeout)
   at Lucene.Net.Index.IndexWriter..ctor(Directory d, IndexWriterConfig conf)
   at OrchardCore.Lucene.IndexWriterWrapper..ctor(Directory directory, IndexWriterConfig config)
   at OrchardCore.Lucene.LuceneIndexManager.WriteAsync(String indexName, Action`1 action, Boolean close)
   at OrchardCore.Lucene.LuceneIndexManager.DeleteDocumentsAsync(String indexName, IEnumerable`1 contentItemIds)
   at OrchardCore.Lucene.Handlers.LuceneIndexingContentHandler.IndexingAsync(ShellScope scope, IEnumerable`1 contexts)
   at OrchardCore.Environment.Shell.Scope.ShellScope.<>c__DisplayClass53_0.<<BeforeDisposeAsync>b__0>d.MoveNext()

If I try to rebuild the index locked from the backend I recieve this error:

2022-12-06 11:21:57.1967-ERROR|Tenant_Name|00-fd582c49cdb2ec6f8675c1e160c6baab-7c169d0eb2a1a373-00||Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware|An unhandled exception has occurred while executing the request. Lucene.Net.Index.IndexNotFoundException: no segments* file found in MMapDirectory@...\App_Data\Sites\Tenant_Name\Lucene\News lockFactory=NativeFSLockFactory@...\App_Data\Sites\Tenant_Name\Lucene\News: files: [write.lock, _c8.fdt, _c8.nvd, _c8_Lucene41_0.doc, _c8_Lucene41_0.pos, _c8_Lucene41_0.tim, _cb.cfs, _cf.cfs]
   at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(IndexCommit commit)
   at Lucene.Net.Index.StandardDirectoryReader.Open(Directory directory, IndexCommit commit, Int32 termInfosIndexDivisor)
   at Lucene.Net.Index.DirectoryReader.Open(Directory directory)
   at OrchardCore.Lucene.LuceneIndexManager.<>c__DisplayClass24_0.<GetReader>b__0(String n)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey key, Func`2 valueFactory)
   at OrchardCore.Lucene.LuceneIndexManager.GetReader(String indexName)
   at OrchardCore.Lucene.LuceneIndexManager.SearchAsync(String indexName, Func`2 searcher)
   at OrchardCore.Lucene.LuceneQuerySource.ExecuteQueryAsync(Query query, IDictionary`2 parameters)
   at OrchardCore.Queries.Liquid.QueryFilter.ProcessAsync(FluidValue input, FilterArguments arguments, LiquidTemplateContext ctx)
   at Fluid.Ast.FilterExpression.EvaluateAsync(TemplateContext context)
   at Fluid.Ast.AssignStatement.<WriteToAsync>g__Awaited|7_0(ValueTask`1 task, TemplateContext context, String identifier)
   at Fluid.Parser.FluidTemplate.Awaited(ValueTask`1 task, TextWriter writer, TextEncoder encoder, TemplateContext context, IReadOnlyList`1 statements, Int32 startIndex)
   at OrchardCore.DisplayManagement.Liquid.LiquidViewTemplate.RenderAsync(RazorPage`1 page)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageCoreAsync(IRazorPage page, ViewContext context)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageAsync(IRazorPage page, ViewContext context, Boolean invokeViewStarts)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderAsync(ViewContext context)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.HtmlHelper.RenderPartialCoreAsync(String partialViewName, Object model, ViewDataDictionary viewData, TextWriter writer)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.HtmlHelper.PartialAsync(String partialViewName, Object model, ViewDataDictionary viewData)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.<ProcessAsync>g__Awaited|11_0(Task`1 task)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Zones.ZoneShapes.ContentZone(IDisplayHelper DisplayAsync, Object Shape, IShapeFactory ShapeFactory)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.<ProcessAsync>g__Awaited|11_0(Task`1 task)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at CallSite.Target(Closure , CallSite , Object )
   at AspNetCoreGeneratedDocument.Views_Widget.ExecuteAsync()
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageCoreAsync(IRazorPage page, ViewContext context)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageAsync(IRazorPage page, ViewContext context, Boolean invokeViewStarts)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderAsync(ViewContext context)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.HtmlHelper.RenderPartialCoreAsync(String partialViewName, Object model, ViewDataDictionary viewData, TextWriter writer)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.HtmlHelper.PartialAsync(String partialViewName, Object model, ViewDataDictionary viewData)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.<ProcessAsync>g__Awaited|11_0(Task`1 task)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at AspNetCoreGeneratedDocument.Views_FlowPart.ExecuteAsync()
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageCoreAsync(IRazorPage page, ViewContext context)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageAsync(IRazorPage page, ViewContext context, Boolean invokeViewStarts)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderAsync(ViewContext context)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.HtmlHelper.RenderPartialCoreAsync(String partialViewName, Object model, ViewDataDictionary viewData, TextWriter writer)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.HtmlHelper.PartialAsync(String partialViewName, Object model, ViewDataDictionary viewData)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.<ProcessAsync>g__Awaited|11_0(Task`1 task)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Zones.ZoneShapes.ContentZone(IDisplayHelper DisplayAsync, Object Shape, IShapeFactory ShapeFactory)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.<ProcessAsync>g__Awaited|11_0(Task`1 task)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Liquid.Filters.ShapeRenderFilter.<ProcessAsync>g__Awaited|2_0(Task`1 task)
   at Fluid.Ast.FilterExpression.EvaluateAsync(TemplateContext context)
   at Fluid.Ast.OutputStatement.<WriteToAsync>g__Awaited|7_0(ValueTask`1 t, TextWriter w, TextEncoder enc, TemplateContext ctx)
   at Fluid.Parser.FluidTemplate.Awaited(ValueTask`1 task, TextWriter writer, TextEncoder encoder, TemplateContext context, IReadOnlyList`1 statements, Int32 startIndex)
   at OrchardCore.DisplayManagement.Liquid.LiquidViewTemplate.RenderAsync(RazorPage`1 page)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageCoreAsync(IRazorPage page, ViewContext context)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageAsync(IRazorPage page, ViewContext context, Boolean invokeViewStarts)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderAsync(ViewContext context)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.HtmlHelper.RenderPartialCoreAsync(String partialViewName, Object model, ViewDataDictionary viewData, TextWriter writer)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.HtmlHelper.PartialAsync(String partialViewName, Object model, ViewDataDictionary viewData)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.<ProcessAsync>g__Awaited|11_0(Task`1 task)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at CallSite.Target(Closure , CallSite , Object )
   at AspNetCoreGeneratedDocument.Views_Item_Display.ExecuteAsync()
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageCoreAsync(IRazorPage page, ViewContext context)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageAsync(IRazorPage page, ViewContext context, Boolean invokeViewStarts)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderAsync(ViewContext context)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.ViewExecutor.ExecuteAsync(ViewContext viewContext, String contentType, Nullable`1 statusCode)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.ViewExecutor.ExecuteAsync(ViewContext viewContext, String contentType, Nullable`1 statusCode)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.ViewExecutor.ExecuteAsync(ActionContext actionContext, IView view, ViewDataDictionary viewData, ITempDataDictionary tempData, String contentType, Nullable`1 statusCode)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.ViewResultExecutor.ExecuteAsync(ActionContext context, ViewResult result)
   at Microsoft.AspNetCore.Mvc.ViewResult.ExecuteResultAsync(ActionContext context)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeNextResultFilterAsync>g__Awaited|30_0[TFilter,TFilterAsync](ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Rethrow(ResultExecutedContextSealed context)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.ResultNext[TFilter,TFilterAsync](State& next, Scope& scope, Object& state, Boolean& isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeResultFilters>g__Awaited|28_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeNextResourceFilter>g__Awaited|25_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Rethrow(ResourceExecutedContextSealed context)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeFilterPipelineAsync>g__Awaited|20_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeAsync>g__Awaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeAsync>g__Awaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope)
   at Microsoft.AspNetCore.Routing.EndpointMiddleware.<Invoke>g__AwaitRequestTask|6_0(Endpoint endpoint, Task requestTask, ILogger logger)
   at OrchardCore.Apis.GraphQL.GraphQLMiddleware.Invoke(HttpContext context, IAuthorizationService authorizationService, IAuthenticationService authenticationService, ISchemaFactory schemaService)
   at Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(HttpContext context)
   at SixLabors.ImageSharp.Web.Middleware.ImageSharpMiddleware.Invoke(HttpContext httpContext, Boolean retry)
   at OrchardCore.Liquid.ScriptsMiddleware.Invoke(HttpContext httpContext)
   at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Localization.RequestLocalizationMiddleware.Invoke(HttpContext context)
   at OrchardCore.Diagnostics.DiagnosticsStartupFilter.<>c__DisplayClass3_0.<<Configure>b__1>d.MoveNext()
--- End of stack trace from previous location ---
   at Microsoft.AspNetCore.Diagnostics.StatusCodePagesMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.<Invoke>g__Awaited|6_0(ExceptionHandlerMiddleware middleware, HttpContext context, Task task)    at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(IndexCommit commit)
   at Lucene.Net.Index.StandardDirectoryReader.Open(Directory directory, IndexCommit commit, Int32 termInfosIndexDivisor)
   at Lucene.Net.Index.DirectoryReader.Open(Directory directory)
   at OrchardCore.Lucene.LuceneIndexManager.<>c__DisplayClass24_0.<GetReader>b__0(String n)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey key, Func`2 valueFactory)
   at OrchardCore.Lucene.LuceneIndexManager.GetReader(String indexName)
   at OrchardCore.Lucene.LuceneIndexManager.SearchAsync(String indexName, Func`2 searcher)
   at OrchardCore.Lucene.LuceneQuerySource.ExecuteQueryAsync(Query query, IDictionary`2 parameters)
   at OrchardCore.Queries.Liquid.QueryFilter.ProcessAsync(FluidValue input, FilterArguments arguments, LiquidTemplateContext ctx)
   at Fluid.Ast.FilterExpression.EvaluateAsync(TemplateContext context)
   at Fluid.Ast.AssignStatement.<WriteToAsync>g__Awaited|7_0(ValueTask`1 task, TemplateContext context, String identifier)
   at Fluid.Parser.FluidTemplate.Awaited(ValueTask`1 task, TextWriter writer, TextEncoder encoder, TemplateContext context, IReadOnlyList`1 statements, Int32 startIndex)
   at OrchardCore.DisplayManagement.Liquid.LiquidViewTemplate.RenderAsync(RazorPage`1 page)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageCoreAsync(IRazorPage page, ViewContext context)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageAsync(IRazorPage page, ViewContext context, Boolean invokeViewStarts)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderAsync(ViewContext context)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.HtmlHelper.RenderPartialCoreAsync(String partialViewName, Object model, ViewDataDictionary viewData, TextWriter writer)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.HtmlHelper.PartialAsync(String partialViewName, Object model, ViewDataDictionary viewData)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.<ProcessAsync>g__Awaited|11_0(Task`1 task)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Zones.ZoneShapes.ContentZone(IDisplayHelper DisplayAsync, Object Shape, IShapeFactory ShapeFactory)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.<ProcessAsync>g__Awaited|11_0(Task`1 task)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at CallSite.Target(Closure , CallSite , Object )
   at AspNetCoreGeneratedDocument.Views_Widget.ExecuteAsync()
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageCoreAsync(IRazorPage page, ViewContext context)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageAsync(IRazorPage page, ViewContext context, Boolean invokeViewStarts)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderAsync(ViewContext context)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.HtmlHelper.RenderPartialCoreAsync(String partialViewName, Object model, ViewDataDictionary viewData, TextWriter writer)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.HtmlHelper.PartialAsync(String partialViewName, Object model, ViewDataDictionary viewData)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.<ProcessAsync>g__Awaited|11_0(Task`1 task)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at AspNetCoreGeneratedDocument.Views_FlowPart.ExecuteAsync()
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageCoreAsync(IRazorPage page, ViewContext context)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageAsync(IRazorPage page, ViewContext context, Boolean invokeViewStarts)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderAsync(ViewContext context)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.HtmlHelper.RenderPartialCoreAsync(String partialViewName, Object model, ViewDataDictionary viewData, TextWriter writer)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.HtmlHelper.PartialAsync(String partialViewName, Object model, ViewDataDictionary viewData)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.<ProcessAsync>g__Awaited|11_0(Task`1 task)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Zones.ZoneShapes.ContentZone(IDisplayHelper DisplayAsync, Object Shape, IShapeFactory ShapeFactory)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.<ProcessAsync>g__Awaited|11_0(Task`1 task)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Liquid.Filters.ShapeRenderFilter.<ProcessAsync>g__Awaited|2_0(Task`1 task)
   at Fluid.Ast.FilterExpression.EvaluateAsync(TemplateContext context)
   at Fluid.Ast.OutputStatement.<WriteToAsync>g__Awaited|7_0(ValueTask`1 t, TextWriter w, TextEncoder enc, TemplateContext ctx)
   at Fluid.Parser.FluidTemplate.Awaited(ValueTask`1 task, TextWriter writer, TextEncoder encoder, TemplateContext context, IReadOnlyList`1 statements, Int32 startIndex)
   at OrchardCore.DisplayManagement.Liquid.LiquidViewTemplate.RenderAsync(RazorPage`1 page)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageCoreAsync(IRazorPage page, ViewContext context)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageAsync(IRazorPage page, ViewContext context, Boolean invokeViewStarts)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderAsync(ViewContext context)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.HtmlHelper.RenderPartialCoreAsync(String partialViewName, Object model, ViewDataDictionary viewData, TextWriter writer)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.HtmlHelper.PartialAsync(String partialViewName, Object model, ViewDataDictionary viewData)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.<ProcessAsync>g__Awaited|11_0(Task`1 task)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at OrchardCore.DisplayManagement.Implementation.DefaultHtmlDisplay.ExecuteAsync(DisplayContext context)
   at CallSite.Target(Closure , CallSite , Object )
   at AspNetCoreGeneratedDocument.Views_Item_Display.ExecuteAsync()
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageCoreAsync(IRazorPage page, ViewContext context)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageAsync(IRazorPage page, ViewContext context, Boolean invokeViewStarts)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderAsync(ViewContext context)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.ViewExecutor.ExecuteAsync(ViewContext viewContext, String contentType, Nullable`1 statusCode)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.ViewExecutor.ExecuteAsync(ViewContext viewContext, String contentType, Nullable`1 statusCode)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.ViewExecutor.ExecuteAsync(ActionContext actionContext, IView view, ViewDataDictionary viewData, ITempDataDictionary tempData, String contentType, Nullable`1 statusCode)
   at Microsoft.AspNetCore.Mvc.ViewFeatures.ViewResultExecutor.ExecuteAsync(ActionContext context, ViewResult result)
   at Microsoft.AspNetCore.Mvc.ViewResult.ExecuteResultAsync(ActionContext context)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeNextResultFilterAsync>g__Awaited|30_0[TFilter,TFilterAsync](ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Rethrow(ResultExecutedContextSealed context)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.ResultNext[TFilter,TFilterAsync](State& next, Scope& scope, Object& state, Boolean& isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeResultFilters>g__Awaited|28_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeNextResourceFilter>g__Awaited|25_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Rethrow(ResourceExecutedContextSealed context)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeFilterPipelineAsync>g__Awaited|20_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeAsync>g__Awaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeAsync>g__Awaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope)
   at Microsoft.AspNetCore.Routing.EndpointMiddleware.<Invoke>g__AwaitRequestTask|6_0(Endpoint endpoint, Task requestTask, ILogger logger)
   at OrchardCore.Apis.GraphQL.GraphQLMiddleware.Invoke(HttpContext context, IAuthorizationService authorizationService, IAuthenticationService authenticationService, ISchemaFactory schemaService)
   at Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(HttpContext context)
   at SixLabors.ImageSharp.Web.Middleware.ImageSharpMiddleware.Invoke(HttpContext httpContext, Boolean retry)
   at OrchardCore.Liquid.ScriptsMiddleware.Invoke(HttpContext httpContext)
   at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Localization.RequestLocalizationMiddleware.Invoke(HttpContext context)
   at OrchardCore.Diagnostics.DiagnosticsStartupFilter.<>c__DisplayClass3_0.<<Configure>b__1>d.MoveNext()
--- End of stack trace from previous location ---
   at Microsoft.AspNetCore.Diagnostics.StatusCodePagesMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.<Invoke>g__Awaited|6_0(ExceptionHandlerMiddleware middleware, HttpContext context, Task task)

Hope 1.5.0 do the miracle.

sebastienros commented 1 year ago

when the server RAM goes near the machine limit this error come out again

this points to a bug in lucene when it's under memory pressure

Can you try to create a console app that would do the same thing and "under memory pressure" by allocating dumb big arrays?

sebastienros commented 1 year ago

Found this issue where it mentions using SimpleFSDirectory instead of the defauklt memorymap one. Can you try that?

https://github.com/apache/lucenenet/issues/608

PiemP commented 1 year ago

Thank you for your effort. I hope try to do some tests on monday to replicate the issue and then I try to use the workaround.

PiemP commented 1 year ago

I have made some test but I'm really new to Lucene.NET. Before start to test SimpleFSDirectory I have try to repro the issue with a virtual machine where I have installed Windows Server 2019 with less than one GB of RAM. After that I have launch an application that write and read from an index and I have see RAM doesn't grow up over some limits and doesn't saturate system memory.

The next test I have done is to lauch an app that write on an index and an app that read from the same index and I have immediately repro the Lucene.Net.Store.LockObtainFailedException without full the RAM.

Now I try to some some search to understand how I can better test it because the tests that I have done seems to "dumb".

jtkech commented 1 year ago

Yes, if you have 2 processes using the same file system you may have a lock issue, idem on an azure plan using multiple instance.

But reading the above comments, looks like it may also happen on azure with only one instance.

Hmm maybe sometimes the app is recycled aggressively, for example if under memory pressure, and the lock file is not released correctly before another instance is re-started.

But not sure, just thinking out of loud ;)

PiemP commented 1 year ago

Hmm maybe sometimes the app is recycled aggressively, for example if under memory pressure, and the lock file is not released correctly before another instance is re-started.

I'm far to be considered an expert 😅 but I think this is not the cause. Because when this happen on my production env the application continue to work and the RAM is not freed up like happen when I recycling the app. Only the error continue to appear in the log files until I do a "real" recycling and I rebuild the index.

jtkech commented 1 year ago

Okay, thanks for the info.

Hmm, looking again at the stack trace, I see twice Lucene.Net.Index.IndexWriter..ctor, so first it would mean that you have 2 search indexes related to the content type of the item that is published, right?

Then, because for a given index we build a writer once, it would mean at least that the tenant was rebuild (for example after updating some admin settings), or that the application was re-started, not necessarily just before the failing publish but since the previous one.

So still possible that a writer was not well disposed, maybe we would need to check the underlying lucene lock more in depth, for example by using IndexWriter.IsLocked(directory). Hmm as I remember I suggested such things in an old PR where I wanted to make it working even if multiple instances was sharing the same file system.

In the meantime for info I let you read the writer dispose summary comments where we see that this may be a costly operation, an use case where the write lock will still be held, how to force the write lock to be released (dangerous) and so on. Also see the last note in case of a OutOfMemoryException.

    /// <summary>
    /// Commits all changes to an index, waits for pending merges
    /// to complete, and closes all associated files.
    /// <para/>
    /// This is a "slow graceful shutdown" which may take a long time
    /// especially if a big merge is pending: If you only want to close
    /// resources use <see cref="Rollback()"/>. If you only want to commit
    /// pending changes and close resources see <see cref="Dispose(bool)"/>.
    /// <para/>
    /// Note that this may be a costly
    /// operation, so, try to re-use a single writer instead of
    /// closing and opening a new one.  See <see cref="Commit()"/> for
    /// caveats about write caching done by some IO devices.
    ///
    /// <para> If an <see cref="Exception"/> is hit during close, eg due to disk
    /// full or some other reason, then both the on-disk index
    /// and the internal state of the <see cref="IndexWriter"/> instance will
    /// be consistent.  However, the close will not be complete
    /// even though part of it (flushing buffered documents)
    /// may have succeeded, so the write lock will still be
    /// held.</para>
    ///
    /// <para> If you can correct the underlying cause (eg free up
    /// some disk space) then you can call <see cref="Dispose()"/> again.
    /// Failing that, if you want to force the write lock to be
    /// released (dangerous, because you may then lose buffered
    /// docs in the <see cref="IndexWriter"/> instance) then you can do
    /// something like this:</para>
    ///
    /// <code>
    /// try 
    /// {
    ///     writer.Dispose();
    /// } 
    /// finally 
    /// {
    ///     if (IndexWriter.IsLocked(directory)) 
    ///     {
    ///         IndexWriter.Unlock(directory);
    ///     }
    /// }
    /// </code>
    /// 
    /// after which, you must be certain not to use the writer
    /// instance anymore.
    ///
    /// <para><b>NOTE</b>: if this method hits an <see cref="OutOfMemoryException"/>
    /// you should immediately dispose the writer, again.  See 
    /// <see cref="IndexWriter"/> for details.</para>
    /// </summary>
    /// <exception cref="IOException"> if there is a low-level IO error </exception>
    [MethodImpl(MethodImplOptions.NoInlining)]
    public void Dispose()