umbraco / Umbraco-CMS

Umbraco is a free and open source .NET content management system helping you deliver delightful digital experiences.
https://umbraco.com
MIT License
4.49k stars 2.69k forks source link

The process cannot access the file NuCache.Content.db because it is being used by another process #5035

Closed robertjf closed 2 years ago

robertjf commented 5 years ago

PR: https://github.com/umbraco/Umbraco-CMS/pull/5924

Something is happening in Azure WebApps where the NuCache.Content.db file is locked causing the site to hang. I've attached the log file for reference.

The exception is as follows:

Umbraco.Core.Exceptions.BootFailedException: Boot failed: Umbraco cannot run. See Umbraco's log file for more details.

-> Umbraco.Core.Exceptions.BootFailedException: Boot failed.

-> System.IO.IOException: The process cannot access the file 'D:\home\site\wwwroot\App_Data\TEMP\NuCache\NuCache.Content.db' because it is being used by another process.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)
   at CSharpTest.Net.IO.TransactedCompoundFile..ctor(Options options)
   at CSharpTest.Net.Storage.BTreeFileStoreV2..ctor(Options options)
   at CSharpTest.Net.Collections.BPlusTree`2.OptionsV2.CreateStorage()
   at CSharpTest.Net.Collections.BPlusTree`2.NodeCacheBase..ctor(BPlusTreeOptions`2 options)
   at CSharpTest.Net.Collections.BPlusTree`2.NodeCacheNormal..ctor(BPlusTreeOptions`2 options)
   at CSharpTest.Net.Collections.BPlusTree`2..ctor(BPlusTreeOptions`2 ioptions)
   at Umbraco.Web.PublishedCache.NuCache.DataSource.BTree.GetTree(String filepath, Boolean exists)
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService..ctor(Options options, IMainDom mainDom, IRuntimeState runtime, ServiceContext serviceContext, IPublishedContentTypeFactory publishedContentTypeFactory, IdkMap idkMap, IPublishedSnapshotAccessor publishedSnapshotAccessor, IVariationContextAccessor variationContextAccessor, IUmbracoContextAccessor umbracoContextAccessor, ILogger logger, IScopeProvider scopeProvider, IDocumentRepository documentRepository, IMediaRepository mediaRepository, IMemberRepository memberRepository, IDefaultCultureAccessor defaultCultureAccessor, IDataSource dataSource, IGlobalSettings globalSettings, ISiteDomainHelper siteDomainHelper, IEntityXmlSerializer entitySerializer, IPublishedModelFactory publishedModelFactory, UrlSegmentProviderCollection urlSegmentProviders)

...

This is on Umbraco 8.0.1 which was upgraded from Umbraco 8.0.0. (attachment will need to be changed from .txt to.json)

UmbracoTraceLog.RD2818786B7D96.20190319.txt

John-Blair commented 3 years ago

I am short on time so will make this my last comment on this particular problem. I don't think the problem is with Umbraco but your release process. In IIS you can shutdown the website and/or you can shutdown the app pool - if you just recycle the app pool you are pulling the rug from under the website - leaving locks in an uncertain state. I suggest you leave the app pool completely alone. Just shut down the website. Wait a period say 30 seconds to make sure it has completely shutdown and released all locks before continuing with your next Octopus deploy step. Umbraco has website shutdown code that will close the nucache and release all necessary locks - if you have your web.config setup properly. Hope this helps.

enkelmedia commented 3 years ago

Thanks @John-Blair in out scenario these locks are not released correctly OR something else happens during startup.

Its kind of strange, after looking at the code in PublishedSnapShotService i get the feeling that only one thread should run the code that logs the "Creating the content store, "-code but in our logs we see two threads almost at the same trying to create the store. Is this a problem?

image

Also, the boot is started by multiple threads at the same time... should this even be possible?

image

goliathdj commented 3 years ago

Hi people we seem to have the same issue , it occasionally goes away however this morning it started again and when investigating the logs it shows one thread had locked the file is there a fix for this ?

other than restarting the azure app service ?

Shazwazza commented 3 years ago

@enkelmedia those pics show 2x different appdomains trying to perform that action. Its not 2x threads in the same app domain. This is why we have MainDom, it controls access to file based assets between appdomains. If you are running on azure you must have SqlMainDomLock configured.

enkelmedia commented 3 years ago

Thanks @Shazwazza! I see that, but the thing I can’t wrap my head around is why one website in IIS would act as two app domains? Shouldn’t each website be just one app domain?

This is on prem, Windows Server 2019. I did try the SqlMainDomainLock but something still holds on to the NuCache files after the restart. I did manage to replicate this “manually” without Octopus by switching the physical path to the site. Maybe something very strange happens with IIS when the site restarts due to this type of change.

Shazwazza commented 3 years ago

There will always be more than one app domain in netframework (since very early days) when an appdomain 'restarts' because when you bump the web.config or restart, another appdomain starts while the current one is unwinding. An appdomain cannot be 'restarted', it can just be started or shutdown, netframework makes us think it's 'restarting' but it's just another appdomain starting while another is ending. You can in fact have several appdomains running at once if you have a constant shutdown loop. IIS also aids in this too and depends on how it's configured. This behavior is baked into netframework. This is why it's difficult to synchronize access to files in netframework that maintain locks (i.e. lucene, nucache, etc...). This is why MainDom exists. The default MainDom uses a system-wide semaphore named lock. This type of lock doesn't work in Azure or linux which is why we use the DB as the distributed lock with SqlMainDomLock.

MainDom will depend on your setup but sounds like perhaps SqlMainDomLock should be used in your scenario.

Keep in mind, you cannot 'share' (i.e. network share) files between 2 IIS instance without some very particular configurations.

If you switch the underlying physical path of a site while it's running (not sure how that can be done without a restart), then it won't have access to the original files that are locked in order to unlock them. File locks are always done by the OS. Once they are locked, the OS will keep them locked until the original thing that locked them unlocks them.

If you use the

    value="SqlMainDomLock" />
<add
    key="Umbraco.Core.LocalTempStorage"
    value="EnvironmentTemp" />

like you would on azure, this will store these types of files in the current processes %TEMP% directory. This is set using environment variables so it's possible you can redirect this anyways. I don't really understand why making each individual folder for logs, lucene and nucache would help in any way.

This locking issue to me sounds like an hosting configuration issue due to how deployments are being done.

enkelmedia commented 3 years ago

I do see the scenario that one app domain is in the process of shutting down while another starts but are you saying that its normal for the startup to run in two different app domains at the same time like my logs is showing? Or is this the shutdown loop that you are referring to? The logs that you see is from one website on one machine, nothing shared over network etc. But the "closing" app domain (which is the same website) of course logs to the same folder.

The change of the "Physical Folder"-path of the website in IIS will trigger a restart, I can see in the logs that locks are released for the MainDom-lock (log says something like "MainDom lock released". But during the startup the boot process throws because the NuCache files is still in use by something. But looking at the implementation in the Umbraco source I can see that the release-code does dispose the BTree-classes so given that - files should be "free/released" when the site start. The thing that make me wonder is that it seems from the logs as if two different app domains is starting at the same time, this could potentially be the issue, that one of these keeps the lock on the files while the other fails. But I don't understand why two app domains would start at the same time.

I mean one option for us is to rebuild the NuCache from db after each release - this would work fine for smaller sites but the current site we're migrating has lots of content (nucache-files around 25mb) so it would be very nice to find a way to avoid having to rebuild this during startup. But given that this issue is hard to nail, if we rebuild the NuCache its useful to be able to put ie. the Examine indexes in another folder to avoid having to rebuild this as well (I know that the LuceneDirectoryFactory can solve this). An interesting thing here is that in the setup that we have problem with here both share NuCache and ExamineFiles between the released - but it's always the NuCache files that are used by something else and that make the newly started website throw.

But at the end of the day, this does not look like a MainDom-issue more a issue with the NuCache files and locks on them, either because they are not released or because something in the startup locks them.

Edit: @Shazwazza I would be happy to share some of the logs files, maybe your trained eye can spot something that I miss. Let me know if you would like to have a look and where to send them.

enkelmedia commented 3 years ago

Looking more at this now.

I actually created a deploy-script that will copy the NuCache-folder and the DistCache-folder from TEMP in from the old release-folder to the new release-folder. This to avoid the lock-issues on the files. Still, the startup will run on 3-4 different AppDomains simultaneously, some of the App domains will say that the local db does not exist (but I'm 100% sure that the file is there) and the last AppDomain start will find the cache, still the first 3 AppDomains will try to build the cache again. This happens both when deploying via Octopus and when just recycling the App Pool.

This only happens when the site is under load, the stage-environment works as I would expect, using the copied temp-files. I'm actually not sure if this is a Umbraco issue, it feels strange that the AppDomains Init-method is executed multiple times simultaneously as this should not be the case according to the docs it should only run for the first request to the app after restart.

Another thing that is also interesting is that we get about 300-400 log files outputed in the log directory during the 60-70 seconds it takes for the site to start again efter the deploy

image

The problem with multiple log files is something that I have not seen in prev. version ie. we run a similar site on 8.8.1 with quite some load and this site only outputs one log file. I will admit that this is on a different server, different IIS etc.

John-Blair commented 3 years ago

Could you maybe take a step back and simplify your release? - as your approach is definitely non-standard - doing anything non-standard is likely to get you into a death spiral situation. My approach is just - stop the website, copy new release into previous release folder overwriting existing files (so you automatically get re-use of the existing TEMP files for nucache and examine - which are not included in the new release), and start the website back up. That's it. No messing about with recycling app pools or copying folders of temp files or re-pointing IIS to a new folder. Is there a good reason you cannot take my approach?

enkelmedia commented 3 years ago

@John-Blair The approach is the default behavior in Octopus Deploy, it's basically creating a new folder with the new release and changed the Websites "Phisycal Folder" property to point to the new folder. The thing is that we would like to avoid stopping the server before to avoid ugly errors for visitors, our goal is to make the folder-switch and have the site restart again - a visitor would just interpret this as a slow load of the page. It's not a manual recycle but something that happens as a side-effect of the folder-switch - as you might know a change to settings on AppPools or websites can result in restarts of the webapp.

@Shazwazza So any way, I continued my research around this, from what I understand, the deployment-process with Octopus will make multiple changes to different config files after the "Physical Folder"-change. This will result in multiple very rapid restarts of the Website and since the site is under load this means that after each config-change a small number of request might be start to be processed. Since IIS will continue to serve these request in the AppDomain that received the requests it starts to boot Umbraco in multiple AppDomains. The booting of Umbraco does not lock between AppDomains so each AppDomain (that is a result of numerus config changes by Octopus) will continue to boot Umbraco.

I guess this is why I see a lot of log files during startup, I've tried to create a custom build with a lock around the this code

        internal void HandleApplicationStart(object sender, EventArgs evargs)
        {
            // ******** THIS IS WHERE EVERYTHING BEGINS ********

            lock (_startupLock)
            {
                // create the register for the application, and boot
                // the boot manager is responsible for registrations
                var register = GetRegister();
                _runtime = GetRuntime();
                _runtime.Boot(register);
            }
        }

But this did not work netither withe a MarshalByRefObject or a Mutex - i really don't get why its booting multiple times.

At the end of the day, the way that we are deploying is kind of special and the fact that the site starts many times in a short period of time is probably something that is very hard to handle in the Umbraco core code.

As of now, I'll stick with my copy of the Cache and let the boot rebuild this ad random times - the lock issue is gone after we copy the cache before each release.

John-Blair commented 3 years ago

ok....fyi we used octopus deploy - 5 years ago not now - and we did not create a new folder for each release we re-used the same folder - so maybe checkout the config options on the version of octopus deploy you are using.

John-Blair commented 3 years ago

wrt not stopping the server...the way we approached that was have the site running on 2 servers and we changed the in-house dns config to switch to the standby server (which ran in read only mode) during an upgrade and then switch back once the upgrade was done and tested.

enkelmedia commented 3 years ago

@John-Blair Each release gets it own folder so that there is always a easy way to rollback to a working state if something goes wrong. Octopus would basically switch the "Phycial Path" of the website back to the old release if we rollback. This is standard Octopus setup with IIS and we will not deploy to the same folder - that is not an option for us.

We could stop the website before the release but this would mean that the site starts returning "Service Unavailable" to visitors and the total downtime will be longer using this approach. It would be an option but a bad option, the root problem is that the locks on the files does not get released when the app pool recycle, we have plenty of ways to work around that but I guess that the main idea with this issue here on GitHub is to figure out and solve the root cause and not provide work arounds.

Sounds like a nice setup with the DNS-switch but in our case we have around 15-20 sites and I would like to keep the deploy process as simple as possible.

Oxygen-cl commented 3 years ago

Hi, We are still experiencing issues on Azure solutions. Examine is having troubles opening and closing all the time. We have made the azure config changes, and upgraded to v8.15.1

The errors occures on app restarts, deploys and on reindexing.

We are seeing a couple of errors in the logs. 1: Cannot access the NuCache file: System.IO.IOException: The process cannot access the file 'D:\local\Temp\UmbracoData\147617b457ab74697cdb627fc1c4d9bb\NuCache\NuCache.Content.db' because it is being used by another process. at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath) at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost) at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options) at CSharpTest.Net.IO.TransactedCompoundFile..ctor(Options options) at CSharpTest.Net.Storage.BTreeFileStoreV2..ctor(Options options) at CSharpTest.Net.Collections.BPlusTree2.OptionsV2.CreateStorage() at CSharpTest.Net.Collections.BPlusTree2.NodeCacheBase..ctor(BPlusTreeOptions2 options) at CSharpTest.Net.Collections.BPlusTree2.NodeCacheNone..ctor(BPlusTreeOptions2 options) at CSharpTest.Net.Collections.BPlusTree2..ctor(BPlusTreeOptions2 ioptions) at Umbraco.Web.PublishedCache.NuCache.DataSource.BTree.GetTree(String filepath, Boolean exists, ContentDataSerializer contentDataSerializer) at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.MainDomRegister() at Umbraco.Core.Runtime.MainDom.Register(Action install, Action release, Int32 weight) at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.b__36_0() at System.Threading.LazyInitializer.EnsureInitializedCore[T](T& target, Boolean& initialized, Object& syncLock, Func1 valueFactory) at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.EnsureCaches() at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.CreatePublishedSnapshot(String previewToken) at System.Lazy1.CreateValue() at System.Lazy1.LazyInitValue() at DIF.UmbracO2.Services.DIFContentService.GetGlobalFolder(UmbracoContext umbracoContext) at DIF.UmbracO2.CustomIndexes.EmployeeIndex.EmployeeIndexPopulator.PopulateEmployees() at DIF.UmbracO2.CustomIndexes.EmployeeIndex.EmployeeIndexPopulator.PopulateIndexes(IReadOnlyList1 indexes) at Umbraco.Examine.IndexRebuilder.RebuildIndex(String indexName) at DIF.UmbracO2.Services.ViaRitzauImportService.StartPressImport(UmbracoContext umbracoContext)`

2: Reader is closed: 2021-08-31_8-30-00

  1. Azure error: High number of active threads processing ASP.NET requests Azure App Services detected that 166 threads are processing an ASP.NET request. High number of threads in a process can lead to thread pool starvation and context switching which lead to high CPU. GCFrame GCFrame HelperMethodFrame Examine.LuceneEngine.Providers.LuceneSearcher+ReaderReopener.ScheduleReopen Examine.LuceneEngine.Providers.LuceneSearcher.ValidateSearcher Examine.LuceneEngine.Providers.LuceneSearcher.GetAllIndexedFields Examine.LuceneEngine.Providers.BaseLuceneSearcher.CreateQuery Examine.LuceneEngine.Providers.BaseLuceneSearcher.CreateQuery DIF.UmbracO2.Services.NewsService.GetNewsList DIF.UmbracO2.Controllers.Api.NewsSearchController.GetNewsList DynamicClass.lambda_method System.Web.Http.Controllers.ReflectedHttpActionDescriptor+ActionExecutor+<>c__DisplayClass6_2.b__2 System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ExecuteAsync System.Web.Http.Controllers.ApiControllerActionInvoker+d__1.MoveNext System.Runtime.CompilerServices.AsyncTaskMethodBuilder1[[System.Canon mscorlib]].Start[[System.Web.Http.Controllers.ApiControllerActionInvoker+d1 System.Web.Http]] System.Web.Http.Controllers.ApiControllerActionInvoker.InvokeActionAsyncCore System.Web.Http.Controllers.ApiControllerActionInvoker.InvokeActionAsync System.Web.Http.Controllers.ActionFilterResult+d5.MoveNext System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.Canon mscorlib]].Start[[System.Web.Http.Controllers.ActionFilterResult+d5 System.Web.Http]] System.Web.Http.Controllers.ActionFilterResult.ExecuteAsync System.Web.Http.Filters.AuthorizationFilterAttribute+d3.MoveNext System.Runtime.CompilerServices.AsyncTaskMethodBuilder1[[System.__Canon mscorlib]].Start[[System.Web.Http.Filters.AuthorizationFilterAttribute+d__3 System.Web.Http]] System.Web.Http.Filters.AuthorizationFilterAttribute.ExecuteAuthorizationFilterAsyncCore System.Web.Http.Filters.AuthorizationFilterAttribute.System.Web.Http.Filters.IAuthorizationFilter.ExecuteAuthorizationFilterAsync System.Web.Http.Controllers.AuthorizationFilterResult.ExecuteAsync System.Web.Http.ApiController.ExecuteAsync System.Web.Http.Dispatcher.HttpControllerDispatcher+d__15.MoveNext System.Runtime.CompilerServices.AsyncTaskMethodBuilder1[[System.Canon mscorlib]].Start[[System.Web.Http.Dispatcher.HttpControllerDispatcher+d15 System.Web.Http]] System.Web.Http.Dispatcher.HttpControllerDispatcher.SendAsync System.Net.Http.HttpMessageInvoker.SendAsync System.Web.Http.Dispatcher.HttpRoutingDispatcher.SendAsync System.Net.Http.DelegatingHandler.SendAsync System.Web.Http.HttpServer+d24.MoveNext System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.Canon mscorlib]].Start[[System.Web.Http.HttpServer+d24 System.Web.Http]] System.Web.Http.HttpServer.SendAsync System.Net.Http.HttpMessageInvoker.SendAsync System.Web.Http.WebHost.HttpControllerHandler+d12.MoveNext System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[System.Web.Http.WebHost.HttpControllerHandler+d__12 System.Web.Http.WebHost]] System.Web.Http.WebHost.HttpControllerHandler.ProcessRequestAsyncCore System.Web.TaskAsyncHelper.BeginTask System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute System.Web.HttpApplication.ExecuteStepImpl System.Web.HttpApplication.ExecuteStep System.Web.HttpApplication+PipelineStepManager.ResumeSteps System.Web.HttpApplication.BeginProcessRequestNotification System.Web.HttpRuntime.ProcessRequestNotificationPrivate System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper System.Web.Hosting.PipelineRuntime.ProcessRequestNotification DomainNeutralILStubClass.IL_STUB_ReversePInvoke InlinedCallFrame InlinedCallFrame DomainNeutralILStubClass.IL_STUB_PInvoke System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper System.Web.Hosting.PipelineRuntime.ProcessRequestNotification DomainNeutralILStubClass.IL_STUB_ReversePInvoke ContextTransitionFrame `
jaandrews commented 2 years ago

I'm also running into this nucache error with a load balanced azure site. In my case, I suspect it's a latency issue, as the database in based in the US, but the only front end server that has had this issue is in France. If that is the cause, one potential solution would to add support for azure cosmos. Not sure that is feasible to implement, since it is a nosql database, but it should improve database read speeds for multi region sites.

p-m-j commented 2 years ago

@jaandrews which maindomlock implemenation are you using?

For azure this should be configured to SqlMainDomLock

jaandrews commented 2 years ago

@p-m-j I just have

<add key="Umbraco.Core.MainDom.Lock" value="SqlMainDomLock" />

in the web config. Site is running umbraco 8.17.1 with cloudflare acting as the load balancer in front of azure.

p-m-j commented 2 years ago

@jaandrews presumably you also have EnvironmentTemp for LocalTempStorageLocation

Any other details you can provide about the issue, is this on startup or just happens with no obvious trigger? Would you be able to send stack traces / logs?

jaandrews commented 2 years ago

@p-m-j Yeah, LocalTempStorage is already set to EnvironmentTemp. As for the error I got, it's the same one as at the beginning of this thread. After taking a second look at that log, I confirmed the error I encounter stems from a timeout error. The error message is "The thread has been aborted, because the request has timed out." which is then followed by the nucache error (see here for the error and the log messages leading up to it). From the logs it seems like it's timing out attempting to read a file (assuming it's the nucache file). Not sure what, if anything I can tweak to avoid that.

chriskarkowsky commented 2 years ago

We received this error yesterday as well with our Azure App Service. App was running, then failed to boot. A complete restart of the App Service was necessary to fix this.

"Umbraco.Core.Exceptions.BootFailedException: Boot failed: Umbraco cannot run. See Umbraco's log file for more details.\n\n-> Umbraco.Core.Exceptions.BootFailedException: Boot failed.\n\n-> System.IO.IOException: The process cannot access the file 'C:\\local\\Temp\\UmbracoData\\be080f813588b61e6bd0c5b32b217f8b\\NuCache\\NuCache.Content.db' because it is being used by another process.

<add key="Umbraco.Examine.LuceneDirectoryFactory" value="Examine.LuceneEngine.Directories.SyncTempEnvDirectoryFactory, Examine" /> <add key="Umbraco.Core.LocalTempStorage" value="EnvironmentTemp" />

Are both within the web.config, and

Umbraco.Core.MainDom.Lock = SqlMainDomLock is set within the Application Settings.

jaandrews commented 2 years ago

@chriskarkowsky Is there a timeout error in front of that error in your logs (would look something like the logs I linked to in my previous comment). If so increasing the executionTimout of the httpRuntime in the web config might help. I've been in touch with umbraco support and that's one of the solutions they've suggested. Makes sense, but I still need to try it myself. So it would be something like

<httpRuntime executionTimeout="1800"/>

which would have it timeout after 30 minutes. Not ideal, but would avoid the site crashing if it works as expected.

chriskarkowsky commented 2 years ago

@jaandrews I have this timeout exception: {"@t":"2022-02-17T20:48:44.8174278Z","@mt":"Unhandled controller exception occurred for request '{RequestUrl}'","@l":"Error","@x":"System.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out\r\n at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction)\r\n at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)\r\n `

However the timestamp was about an hour before the Boot Failure messages. It seems your case was right before the Boot Failure.

What I find strange is that has a Boot Failure log on the same ProcessID as it was currently running on. This means Umbraco is trying to boot itself again on the same Process? Restarting the App Service would spin up a new process and boot, which is why i'm guessing this what fixed my case.

robertjf commented 2 years ago

Just wanted to note that we have been experiencing this issue consistently over the past week with v9.3.0 not only on an Azure AppService, but also in local development (VS2022). The only thing that made any difference was disabling the Disk Cache databases as per the documentation on Umbraco. (https://our.umbraco.com/Documentation/Reference/V9-Config/NuCacheSettings/)

zBestData commented 2 years ago

Just wanted to note that we have been experiencing this issue consistently over the past week with v9.3.0 not only on an Azure AppService, but also in local development (VS2022). The only thing that made any difference was disabling the Disk Cache databases as per the documentation on Umbraco. (https://our.umbraco.com/Documentation/Reference/V9-Config/NuCacheSettings/)

Thank you! I have been having the same problem with Version 9.3.0!

robertjf commented 2 years ago

I can't help feeling that something changed in 9.3.0 that has exacerbated the issue - we only started seeing this behaviour when we upgraded from the previous version...

robertjf commented 2 years ago

I just had this re-occur in Umbraco Cloud on Umbraco 8.17.2:

Message:
An unhandled exception occurred

Exception:
System.IO.IOException: The process cannot access the file 'C:\local\Temp\UmbracoData\748c65bf441f4d693b51602dbfa2e300\NuCache\NuCache.Content.db' because it is being used by another process.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)
   at CSharpTest.Net.IO.TransactedCompoundFile..ctor(Options options)
   at CSharpTest.Net.Storage.BTreeFileStoreV2..ctor(Options options)
   at CSharpTest.Net.Collections.BPlusTree`2.OptionsV2.CreateStorage()
   at CSharpTest.Net.Collections.BPlusTree`2.NodeCacheBase..ctor(BPlusTreeOptions`2 options)
   at CSharpTest.Net.Collections.BPlusTree`2.NodeCacheNone..ctor(BPlusTreeOptions`2 options)
   at CSharpTest.Net.Collections.BPlusTree`2..ctor(BPlusTreeOptions`2 ioptions)
   at Umbraco.Web.PublishedCache.NuCache.DataSource.BTree.GetTree(String filepath, Boolean exists, ContentDataSerializer contentDataSerializer)
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.MainDomRegister()
   at Umbraco.Core.Runtime.MainDom.Register(Action install, Action release, Int32 weight)
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.<EnsureCaches>b__36_0()
   at System.Threading.LazyInitializer.EnsureInitializedCore[T](T& target, Boolean& initialized, Object& syncLock, Func`1 valueFactory)
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.EnsureCaches()
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.CreatePublishedSnapshot(String previewToken)
   at System.Lazy`1.CreateValue()
   at System.Lazy`1.LazyInitValue()
   at Umbraco.Web.UmbracoInjectedModule.EnsureHasContent(UmbracoContext context, HttpContextBase httpContext)
   at Umbraco.Web.UmbracoInjectedModule.EnsureUmbracoRoutablePage(UmbracoContext context, HttpContextBase httpContext)
   at Umbraco.Web.UmbracoInjectedModule.ProcessRequest(HttpContextBase httpContext)
   at System.Web.HttpApplication.SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
   at System.Web.HttpApplication.ExecuteStepImpl(IExecutionStep step)
   at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)

reading through the logs, I see this a little further down right before I get the NuCache errors above (~ every 4 seconds):


Message:
Exception ("4aa0a49a").
The thread has been aborted, because the request has timed out.

Exception:
System.Threading.ThreadAbortException: Thread was being aborted.
   at SNIReadSyncOverAsync(SNI_ConnWrapper* , SNI_Packet** , Int32 )
   at SNINativeMethodWrapper.SNIReadSyncOverAsync(SafeHandle pConn, IntPtr& packet, Int32 timeout)
   at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()
   at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()
   at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()
   at System.Data.SqlClient.TdsParserStateObject.TryReadByteArray(Byte[] buff, Int32 offset, Int32 len, Int32& totalRead)
   at System.Data.SqlClient.TdsParserStateObject.TryReadString(Int32 length, String& value)
   at System.Data.SqlClient.TdsParser.TryReadSqlStringValue(SqlBuffer value, Byte type, Int32 length, Encoding encoding, Boolean isPlp, TdsParserStateObject stateObj)
   at System.Data.SqlClient.TdsParser.TryReadSqlValue(SqlBuffer value, SqlMetaDataPriv md, Int32 length, TdsParserStateObject stateObj, SqlCommandColumnEncryptionSetting columnEncryptionOverride, String columnName)
   at System.Data.SqlClient.SqlDataReader.TryReadColumnInternal(Int32 i, Boolean readHeaderOnly)
   at System.Data.SqlClient.SqlDataReader.TryReadColumn(Int32 i, Boolean setTimeout, Boolean allowPartiallyReadColumn)
   at System.Data.SqlClient.SqlDataReader.GetValues(Object[] values)
   at NPoco.RowMappers.PropertyMapper.Map(DbDataReader dataReader, RowMapperContext context)
   at NPoco.MappingFactory.Map(DbDataReader dataReader, Object instance)
   at NPoco.Database.<Read>d__156`1.MoveNext()
   at NPoco.Database.<QueryImp>d__164`1.MoveNext()
   at Umbraco.Core.Persistence.NPocoDatabaseExtensions.<QueryPaged>d__8`1.MoveNext()
   at Umbraco.Web.PublishedCache.NuCache.DataSource.DatabaseDataSource.<GetAllContentSources>d__13.MoveNext()
   at Umbraco.Web.PublishedCache.NuCache.ContentStore.SetAllFastSortedLocked(IEnumerable`1 kits, Boolean fromDb)
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.LoadContentFromDatabaseLocked(IScope scope, Boolean onStartup)
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.LockAndLoadContent(Func`2 action)
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.<EnsureCaches>b__36_0()
robertjf commented 2 years ago

@p-m-j I'm because this issue is re-opened, does that mean that Umbraco are actively investigating again?

p-m-j commented 2 years ago

Apologies, re-opened on a whim because it's clear from the activity there are still problems in this area, I shouldn't have re-opened as this issue already has an associated fix with a version number.

If you have steps to reproduce please open a new issue.

In this case, we fixed https://github.com/umbraco/Umbraco-CMS/pull/5924 for version 8.1.1 - which wasn't enough, but that's why this issue is closed and included in the release notes.

Sorry again for any confusion.

iqb-dawn commented 2 years ago

@p-m-j we are using Umbraco version 8.12.2. The site was running since 8 months until this morning we got this issue,

{
    "@t": "2022-03-17T08:17:25.2359719Z",
    "@mt": "{StartMessage} [Timing {TimingId}]",
    "StartMessage": "Resolving composer types.",
    "TimingId": "9a80bb3",
    "SourceContext": "Umbraco.Core.Runtime.CoreRuntime",
    "ProcessId": 12564,
    "ProcessName": "w3wp",
    "ThreadId": 1,
    "AppDomainId": 2,
    "AppDomainAppId": "LMW3SVC1154430189ROOT",
    "MachineName": "DW0SDWK0003EO",
    "Log4NetLevel": "INFO ",
    "HttpRequestNumber": 1,
    "HttpRequestId": "5104ab84-400a-4e30-8512-bd4ae8199a7c"
}
{
    "@t": "2022-03-17T08:17:33.4274703Z",
    "@mt": "{EndMessage} ({Duration}ms) [Timing {TimingId}]",
    "EndMessage": "Resolved.",
    "Duration": 8191,
    "TimingId": "9a80bb3",
    "SourceContext": "Umbraco.Core.Runtime.CoreRuntime",
    "ProcessId": 12564,
    "ProcessName": "w3wp",
    "ThreadId": 1,
    "AppDomainId": 2,
    "AppDomainAppId": "LMW3SVC1154430189ROOT",
    "MachineName": "DW0SDWK0003EO",
    "Log4NetLevel": "INFO ",
    "HttpRequestNumber": 1,
    "HttpRequestId": "5104ab84-400a-4e30-8512-bd4ae8199a7c"
}
{
    "@t": "2022-03-17T08:17:49.3132993Z",
    "@mt": "{FailMessage} ({Duration}ms) [Timing {TimingId}]",
    "@l": "Error",
    "@x": "Umbraco.Core.Exceptions.BootFailedException: Boot failed. ---> System.IO.IOException: The process cannot access the file 'C:\\local\\Temp\\UmbracoData\\a41b629950826bb5a8cb20844bdcd974\\NuCache\\NuCache.Content.db' because it is being used by another process.\r\n   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)\r\n   at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)\r\n   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)\r\n   at CSharpTest.Net.IO.TransactedCompoundFile..ctor(Options options)\r\n   at CSharpTest.Net.Storage.BTreeFileStoreV2..ctor(Options options)\r\n   at CSharpTest.Net.Collections.BPlusTree`2.OptionsV2.CreateStorage()\r\n   at CSharpTest.Net.Collections.BPlusTree`2.NodeCacheBase..ctor(BPlusTreeOptions`2 options)\r\n   at CSharpTest.Net.Collections.BPlusTree`2.NodeCacheNone..ctor(BPlusTreeOptions`2 options)\r\n   at CSharpTest.Net.Collections.BPlusTree`2..ctor(BPlusTreeOptions`2 ioptions)\r\n   at Umbraco.Web.PublishedCache.NuCache.DataSource.BTree.GetTree(String filepath, Boolean exists) in D:\\a\\1\\s\\src\\Umbraco.Web\\PublishedCache\\NuCache\\DataSource\\BTree.cs:line 27\r\n   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.MainDomRegister() in D:\\a\\1\\s\\src\\Umbraco.Web\\PublishedCache\\NuCache\\PublishedSnapshotService.cs:line 174\r\n   at Umbraco.Core.Runtime.MainDom.Register(Action install, Action release, Int32 weight) in D:\\a\\1\\s\\src\\Umbraco.Core\\Runtime\\MainDom.cs:line 89\r\n   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService..ctor(PublishedSnapshotServiceOptions options, IMainDom mainDom, IRuntimeState runtime, ServiceContext serviceContext, IPublishedContentTypeFactory publishedContentTypeFactory, IdkMap idkMap, IPublishedSnapshotAccessor publishedSnapshotAccessor, IVariationContextAccessor variationContextAccessor, IProfilingLogger logger, IScopeProvider scopeProvider, IDocumentRepository documentRepository, IMediaRepository mediaRepository, IMemberRepository memberRepository, IDefaultCultureAccessor defaultCultureAccessor, IDataSource dataSource, IGlobalSettings globalSettings, IEntityXmlSerializer entitySerializer, IPublishedModelFactory publishedModelFactory, UrlSegmentProviderCollection urlSegmentProviders) in D:\\a\\1\\s\\src\\Umbraco.Web\\PublishedCache\\NuCache\\PublishedSnapshotService.cs:line 128\r\n   at DynamicMethod(Object[] )\r\n   at LightInject.ServiceContainer.<>c__DisplayClass150_0.<WrapAsFuncDelegate>b__0() in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3798\r\n   at LightInject.ServiceContainer.<>c__DisplayClass198_0.<EmitLifetime>b__1() in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4657\r\n   at LightInject.PerContainerLifetime.GetInstance(Func`1 createInstance, Scope scope) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 6169\r\n   at LightInject.ServiceContainer.EmitLifetime(ServiceRegistration serviceRegistration, Action`1 emitMethod, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4656\r\n   at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__1(IEmitter methodSkeleton) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649\r\n   at LightInject.ServiceContainer.<>c__DisplayClass153_0.<CreateEmitMethodWrapper>b__0(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3856\r\n   at LightInject.ServiceContainer.EmitConstructorDependency(IEmitter emitter, Dependency dependency) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4158\r\n   at LightInject.ServiceContainer.EmitConstructorDependencies(ConstructionInfo constructionInfo, IEmitter emitter, Action`1 decoratorTargetEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4120\r\n   at LightInject.ServiceContainer.EmitNewInstanceUsingImplementingType(IEmitter emitter, ConstructionInfo constructionInfo, Action`1 decoratorTargetEmitMethod) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4080\r\n   at LightInject.ServiceContainer.EmitNewInstance(ServiceRegistration serviceRegistration, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4034\r\n   at LightInject.ServiceContainer.EmitNewInstanceWithDecorators(ServiceRegistration serviceRegistration, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3929\r\n   at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__2(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649\r\n   at LightInject.ServiceContainer.CreateDynamicMethodDelegate(Action`1 serviceEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3776\r\n   at LightInject.ServiceContainer.<>c__DisplayClass198_0.<EmitLifetime>b__1() in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4657\r\n   at LightInject.PerContainerLifetime.GetInstance(Func`1 createInstance, Scope scope) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 6169\r\n   at LightInject.ServiceContainer.EmitLifetime(ServiceRegistration serviceRegistration, Action`1 emitMethod, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4656\r\n   at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__1(IEmitter methodSkeleton) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649\r\n   at LightInject.ServiceContainer.<>c__DisplayClass153_0.<CreateEmitMethodWrapper>b__0(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3856\r\n   at LightInject.ServiceContainer.EmitConstructorDependency(IEmitter emitter, Dependency dependency) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4158\r\n   at LightInject.ServiceContainer.EmitConstructorDependencies(ConstructionInfo constructionInfo, IEmitter emitter, Action`1 decoratorTargetEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4120\r\n   at LightInject.ServiceContainer.EmitNewInstanceUsingImplementingType(IEmitter emitter, ConstructionInfo constructionInfo, Action`1 decoratorTargetEmitMethod) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4080\r\n   at LightInject.ServiceContainer.EmitNewInstance(ServiceRegistration serviceRegistration, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4034\r\n   at LightInject.ServiceContainer.EmitNewInstanceWithDecorators(ServiceRegistration serviceRegistration, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3929\r\n   at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__2(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649\r\n   at LightInject.ServiceContainer.CreateDynamicMethodDelegate(Action`1 serviceEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3776\r\n   at LightInject.ServiceContainer.<>c__DisplayClass198_0.<EmitLifetime>b__1() in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4657\r\n   at LightInject.PerContainerLifetime.GetInstance(Func`1 createInstance, Scope scope) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 6169\r\n   at LightInject.ServiceContainer.EmitLifetime(ServiceRegistration serviceRegistration, Action`1 emitMethod, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4656\r\n   at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__1(IEmitter methodSkeleton) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649\r\n   at LightInject.ServiceContainer.<>c__DisplayClass153_0.<CreateEmitMethodWrapper>b__0(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3856\r\n   at LightInject.ServiceContainer.EmitConstructorDependency(IEmitter emitter, Dependency dependency) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4158\r\n   at LightInject.ServiceContainer.EmitConstructorDependencies(ConstructionInfo constructionInfo, IEmitter emitter, Action`1 decoratorTargetEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4120\r\n   at LightInject.ServiceContainer.EmitNewInstanceUsingImplementingType(IEmitter emitter, ConstructionInfo constructionInfo, Action`1 decoratorTargetEmitMethod) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4080\r\n   at LightInject.ServiceContainer.EmitNewInstance(ServiceRegistration serviceRegistration, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4034\r\n   at LightInject.ServiceContainer.EmitNewInstanceWithDecorators(ServiceRegistration serviceRegistration, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3929\r\n   at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__2(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649\r\n   at LightInject.ServiceContainer.CreateDynamicMethodDelegate(Action`1 serviceEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3776\r\n   at LightInject.ServiceContainer.<>c__DisplayClass198_0.<EmitLifetime>b__1() in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4657\r\n   at LightInject.PerContainerLifetime.GetInstance(Func`1 createInstance, Scope scope) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 6169\r\n   at LightInject.ServiceContainer.EmitLifetime(ServiceRegistration serviceRegistration, Action`1 emitMethod, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4656\r\n   at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__1(IEmitter methodSkeleton) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649\r\n   at LightInject.ServiceContainer.<>c__DisplayClass153_0.<CreateEmitMethodWrapper>b__0(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3856\r\n   at LightInject.ServiceContainer.CreateDynamicMethodDelegate(Action`1 serviceEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3776\r\n   at LightInject.ServiceContainer.CreateDelegate(Type serviceType, String serviceName, Boolean throwError) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4743\r\n   at LightInject.ServiceContainer.CreateDefaultDelegate(Type serviceType, Boolean throwError) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4705\r\n   at LightInject.ServiceContainer.GetInstance(Type serviceType) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3437\r\n   at Umbraco.Core.Composing.LightInject.LightInjectContainer.GetInstance(Type type) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\LightInject\\LightInjectContainer.cs:line 111\r\n   at Umbraco.Core.Composing.ComponentCollectionBuilder.CreateItem(IFactory factory, Type itemType) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\ComponentCollectionBuilder.cs:line 33\r\n   at Umbraco.Core.Composing.CollectionBuilderBase`3.<>c__DisplayClass10_0.<CreateItems>b__0(Type x) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\CollectionBuilderBase.cs:line 103\r\n   at System.Linq.Enumerable.WhereSelectArrayIterator`2.MoveNext()\r\n   at System.Linq.Buffer`1..ctor(IEnumerable`1 source)\r\n   at System.Linq.Enumerable.ToArray[TSource](IEnumerable`1 source)\r\n   at Umbraco.Core.Composing.CollectionBuilderBase`3.CreateItems(IFactory factory) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\CollectionBuilderBase.cs:line 102\r\n   at Umbraco.Core.Composing.ComponentCollectionBuilder.CreateItems(IFactory factory) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\ComponentCollectionBuilder.cs:line 25\r\n   at Umbraco.Core.Composing.CollectionBuilderBase`3.CreateCollection(IFactory factory) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\CollectionBuilderBase.cs:line 120\r\n   at Umbraco.Core.Composing.LightInject.LightInjectContainer.<>c__DisplayClass20_0`1.<Register>b__0(IServiceFactory f) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\LightInject\\LightInjectContainer.cs:line 172\r\n   at DynamicMethod(Object[] )\r\n   at LightInject.ServiceContainer.<>c__DisplayClass150_0.<WrapAsFuncDelegate>b__0() in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3798\r\n   at LightInject.ServiceContainer.<>c__DisplayClass198_0.<EmitLifetime>b__1() in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4657\r\n   at LightInject.PerContainerLifetime.GetInstance(Func`1 createInstance, Scope scope) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 6169\r\n   at LightInject.ServiceContainer.EmitLifetime(ServiceRegistration serviceRegistration, Action`1 emitMethod, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4656\r\n   at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__1(IEmitter methodSkeleton) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649\r\n   at LightInject.ServiceContainer.<>c__DisplayClass153_0.<CreateEmitMethodWrapper>b__0(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3856\r\n   at LightInject.ServiceContainer.CreateDynamicMethodDelegate(Action`1 serviceEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3776\r\n   at LightInject.ServiceContainer.CreateDelegate(Type serviceType, String serviceName, Boolean throwError) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4743\r\n   at LightInject.ServiceContainer.CreateDefaultDelegate(Type serviceType, Boolean throwError) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4705\r\n   at LightInject.ServiceContainer.GetInstance(Type serviceType) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3437\r\n   at Umbraco.Core.Composing.LightInject.LightInjectContainer.GetInstance(Type type) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\LightInject\\LightInjectContainer.cs:line 111\r\n   at Umbraco.Core.FactoryExtensions.GetInstance[T](IFactory factory) in D:\\a\\1\\s\\src\\Umbraco.Core\\FactoryExtensions.cs:line 23\r\n   at Umbraco.Core.Runtime.CoreRuntime.Boot(IRegister register, DisposableTimer timer) in D:\\a\\1\\s\\src\\Umbraco.Core\\Runtime\\CoreRuntime.cs:line 204\r\n   --- End of inner exception stack trace ---",
    "FailMessage": "Boot failed.",
    "Duration": 99252,
    "TimingId": "3b6648c",
    "SourceContext": "Umbraco.Core.Runtime.CoreRuntime",
    "ProcessId": 12564,
    "ProcessName": "w3wp",
    "ThreadId": 1,
    "AppDomainId": 2,
    "AppDomainAppId": "LMW3SVC1154430189ROOT",
    "MachineName": "DW0SDWK0003EO",
    "Log4NetLevel": "ERROR",
    "HttpRequestNumber": 1,
    "HttpRequestId": "5104ab84-400a-4e30-8512-bd4ae8199a7c"
}

We have these setings in Azure,

  {
    "name": "Umbraco.Core.LocalTempStorage",
    "value": "EnvironmentTemp",
    "slotSetting": false
  },
  {
    "name": "Umbraco.Core.MainDom.Lock",
    "value": "SqlMainDomLock",
    "slotSetting": false
  },
  {
    "name": "Umbraco.Examine.LuceneDirectoryFactory",
    "value": "Examine.LuceneEngine.Directories.SyncTempEnvDirectoryFactory, Examine",
    "slotSetting": false
  }

I have restarted the Azure web app to make it work. There is only one server. We are fear that this will happen again, What can we do?

p-m-j commented 2 years ago

@iqb-dawn if it’s a small site the easiest thing that guarantees you will never get exceptions thrown relating to the nucache physical files is to just turn off the nucache physical files see example here https://our.umbraco.com/documentation/Reference/V9-Config/NuCacheSettings/#additional-settings, the only down side being for a large site it may take longer to boot.

soreng commented 2 years ago

@p-m-j as per above

we are using Umbraco version 8.12.2

Switching of the file creation is a v9 feature AFAIK.

p-m-j commented 2 years ago

Switching off the file creation is a v9 feature AFAIK.

The same setting was around in v8, the documentation for it is hidden away on this page https://our.umbraco.com/Documentation/Fundamentals/Setup/Server-Setup/Load-Balancing/azure-web-apps-v8 (search this page for PublishedSnapshotServiceOptions)


I'm going to lock conversation here, please open a new issue if you have steps to reproduce or detailed logs.