umbraco / Umbraco-CMS

Umbraco is a free and open source .NET content management system helping you deliver delightful digital experiences.
https://umbraco.com

NuCache corruption when block size set > 8192 #12103

Closed robertjf closed 1 year ago

robertjf commented 2 years ago

Which exact Umbraco version are you using? For example: 9.0.1 - don't just write v9

9.3.1

Bug summary

Firstly, this is a bug in the CSharpTest.Net.Collections dependency, described here in 2020: https://github.com/csharptest/CSharpTest.Net.Collections/issues/17 and has a PR which resolves it here: https://github.com/csharptest/CSharpTest.Net.Collections/pull/18

I've confirmed that the issue still exists in the Net20standard build that Umbraco depends on (https://github.com/mamift/CSharpTest.Net.Collections/) and I've submitted a PR on that repository. However, Umbraco may want to consider taking over maintenance of the library if mamift doesn't want to; the last update was back in 2019.

The reason this is such an issue is because of the Block Editor - I have a page with around 10 blocks of content in the block editor, the length of the serialised data attempting to be saved in the NuCache is 33,589,226 bytes. At a BlockSize of 8096 this failed with an invalid length error, and increasing to anything larger (e.g. 16384) was creating a corrupted localdb file.

Additionally, if the data being stored is too large for the default Block Size (512?) then a meaningful message should be logged to help the developer determine the reason and resolution.

Perhaps a Health Check can be created to assist with this?

The only workaround currently is to disable the localDb generation altogether as per the documentation: https://our.umbraco.com/documentation/Reference/V9-Config/NuCacheSettings/#additional-settings

Specifics

Error message when Umbraco.CMS.NuCache.BTreeBlockSize <= 8192:

ArgumentOutOfRangeException: Specified argument was out of the range of valid values. (Parameter 'length')
CSharpTest.Net.IO.TransactedCompoundFile.Write(uint handle, byte[] bytes, int offset, int length)
CSharpTest.Net.Storage.BTreeFileStoreV2.Update<T>(IStorageHandle handleIn, ISerializer<T> serializer, T node)
CSharpTest.Net.Collections.BPlusTree<TKey, TValue>+NodeCacheBase.SaveChanges(NodePin pin)
CSharpTest.Net.Collections.BPlusTree<TKey, TValue>+NodeTransaction.PerformCommit()
CSharpTest.Net.Collections.BPlusTree<TKey, TValue>.Insert<T>(NodePin thisLock, TKey key, ref T value, NodePin parent, int parentIx)
CSharpTest.Net.Collections.BPlusTree<TKey, TValue>.Insert<T>(NodePin thisLock, TKey key, ref T value, NodePin parent, int parentIx)
CSharpTest.Net.Collections.BPlusTree<TKey, TValue>.Insert<T>(NodePin thisLock, TKey key, ref T value, NodePin parent, int parentIx)
CSharpTest.Net.Collections.BPlusTree<TKey, TValue>.Insert<T>(NodePin thisLock, TKey key, ref T value, NodePin parent, int parentIx)
CSharpTest.Net.Collections.BPlusTree<TKey, TValue>.AddEntry<T>(TKey key, ref T info)
Umbraco.Cms.Infrastructure.PublishedCache.ContentStore.Release(WriteLockInfo lockInfo, bool commit)
Umbraco.Cms.Infrastructure.PublishedCache.ContentStore+ScopedWriteLock.Release(bool completed)
Umbraco.Cms.Core.Scoping.ScopeContextualBase.Dispose()
Umbraco.Cms.Infrastructure.PublishedCache.PublishedSnapshotService.LockAndLoadContent(Func<bool> action)
Umbraco.Cms.Infrastructure.PublishedCache.PublishedSnapshotService.<EnsureCaches>b__47_0()

Error message when Umbraco.CMS.NuCache.BTreeBlockSize > 8192 (16384):

InvalidDataException: Found invalid data while decoding.
CSharpTest.Net.IO.TransactedCompoundFile+FileSection.Read(ref BlockRef block, bool headerOnly, FGet fget)
CSharpTest.Net.IO.TransactedCompoundFile.Read(uint handle)
CSharpTest.Net.Storage.BTreeFileStoreV2.TryGetNode<TNode>(IStorageHandle handleIn, out TNode node, ISerializer<TNode> serializer)
CSharpTest.Net.Collections.BPlusTree<TKey, TValue>+NodeCacheNone.Lock(NodePin parent, LockType ltype, NodeHandle child)
CSharpTest.Net.Collections.BPlusTree<TKey, TValue>.Insert<T>(NodePin thisLock, TKey key, ref T value, NodePin parent, int parentIx)
CSharpTest.Net.Collections.BPlusTree<TKey, TValue>.Insert<T>(NodePin thisLock, TKey key, ref T value, NodePin parent, int parentIx)
CSharpTest.Net.Collections.BPlusTree<TKey, TValue>.Insert<T>(NodePin thisLock, TKey key, ref T value, NodePin parent, int parentIx)
CSharpTest.Net.Collections.BPlusTree<TKey, TValue>.AddEntry<T>(TKey key, ref T info)
Umbraco.Cms.Infrastructure.PublishedCache.ContentStore.Release(WriteLockInfo lockInfo, bool commit)
Umbraco.Cms.Infrastructure.PublishedCache.ContentStore+ScopedWriteLock.Release(bool completed)
Umbraco.Cms.Core.Scoping.ScopeContextualBase.Dispose()
Umbraco.Cms.Infrastructure.PublishedCache.PublishedSnapshotService.LockAndLoadContent(Func<bool> action)
Umbraco.Cms.Infrastructure.PublishedCache.PublishedSnapshotService.<EnsureCaches>b__47_0()

Steps to reproduce

  1. Increase Umbraco.CMS.NuCache.BTreeBlockSize to 16384 in the appSettings.json file
  2. Generate enough content in a node that its serialised length in the NuCache exceeds what fits in 3 bytes (16,777,216).
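
The 16,777,216 figure in step 2 is 2^24, the number of distinct values a 3-byte length can hold. Below is a minimal sketch of that arithmetic, assuming (as the step states) that the length is held in 3 bytes. The names are illustrative and not taken from CSharpTest.Net.Collections, and the exact boundary the library enforces is not confirmed here.

```python
# Illustrative only: these names are not part of CSharpTest.Net.Collections.
LENGTH_FIELD_BYTES = 3                  # assumption taken from the reproduction step
LIMIT = 2 ** (8 * LENGTH_FIELD_BYTES)   # 2^24 = 16,777,216


def exceeds_limit(serialised_length: int) -> bool:
    """Return True if the length cannot be represented in 3 bytes."""
    # The largest value that fits in 3 bytes is LIMIT - 1.
    return serialised_length > LIMIT - 1


reported = 33_589_226  # serialised size quoted in the bug summary
print(LIMIT, exceeds_limit(reported), round(reported / LIMIT, 2))
```

The reported node is roughly twice the size a 3-byte length could describe, which is consistent with the failure appearing regardless of the block size chosen.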

Expected result / actual result

NuCache should be populated successfully.

If the data is too large for NuCache, then a meaningful and helpful error should be logged instead of the default ArgumentOutOfRangeException.


_This item has been added to our backlog AB#17823_

robertjf commented 2 years ago

This issue is also discussed in #8447 which was closed due to inactivity.

robertjf commented 2 years ago

@nul800sebastiaan any feedback on this? would really like to be able to use the local database :)

robertjf commented 2 years ago

Just reading the Umbraco 10 notes - is it feasible to use SQLite as the backing store file for NuCache?

nul800sebastiaan commented 2 years ago

I see the naming on the documentation page is a bit unfortunate: IgnoreLocalDb = true doesn't refer to Umbraco data being in a Microsoft LocalDb instance but to storing the NuCache files on disk (which is a local database). So, the answer is no: SQLite will replace SQL CE, but not for storing caches. I wouldn't want to store a cache in yet another relational database, I'd say.

I did ask around about this but the feedback isn't quite helpful for you I'm afraid. Ultimately, yes, we want to move away from the current NuCache storage mechanism. Our only option at the moment would be to fork the library you linked to and that's not something we're ready to take on.

So that leaves us with the last option of throwing a better error message, but that also doesn't solve all that much. However, the error could mention that this much data can't be stored in NuCache and that to fix it you need to turn off storing the NuCache on disk.

I'm a bit surprised though, as there are plenty of sites now with large amounts of block data; you'd think a lot more people would be running into this issue by now.

nul800sebastiaan commented 2 years ago

> I have a page with around 10 blocks of content in the block editor, the length of the serialised data attempting to be saved in the NuCache is 33,589,226 bytes.

Is my math off or is that.. 33 megabytes of text? What is in those blocks? 😅

robertjf commented 2 years ago

Hey @nul800sebastiaan yes, I know it's not using a Relational database right now :). Was just spitballing on alternatives for file storage.

I also get that Umbraco don't want to take on maintaining the BTree implementation that is currently being used; it's a pity it's been abandoned effectively.

Re the comment about the content size - yes, you read right - one of the data items was actually that long! What I've noticed, but have yet to track down how and why, is that the more you save the content, the more the block data seems to grow. I suspect it's something to do with the serialisation, because a serialised block data object contains a lot of escaped slashes - I posted about this on the Contentment repository (https://github.com/leekelleher/umbraco-contentment/issues/206) before realising it was actually an issue with the Block List.

Here's an (incomplete) example of actual data from one of the elements in a list:


  {
    "contentTypeKey": "07a02735-44df-48f1-a572-1212cfb2b7e4",
    "udi": "umb://element/e3f318e96ba143de883df144b6bebe48",
    "justify": ""\"\\\"\\\\\\\"\\\\\\\\\\\\\\\"\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"Center\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"\\\\\\\\\\\\\\\"\\\\\\\"\\\"\"",
    "underlinedText": "",
    "underlineColour": {
      "value": "ff80a9",
      "label": "Love",
      "sortOrder": 3,
      "id": "4"
    },
    "underlineStyle": ["ChalkLine1"],
    "blockPartitionStyle": ""
  }

It seems that the best approach to this problem may be to work out why serialising a block produces so much "chatter" like this.
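
One way to see how a value like "justify" above could balloon is to serialise an already-serialised string again on each save. The following is a minimal sketch of that hypothesis only. It uses Python's standard `json` module rather than Umbraco's serialiser, and the helper names are made up for illustration. It does not confirm that this is what the Block List actually does.

```python
import json


def reserialise(value: str, rounds: int) -> str:
    """Serialise an already-serialised string `rounds` more times."""
    for _ in range(rounds):
        # Each pass wraps the text in quotes and escapes every
        # existing quote and backslash, so escapes roughly double.
        value = json.dumps(value)
    return value


def unwrap(value: str, rounds: int) -> str:
    """Undo `rounds` passes of reserialise()."""
    for _ in range(rounds):
        value = json.loads(value)
    return value


original = "Center"
for rounds in range(6):
    encoded = reserialise(original, rounds)
    print(rounds, len(encoded))
```

Because every pass escapes the escapes added by the previous one, the length grows roughly exponentially with the number of passes, which would match block data that grows the more often the content is saved.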

bergmania commented 1 year ago

~Fixed by updated dependency in 11.1~

bergmania commented 1 year ago

Postponed to v12, due to breaking change in the dependency

tubbecool commented 1 year ago

Just got this error when updating to the v12 release candidate. Did you solve the error?

{"@t":"2023-06-02T07:39:23.2007687Z","@mt":"Database configuration failed","@l":"Error","@x":"System.IO.InvalidDataException: Found invalid data while decoding.\r\n at CSharpTest.Net.IO.TransactedCompoundFile.FileSection.Read(BlockRef& block, Boolean headerOnly, FGet fget)\r\n at CSharpTest.Net.IO.TransactedCompoundFile.Read(UInt32 handle)\r\n at CSharpTest.Net.Storage.BTreeFileStoreV2.OpenRoot(Boolean& isNew)\r\n at CSharpTest.Net.Collections.BPlusTree2.StorageCache.OpenRoot(Boolean& isNew)\r\n at CSharpTest.Net.Collections.BPlusTree2.NodeCacheNone.LoadStorage()\r\n at CSharpTest.Net.Collections.BPlusTree2.NodeCacheBase.Load()\r\n at CSharpTest.Net.Collections.BPlusTree2..ctor(BPlusTreeOptions2 ioptions)\r\n at CSharpTest.Net.Collections.BPlusTree2..ctor(OptionsV2 optionsV2)\r\n at Umbraco.Cms.Infrastructure.PublishedCache.DataSource.BTree.GetTree(String filepath, Boolean exists, NuCacheSettings settings, ContentDataSerializer contentDataSerializer)\r\n at Umbraco.Cms.Infrastructure.PublishedCache.PublishedSnapshotService.MainDomRegister()\r\n at Umbraco.Cms.Core.Runtime.MainDom.Register(Action install, Action release, Int32 weight)\r\n at Umbraco.Cms.Infrastructure.PublishedCache.PublishedSnapshotService.<EnsureCaches>b__52_0()\r\n at System.Threading.LazyInitializer.EnsureInitializedCore[T](T& target, Boolean& initialized, Object& syncLock, Func1 valueFactory)\r\n at System.Threading.LazyInitializer.EnsureInitialized[T](T& target, Boolean& initialized, Object& syncLock, Func1 valueFactory)\r\n at Umbraco.Cms.Infrastructure.PublishedCache.PublishedSnapshotService.EnsureCaches()\r\n at Umbraco.Cms.Infrastructure.PublishedCache.PublishedSnapshotService.Notify(JsonPayload[] payloads, Boolean& draftChanged, Boolean& publishedChanged)\r\n at Umbraco.Cms.Core.Cache.ContentCacheRefresher.NotifyPublishedSnapshotService(IPublishedSnapshotService service, AppCaches appCaches, JsonPayload[] payloads)\r\n at Umbraco.Cms.Core.Cache.ContentCacheRefresher.Refresh(JsonPayload[] 
payloads)\r\n at Umbraco.Cms.Infrastructure.Sync.ServerMessengerBase.DeliverLocal[TPayload](ICacheRefresher refresher, TPayload[] payload)\r\n at Umbraco.Cms.Infrastructure.Sync.ServerMessengerBase.Deliver[TPayload](ICacheRefresher refresher, TPayload[] payload)\r\n at Umbraco.Cms.Infrastructure.Sync.ServerMessengerBase.QueueRefresh[TPayload](ICacheRefresher refresher, TPayload[] payload)\r\n at Umbraco.Cms.Core.Cache.DistributedCache.RefreshByPayload[TPayload](Guid refresherGuid, TPayload[] payload)\r\n at Umbraco.Extensions.DistributedCacheExtensions.RefreshAllContentCache(DistributedCache dc)\r\n at Umbraco.Extensions.DistributedCacheExtensions.RefreshAllPublishedSnapshot(DistributedCache dc)\r\n at Umbraco.Cms.Infrastructure.Migrations.MigrationPlanExecutor.RebuildCache()\r\n at Umbraco.Cms.Infrastructure.Migrations.MigrationPlanExecutor.ExecutePlan(MigrationPlan plan, String fromState)\r\n at Umbraco.Cms.Infrastructure.Migrations.Upgrade.Upgrader.Execute(IMigrationPlanExecutor migrationPlanExecutor, ICoreScopeProvider scopeProvider, IKeyValueService keyValueService)\r\n at Umbraco.Cms.Infrastructure.Migrations.Install.DatabaseBuilder.UpgradeSchemaAndData(UmbracoPlan plan)","SourceContext":"Umbraco.Cms.Infrastructure.Migrations.Install.DatabaseBuilder","ActionId":"4a69b790-d129-4a31-b256-ac7b8faa3ca1","ActionName":"Umbraco.Cms.Web.BackOffice.Install.InstallApiController.PostPerformInstall (Umbraco.Web.BackOffice)","RequestId":"800001c4-0004-e800-b63f-84710c7967bb","RequestPath":"/install/api/PostPerformInstall","ProcessId":36672,"ProcessName":"iisexpress","ThreadId":14,"ApplicationId":"423ce9db6e875136f60553bbd68b14db56b16ac4","MachineName":"STOL044","Log4NetLevel":"ERROR","HttpRequestId":"10cbd19c-eea9-4993-b52f-07691e77ec94","HttpRequestNumber":4,"HttpSessionId":"edda9b80-d09f-a901-39ed-ce7902d5b2c4"}

bergmania commented 1 year ago

Hi @tubbecool..

Where are the NuCache files stored? There should be a migration that removes the old files, so new ones will be built with the new block size.

tubbecool commented 1 year ago

I guess it's stored in the database when the application is active. I don't know actually. How do I check this when running on IISExpress?

Now I just went with the IgnoreLocalDb = true to get along with the upgrade.

tubbecool commented 1 year ago

> I guess it's stored in the database when the application is active. I don't know actually. How do I check this when running on IISExpress?
>
> Now I just went with the IgnoreLocalDb = true to get along with the upgrade.

I solved my NuCache corruption error after upgrading to Umbraco 12 by deleting all files in C:\Users\{ComputerName}\AppData\Local\Temp\UmbracoData. This is where my NuCache files were stored.

For deployed environments I found the NuCache in C:\Windows\Temp\UmbracoData (IIS, Windows Server)

PrettyDevelopers commented 7 months ago

If anyone still has issues with this despite trying everything above...

Go to the Umbraco folder - Data - TEMP and empty it. The NuCache folder in there is the one causing the issue, but just clear everything. It should then load.
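
The cleanup described in the last few comments (emptying the folder that holds the NuCache files) can be scripted. This is a hedged sketch, not an Umbraco tool: `clear_cache_files` is a hypothetical helper, the directory to pass in is whichever of the locations mentioned above applies to your setup, and it is assumed the site is stopped first so the files are not in use. The demo runs against a throwaway temporary directory.

```python
import tempfile
from pathlib import Path


def clear_cache_files(directory: str, dry_run: bool = True) -> list[str]:
    """List every file under `directory`; delete them unless dry_run is True."""
    root = Path(directory)
    if not root.is_dir():
        return []  # nothing to do if the folder does not exist
    affected = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            affected.append(str(path))
            if not dry_run:
                path.unlink()
    return affected


# Demo on a temporary directory standing in for the real cache folder.
with tempfile.TemporaryDirectory() as demo_dir:
    (Path(demo_dir) / "example.db").write_bytes(b"stale")
    listed = clear_cache_files(demo_dir)                   # dry run: report only
    deleted = clear_cache_files(demo_dir, dry_run=False)   # actually delete
    remaining = list(Path(demo_dir).iterdir())
    print(listed == deleted, remaining)
```

Defaulting to a dry run means the first call only reports what would be removed, which is a cheap safeguard before pointing it at a real folder.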