litedb-org / LiteDB

LiteDB - A .NET NoSQL Document Store in a single data file
http://www.litedb.org
MIT License

[QUESTION] Is LiteDB v5 thread-safe for multiple readers/writers on a collection? [BUG?] #2002

Open u1-liquid opened 3 years ago

u1-liquid commented 3 years ago

I'm using LiteDB as a per-process cache for the results of complex queries against an external database. It basically works well, except when multiple threads access a collection for the first time at almost the same moment.

E.g., I have 10 tasks that need to fetch the same data from the external database. They start at almost the same time and run in parallel, 4 at a time. First of all, every task checks the cache before querying the external database, like this:

var data = LiteDatabase.GetCollection("TableName").FindById("QueryHash");
if (data != null) return data;
...

Because the "TableName" collection does not exist at first, one of the tasks executes the query on the external database and then tries to commit the result to LiteDB, like this:

LiteDatabase.BeginTrans();
...
LiteDatabase.GetCollection("TableName").Upsert("QueryHash", data);
...
LiteDatabase.Commit();

The problem happens at this point. In this batch, it seems that only 1 task (apparently the one that executed the query itself) can access the collection; the other 3 tasks get the exception below:

System.Exception: LiteDB ENSURE: page type must be collection page
   at LiteDB.Constants.ENSURE(Boolean conditional, String message)
   at LiteDB.Engine.CollectionPage..ctor(PageBuffer buffer)
   at LiteDB.Engine.BasePage.ReadPage[T](PageBuffer buffer)
   at LiteDB.Engine.Snapshot.ReadPage[T](UInt32 pageID, FileOrigin& origin, Int64& position, Int32& walVersion)
   at LiteDB.Engine.Snapshot.GetPage[T](UInt32 pageID, FileOrigin& origin, Int64& position, Int32& walVersion)
   at LiteDB.Engine.CollectionService.Get(String name, Boolean addIfNotExists, CollectionPage& collectionPage)
   at LiteDB.Engine.Snapshot..ctor(LockMode mode, String collectionName, HeaderPage header, UInt32 transactionID, TransactionPages transPages, LockService locker, WalIndexService walIndex, DiskReader reader, Boolean addIfNotExists)
   at LiteDB.Engine.TransactionService.<CreateSnapshot>g__create|42_0(<>c__DisplayClass42_0& )
   at LiteDB.Engine.TransactionService.CreateSnapshot(LockMode mode, String collection, Boolean addIfNotExists)
   at LiteDB.Engine.QueryExecutor.<>c__DisplayClass10_0.<<ExecuteQuery>g__RunQuery|0>d.MoveNext()
   at LiteDB.BsonDataReader..ctor(IEnumerable`1 values, String collection)
   at LiteDB.Engine.QueryExecutor.ExecuteQuery(Boolean executionPlan)
   at LiteDB.Engine.QueryExecutor.ExecuteQuery()
   at LiteDB.Engine.LiteEngine.Query(String collection, Query query)
   at LiteDB.LiteQueryable`1.ToDocuments()+MoveNext()
   at System.Linq.Enumerable.SelectEnumerableIterator`2.MoveNext()
   at System.Linq.Enumerable.TryGetFirst[TSource](IEnumerable`1 source, Boolean& found)
   at System.Linq.Enumerable.FirstOrDefault[TSource](IEnumerable`1 source)
   at LiteDB.LiteCollection`1.FindById(BsonValue id)

... and the next batches (the remaining 6 tasks) execute without any problem.

I think something goes wrong if a collection is accessed while it is being created, because this does not happen if I wrap LiteDatabase.GetCollection("TableName").FindById("QueryHash") in a read lock and BeginTrans() ~ Commit() in a write lock using ReaderWriterLockSlim.

Do I need to use a lock if multiple readers/writers use a single collection? Or is this some kind of bug?
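
For reference, the ReaderWriterLockSlim workaround mentioned above might look roughly like the sketch below. This is only an illustration under assumptions: LockedCache, GetOrAdd, and the two delegates are hypothetical names, not part of LiteDB, and the LiteDB calls appear only in comments so the sketch compiles against the base class library alone. It does not claim to describe how LiteDB handles locking internally.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not a LiteDB type): guards cache reads with a read lock
// and cache writes with a write lock, so no read can overlap the write that
// creates the collection for the first time.
public sealed class LockedCache<T> where T : class
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

    // Assumed to wrap something like:
    //   id => db.GetCollection("TableName").FindById(id)
    private readonly Func<string, T> _find;

    // Assumed to wrap something like:
    //   db.BeginTrans(); db.GetCollection("TableName").Upsert(id, data); db.Commit();
    private readonly Action<string, T> _store;

    public LockedCache(Func<string, T> find, Action<string, T> store)
    {
        _find = find;
        _store = store;
    }

    public T GetOrAdd(string id, Func<T> loadFromExternalDb)
    {
        // Many readers may hold the read lock at once, but never during a write.
        _lock.EnterReadLock();
        try
        {
            var cached = _find(id);
            if (cached != null) return cached;
        }
        finally
        {
            _lock.ExitReadLock();
        }

        // Run the expensive external query outside of any lock.
        var data = loadFromExternalDb();

        // Only one writer at a time, and no readers while it runs.
        _lock.EnterWriteLock();
        try
        {
            _store(id, data);
        }
        finally
        {
            _lock.ExitWriteLock();
        }
        return data;
    }
}
```

Note that several tasks can still miss the cache at the same time and each run the external query; the locks only keep reads from overlapping the BeginTrans() ~ Commit() section.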

ajweber commented 1 year ago

I would also like to understand if ILiteCollection<> is thread safe for multiple readers/writers in the same process. (Specifically v5.x)

Can one of the devs confirm we do not need to keep re-fetching the same Collection from the LiteDatabase to serve multiple threads?

Thanks.

ajweber commented 1 year ago

It would be nice to hear from the devs on this. There doesn't seem to be much activity on GitHub to indicate any interest in questions, bug reports, or remediation.

I did find that keeping an open collection between multiple threads can cause minor data integrity issues in that if one thread inserts a doc, the collection isn't actually updated yet...possibly until a Checkpoint (or the db is closed). This is a little surprising in that I'm using the same collection object across the threads, so even items in the WAL should be known by the same collection object?

feitzi commented 1 month ago

Any updates on this topic?