mbdavid / LiteDB

LiteDB - A .NET NoSQL Document Store in a single data file
http://www.litedb.org
MIT License

[BUG] corrupt or damaged records version 5.0.5 and newer #1603

Open seertenedos opened 4 years ago

seertenedos commented 4 years ago

Version: Running on Linux in Docker, using the master branch up to the commit below, as I needed one of the other fixes. https://github.com/mbdavid/LiteDB/commit/96e4ee71c0250e1349d13e86730e5d62319f715d

Describe the bug: When the database is in use, I am finding that some of my records get damaged. It seems to relate to fields of type Dictionary<string, Guid> on the class I am saving, but a standard unit test can't reproduce it. Further details are in the question I opened (https://github.com/mbdavid/LiteDB/issues/1600). I started debugging the read path, since I can't easily reproduce the write issue in a dev environment, and it looks like a record gets overwritten by another, or something strange like a wrong byte causes things to get mixed up.

Code to Reproduce: Use the attached database (damaged database.zip) and the code below.

            using (var db = new LiteDatabase(@"O:\drive issues\VirtualFileSystem_litedb_stuffed.db"))
            {
                var col = db.GetCollection<VirtualDirectory>("directories");
                var dir = col.FindById(Guid.Parse("3e70872d-30cf-47f9-b360-782b034b6b95"));
            }

        public class VirtualDirectory
        {
            public Guid Id { get; set; }
            public ulong Size { get; set; }
            public uint RefCount { get; set; }
            public ushort Permission { get; set; }
            public DateTime ATime { get; set; }
            public DateTime MTime { get; set; }
            public DateTime CTime { get; set; }

            //info about dir ref count being 2 for empty directory
            //https://unix.stackexchange.com/a/153640

            public Dictionary<string, Guid> Files { get; set; } = new Dictionary<string, Guid>();
            public Dictionary<string, Guid> Directories { get; set; } = new Dictionary<string, Guid>();

            internal VirtualDirectory()
            {
            }
        }

Expected behavior: The instance of the class should be returned correctly from the database.

Screenshots/Stacktrace: System.NotSupportedException: BSON type not supported at LiteDB.Engine.BufferReader.ReadElement(HashSet`1 remaining, String& name)

lbnascimento commented 4 years ago

@seertenedos Are you writing from multiple threads? Could you provide the code used for writing and the classes for the other collections? Any other info you could give us to help reproduce the issue?

lbnascimento commented 4 years ago

@seertenedos Could you provide us the code you use to access your datafile (or at least the relevant part of it)? You can send it to me privately at lbnascimento@inf.ufrgs.br if you prefer.

seertenedos commented 4 years ago

@lbnascimento I was going to send you an invite to the Bitbucket repo, but I think it will be better to just send the classes over email, as it is easier to find the relevant code. Yes, it is multithreaded. The email should arrive shortly.

lbnascimento commented 4 years ago

@seertenedos I made a small multithreaded program that used your Virtual File System (without the real disk IO, of course, only the LiteDB part) and I could not reproduce the issue. I left the program running for about an hour.

Could you provide some more examples of how your VFS is being used when the issue happens? Some sample code would be useful. (Sorry about all the requests, but this issue is really hard to reproduce.)

seertenedos commented 4 years ago

@lbnascimento I can reproduce the issue in my running environment in about 10 minutes, but I can't do it in an environment I can debug. I am working on a new version without the transactions, where I do my own write locking but take no read locks unless the read is linked to an update. One thing that confused me about the old design was whether the version I read in a transaction was locked for that transaction, since there does not seem to be an atomic update method. Anyway, I hope to get the new version tested in the next 24 hours to see if it suffers from the same issue. I will also email you some app logs, as they will give you an idea of all the transactions and roughly what calls are being made.

seertenedos commented 4 years ago

Also, yes, I had the same issue of not being able to reproduce it in a dev environment.

seertenedos commented 4 years ago

@lbnascimento By using locks around all my write transactions, the issue seems to have gone away. I will update my code to work that way when dealing with LiteDB, and it should be fine.

seertenedos commented 4 years ago

@lbnascimento I spoke too soon; it seems I was not catching all exceptions. Sadly, I think I need to look at another DB for now, as I keep hitting too many bugs in LiteDB and I am spending more time trying to find them and code around them than on my actual application. The project is great, but there really needs to be a clear difference between the stable releases and the beta ones, and fixes for bad releases like the latest one need to be released a little faster for testing.

henon commented 4 years ago

@seertenedos You could try version 4.1.4. I have been using it for a long time without any problems.

henon commented 4 years ago

> @lbnascimento by using locks around all my write transactions the issue seems to have gone away. I will update my code to function that way when dealing with LiteDB and it should be fine.

I second that. I also got corrupted collections until I started locking rigorously in v5.0.7.
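
For readers wondering what "locking rigorously" could look like in practice, here is a minimal sketch. It assumes the simplest possible scheme, one exclusive lock around every database call, which is not necessarily what anyone in this thread actually implemented. `SerializedDb`, `Query` and `Execute` are hypothetical names and not part of LiteDB.

```csharp
using System;

// Hypothetical helper (not part of LiteDB): funnels every database call,
// reads and writes alike, through a single exclusive lock.
public sealed class SerializedDb
{
    private readonly object _sync = new object();

    // Runs an operation that returns a value (for example a lookup by id).
    public T Query<T>(Func<T> operation)
    {
        lock (_sync)
        {
            return operation();
        }
    }

    // Runs an operation with no result (for example an insert or update).
    public void Execute(Action operation)
    {
        lock (_sync)
        {
            operation();
        }
    }
}
```

With the `col` collection from the repro code above, a write would become `gate.Execute(() => col.Upsert(dir))` and a read `var dir = gate.Query(() => col.FindById(id))`, where `gate` is one shared `SerializedDb` instance. This trades all concurrency for safety, so it only suits workloads that do not need throughput.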

seertenedos commented 4 years ago

@henon I am thinking of coming back to LiteDB and using v4.1.4 as you recommend. Since the docs now cover the newer version, I was wondering: with the older version, did you need to do any locking yourself, or did the database manage it fine on its own? In my use case I have multiple threads reading and writing, with about twice as many readers as writers. Also, if you needed manual locking, did you lock just writers or readers as well?

Lastly, for v5.0.7, did you lock just for writers or for reading as well? I am basically running a filesystem on top of it, so I need something that can take load but also does not corrupt itself. I have been using a ConcurrentDictionary recently, which has been working well, but it uses a lot of memory and has serialization overhead to keep a copy that is always a little out of date.

henon commented 4 years ago

@seertenedos: In my v4.1.4 use case I did no locking whatsoever! Now with v5.0.7, just to be sure, I locked everything, even the DB connection. That removed all issues for me: no more corruption since (on Azure; in local tests I never had any problems).

I suggest you write a test program that reads and writes the hell out of the DB from multiple threads. Then you can see the difference of not locking vs locking everything vs locking only writes. I didn't do that because I don't need performance, but I would still be interested in the results if you would share them with us.
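
A test program along those lines could be sketched as below. This is only an outline under stated assumptions: `StressTest.Run` is a hypothetical harness, not part of LiteDB, and the database calls are passed in as delegates so the same harness can be run once with no locking, once with everything locked, and once with only writes locked.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical harness (not part of LiteDB) for hammering a store from many threads.
public static class StressTest
{
    // Runs `writers` write loops and `readers` read loops concurrently for `duration`.
    // Each delegate receives a pseudo-random key in [0, 1000).
    // Returns the number of attempted operations and every exception that was thrown.
    public static (long Operations, Exception[] Failures) Run(
        Action<int> write, Action<int> read, int writers, int readers, TimeSpan duration)
    {
        var failures = new ConcurrentQueue<Exception>();
        long operations = 0;
        var clock = Stopwatch.StartNew();

        Task Loop(Action<int> body, int seed) => Task.Run(() =>
        {
            var random = new Random(seed);
            while (clock.Elapsed < duration)
            {
                try { body(random.Next(1000)); }
                catch (Exception ex) { failures.Enqueue(ex); } // record, keep going
                Interlocked.Increment(ref operations);          // counts attempts
            }
        });

        var tasks = Enumerable.Range(0, writers).Select(i => Loop(write, i))
            .Concat(Enumerable.Range(0, readers).Select(i => Loop(read, writers + i)))
            .ToArray();

        Task.WaitAll(tasks);
        return (Interlocked.Read(ref operations), failures.ToArray());
    }
}
```

To mirror the workload described earlier in the thread, the write delegate would upsert a document and the read delegate would call `FindById`, with about twice as many readers as writers. Comparing the failure lists across the three locking strategies would show which one, if any, avoids the "BSON type not supported" exception.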

asmejkal commented 4 years ago

Having the same issue since I've upgraded to v5. Similarly to the OP, it seems to be related to a Dictionary<string, BsonDocument> property (in C# it's a Dictionary<ulong, class>, but the ulong seems to get serialized as a plain string). I suspect this, because it only happens to this specific collection.

Some of the documents in this collection are getting randomly corrupted, which prevents them from deserializing. This happens to a few specific documents, while all the other documents in the collection can be deserialized just fine (despite also having data in the dictionary). I can export the document to JSON, fix it manually and Upsert it back into the collection. It works fine for a few minutes, but then it gets corrupted again. Same as OP, this is a multithreaded program.