mbdavid / LiteDB

LiteDB - A .NET NoSQL Document Store in a single data file
http://www.litedb.org
MIT License
8.52k stars · 1.24k forks

Document size limit to be increased to 16Mb? #25

Closed sherry-ummen closed 9 years ago

sherry-ummen commented 9 years ago

Hello,

Is it possible to increase the document size limit from 1 MB to 16 MB, like MongoDB has? If not, what is the complication?

Thanks

mbdavid commented 9 years ago

Hello @sherry-ummen! It's easy to change this limit from 1 MB to 16 MB; you just need to change it here:

https://github.com/mbdavid/LiteDB/blob/master/LiteDB/Document/BsonDocument.cs#L14
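The linked line defines the limit as a constant, roughly like this (the constant name and how the value is computed are assumptions for illustration; the linked source line is authoritative):

```csharp
// Sketch of the kind of change described above; the actual constant
// in BsonDocument.cs may have a different name or value by version.
public const int MaxDocumentSize = 1024 * 1024; // raise to 16 * 1024 * 1024
```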

But...

The question is: why are your documents so big? I think 1 MB is a really big document, and I always try to keep mine under 100 KB. Big documents consume too much memory (the whole document must be loaded into memory) and are slow in read and write operations. Remember: LiteDB is an embedded database, so this memory is consumed on the "client" side.

  • Can you split your document into more documents? You can use DbRef<T> to "join" them.
  • Does your document contain a file or a byte array? You can use FileStorage.
  • Does your document contain a big text? You can use FileStorage too.

Take a look at the MongoDB documentation about data modeling. All MongoDB data-modeling concepts are valid for LiteDB: http://docs.mongodb.org/manual/core/data-modeling-introduction/

sherry-ummen commented 9 years ago

Thanks, Mauricio.

Yes, the document is text and it's big: basically data related to some graphical objects. It's legacy code that generates big objects, so it's very difficult to change the behavior.

Why is it slow? Is it the serializer that is slow? We are currently using MongoDB, and if the document size is more than 16 MB we store it as a blob.

But we want an embedded database, and LiteDB suits best.


mbdavid commented 9 years ago

There is no problem serializing/deserializing big documents; LiteDB uses TextReader/TextWriter to avoid performance problems. But documents are treated as a single unit, so when you need to read a big document, you must read all its data pages, store them all in memory (CacheService), and deserialize all the bytes. Saving has the same problem: a minimal change requires serializing the whole document and writing all its pages.

FileStorage (like MongoDB's GridFS) works by splitting content across separate documents. To store big files, LiteDB splits the content into 1 MB chunks and stores one at a time. After each chunk, LiteDB clears the cache to avoid using too much memory. https://github.com/mbdavid/LiteDB/blob/master/LiteDB/Database/FileStorage/LiteFileStorage.cs#L49
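The chunking idea can be sketched independently of LiteDB's actual code. This is a minimal illustration of the approach described above (`ChunkedReader` and its members are illustrative names, not LiteDB API):

```csharp
using System;
using System.IO;

// Illustrative sketch of the chunking approach: split a stream into
// fixed-size chunks and handle one at a time, so only ~1 MB is ever
// held in memory. This mirrors the idea in LiteFileStorage, not its
// actual implementation.
static class ChunkedReader
{
    public const int ChunkSize = 1024 * 1024; // 1 MB per chunk

    public static void ForEachChunk(Stream source, Action<byte[], int> handle)
    {
        var buffer = new byte[ChunkSize];
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Each chunk can be stored as its own document, and the
            // page cache cleared afterwards, as the comment describes.
            handle(buffer, read);
        }
    }
}
```

Because each chunk is an independent document, a reader can stream the file back chunk by chunk without ever materializing the whole file in memory.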

sherry-ummen commented 9 years ago

Ok, so reading from all the data pages is the problem? Then that should not be an issue in the case of an SSD? And will memory-mapped I/O help?


mbdavid commented 9 years ago

You will not avoid reading all pages if your document is big and you need all of it. For better performance, an SSD disk is great, and more RAM too.

Reading 16 MB documents is not a big issue if you read one or two documents at a time. If you have many, I recommend "closing" and "re-opening" the database (using (var db = new LiteDatabase(...)) { ... }).

I have plans (it's on my todo list) to implement a better cache service that automatically clears unused cache pages, which would avoid the close/open.
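The close/re-open pattern suggested above can be sketched like this (the connection string, collection name, and batching are illustrative, not from the thread; only the `using`-block pattern itself comes from the comment):

```csharp
using LiteDB;

// Sketch of the recommended pattern: dispose and re-create the
// database between batches of big reads so that cached pages are
// released instead of accumulating in memory.
class BigDocumentReader
{
    public static void ReadInBatches(string[] ids)
    {
        foreach (var id in ids)
        {
            using (var db = new LiteDatabase("mydata.db"))
            {
                var col = db.GetCollection<BsonDocument>("docs");
                var doc = col.FindById(id);
                // ... process the big document here ...
            } // Dispose releases the page cache before the next read
        }
    }
}
```

Re-opening the file has a cost of its own, so in practice this only pays off when the documents are large enough that the cache growth is the dominant problem.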
