litedb-org / LiteDB

LiteDB - A .NET NoSQL Document Store in a single data file
http://www.litedb.org
MIT License

Trim whitespace - Default enabled? (Data manipulation enabled by default?) #822

Open JaisEdelmann opened 6 years ago

JaisEdelmann commented 6 years ago

It seems that by default LiteDB trims whitespace on saved data. This has caused us a bit of a headache trying to figure out what's going on.

I don't think it's a great idea for the database to manipulate, by default, the data you expect to be saving.

This really should be disabled by default. If you ask me, LiteDB should not have any hidden data manipulation :)

mbdavid commented 6 years ago

Hi @JaisEdelmann, yep, it's a very old decision, based on reducing document size as much as possible during serialization.

These three options share the same idea: reduce the serialized size.

/// <summary>
/// Indicates whether the mapper serializes null values (default false, so nulls are omitted)
/// </summary>
public bool SerializeNullValues { get; set; }

/// <summary>
/// Apply .Trim() to strings when serializing (default true)
/// </summary>
public bool TrimWhitespace { get; set; }

/// <summary>
/// Convert EmptyString to Null (default true)
/// </summary>
public bool EmptyStringToNull { get; set; }

I also based this decision on SQL Server (which I use every day at work), which trims values before inserting them into a VARCHAR column.
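For readers who want the stored strings left untouched, here is a minimal sketch of switching these options off. It assumes the LiteDB NuGet package is referenced; the `Note` class, the `notes` collection name, and the in-memory stream are hypothetical and only serve the illustration.

```csharp
using System;
using System.IO;
using LiteDB;

// Hypothetical entity used only for this illustration.
public class Note
{
    public int Id { get; set; }
    public string Text { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Per-instance mapper with the string-shrinking options switched off.
        var mapper = new BsonMapper
        {
            TrimWhitespace = false,    // keep leading and trailing spaces
            EmptyStringToNull = false  // keep "" as an empty string
        };

        // Alternative: change the shared mapper instead of passing one in.
        // BsonMapper.Global.TrimWhitespace = false;

        // An in-memory database, so the sketch does not touch the disk.
        using (var db = new LiteDatabase(new MemoryStream(), mapper))
        {
            var notes = db.GetCollection<Note>("notes");
            notes.Insert(new Note { Id = 1, Text = "  padded  " });

            var stored = notes.FindById(1);
            // Brackets make any surviving whitespace visible.
            Console.WriteLine($"[{stored.Text}]");
        }
    }
}
```

Passing a mapper to the constructor keeps the change local to one database instance, whereas the global mapper affects every database that does not receive its own.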

milleniumbug commented 6 years ago

Just got bitten by this. I believe these defaults are very unfortunate, as they break the expectation that a round trip deserialize(serialize(object)) doesn't change the object's state.
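The round-trip expectation can be checked without a database at all, using only the mapper. The sketch below is an illustration under the assumption that the defaults are as quoted earlier in the thread (`TrimWhitespace` true); the `Note` class is hypothetical.

```csharp
using System;
using LiteDB;

// Hypothetical entity used only for this illustration.
public class Note
{
    public int Id { get; set; }
    public string Text { get; set; }
}

public static class RoundTripDemo
{
    // Serialize to a BsonDocument and straight back to an object.
    private static Note RoundTrip(BsonMapper mapper, Note note)
    {
        return mapper.ToObject<Note>(mapper.ToDocument(note));
    }

    public static void Main()
    {
        var original = new Note { Id = 1, Text = "  padded  " };

        // Mapper with its default settings.
        var viaDefaults = RoundTrip(new BsonMapper(), original);

        // Mapper with trimming explicitly disabled.
        var viaPreserving = RoundTrip(
            new BsonMapper { TrimWhitespace = false }, original);

        // Compare each result with the original string.
        Console.WriteLine(viaDefaults.Text == original.Text);
        Console.WriteLine(viaPreserving.Text == original.Text);
    }
}
```

If the defaults are as described, only the second comparison should hold, which is exactly the surprise reported in this issue.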

oholsen commented 5 years ago

I agree with the above: it is better to opt in to an optimization once one knows it is safe than to have to opt in to the expected behavior.

nightroman commented 4 years ago

I was about to open a bug... but found this and several closed issues with the same explanation "set TrimWhitespace to false".

https://github.com/mbdavid/litedb/issues?q=TrimWhitespace

I wonder if anybody would ever have asked "Why are strings not trimmed?" :)

> I also based this decision in SQL Server (that I use everyday work) that trim results before insert into a VARCHAR column.

Is this really true? Maybe; I do not know, but I would be surprised. In any case, it does not trim TEXT. For example, I have just tried SQLite (the "LiteDB of the SQL world"), and it does not trim NTEXT. MongoDB (LiteDB's inspiration) does not trim strings, right?

I think this design decision is very unfortunate. It is easy to work around and configure, indeed. But many are trapped first, maybe with some pain.

This design decision is also inconsistent: the rule does not apply to stored BsonDocuments. In my scenarios some data come to the DB as typed objects (trimmed) and some logically identical data come as documents (not trimmed). They are then queried back with inconsistent spaces.
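A small sketch of that inconsistency, assuming trimming is applied by the mapper when it serializes typed objects (as the option's description suggests) and that a hand-built `BsonDocument` never passes through that step. The `Note` class and the field values are hypothetical.

```csharp
using System;
using LiteDB;

// Hypothetical entity used only for this illustration.
public class Note
{
    public int Id { get; set; }
    public string Text { get; set; }
}

public static class InconsistencyDemo
{
    public static void Main()
    {
        var mapper = new BsonMapper(); // default settings

        // Path 1: a typed object converted by the mapper.
        BsonDocument typed = mapper.ToDocument(
            new Note { Id = 1, Text = "  padded  " });

        // Path 2: logically the same data, built directly as a document.
        var raw = new BsonDocument
        {
            ["_id"] = 2,
            ["Text"] = "  padded  "
        };

        // Brackets make the whitespace difference visible.
        Console.WriteLine($"typed: [{typed["Text"].AsString}]");
        Console.WriteLine($"raw:   [{raw["Text"].AsString}]");
    }
}
```

Under those assumptions the two documents end up holding different strings for the same logical value, so a later query sees mixed spacing.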

Why do strings in user data have leading and trailing spaces in the first place? I would say, because they are needed. Special cases are up to the app to trim. LiteDB is lightweight by design. It should avoid excessive features in general, and unexpected ones especially.

nightroman commented 4 years ago

Maybe a dedicated poll can help to find out what users want? (Unless this is the author's final preference, which we should of course respect.)

Joy-less commented 2 days ago

I agree that this is a strange default for LiteDB, since whitespace is generally considered significant in C# strings. I would like to change this functionality in future LiteDB versions, even if it breaks backwards compatibility.