jonwagner / Insight.Database

Fast, lightweight .NET micro-ORM
Other
861 stars 145 forks source link

CreateClassDeserializerDynamicMethod: Unexpected behavior when all fields in a record are DbNull. #446

Closed dbakuntsev closed 4 years ago

dbakuntsev commented 4 years ago

When all fields in a record are DbNull, the dynamic method generated by CreateClassDeserializerDynamicMethod() returns a default(T), which would be a null reference for any reference type. The expected behavior is to have a new instance with all mapped fields and properties initialized to their default values.

A workaround is to use value types (i.e. structs), or check the output and replace null references with new empty instances in the caller code.

https://github.com/jonwagner/Insight.Database/blob/6685b3a68e94b6047a5296fc2bda366f968964d6/Insight.Database.Core/CodeGenerator/ClassDeserializerGenerator.cs#L201-L212

jonwagner commented 4 years ago

This was a design choice for the common use case where sub-objects are selected with LEFT JOINS, and a missed join returns dbnull for all of the columns. The expected behavior in that case is for the sub-object to be null. It's documented in the wiki that way, so people are probably relying on the current behavior.

But you do have a point that there are some cases where you would rather have a populated object. I'm not sure how we can determine that without breaking the original behavior. Any suggestions?

dbakuntsev commented 4 years ago

I agree it makes perfect sense for the right (non-mandatory) part of a left join; it doesn't make any sense for the left (mandatory) part of the same join, though.

As implemented, this is a bug, because the null reference is returned for a record that has an ID while all other values are DbNull (that is, the record exists). If the right side of the non-mandatory part of a left join is missing, it wouldn't have an ID, either, and everything will be DbNull (that is, the record does not exist). Maybe this is the determining factor you are looking for?

jonwagner commented 4 years ago

Aha - yes I agree if an ID field is returned, you should get an empty object. Can you write up a few cases with different behaviors? I was thinking of merging another PR this week and having some test cases here would speed up fixing this one.

dbakuntsev commented 4 years ago

Here's a few unit tests derived from the bug we've encountered recently.

It appears that the bug is in a specific use case when a subordinate set of records is loaded and matches to the primary set of records (via a multiple result sets mechanism rather than a join). The subordinate record has a non-null foreign key ID, but the remainder of columns is null. Again, if the record exists, perhaps it should resolve to a non-null reference.

class TestDataPrimary
{
    public int ID { get; set; }
    public int? PrimaryField1 { get; set; }
    public IList<TestDataForeign> ForeignRecords { get; set; }
}

class TestDataForeign
{
    public int? ForeignField1 { get; set; }
}

[Test]
public void Test1_WorksOK()
{
    var results = Connection().QuerySql(@"
        CREATE TABLE #TestDataPrimary ([ID] [int] NOT NULL, [PrimaryField1] [int])
        CREATE TABLE #TestDataForeign ([PrimaryID] [int] NOT NULL, [ForeignField1] [int])
        INSERT INTO #TestDataPrimary (ID, PrimaryField1) VALUES (1, 222)
        INSERT INTO #TestDataForeign (PrimaryID, ForeignField1) VALUES (1, 333)
        SELECT * FROM #TestDataPrimary t;
        SELECT * FROM #TestDataForeign s
", null, Query.Returns(Some<TestDataPrimary>.Records).ThenChildren(Some<TestDataForeign>.Records, id: a => a.ID));

    Assert.IsNotNull(results);
    Assert.AreEqual(1, results.Count);

    var testDataPrimary = results[0];
    Assert.AreEqual(1, testDataPrimary.ID, "ID should not be overwritten by sub-object id");
    Assert.AreEqual(222, testDataPrimary.PrimaryField1);

    Assert.IsNotNull(testDataPrimary.ForeignRecords);
    Assert.AreEqual(1, testDataPrimary.ForeignRecords.Count);
    Assert.IsNotNull(testDataPrimary.ForeignRecords[0]);
    Assert.AreEqual(333, testDataPrimary.ForeignRecords[0].ForeignField1);
}

[Test]
public void Test2_Fails()
{
    var results = Connection().QuerySql(@"
        CREATE TABLE #TestDataPrimary ([ID] [int] NOT NULL, [PrimaryField1] [int])
        CREATE TABLE #TestDataForeign ([PrimaryID] [int] NOT NULL, [ForeignField1] [int])
        INSERT INTO #TestDataPrimary (ID, PrimaryField1) VALUES (1, 222)
        INSERT INTO #TestDataForeign (PrimaryID, ForeignField1) VALUES (1, NULL)
        SELECT * FROM #TestDataPrimary t;
        SELECT * FROM #TestDataForeign s
", null, Query.Returns(Some<TestDataPrimary>.Records).ThenChildren(Some<TestDataForeign>.Records, id: a => a.ID));

    Assert.IsNotNull(results);
    Assert.AreEqual(1, results.Count);

    var testDataPrimary = results[0];
    Assert.AreEqual(1, testDataPrimary.ID, "ID should not be overwritten by sub-object id");
    Assert.AreEqual(222, testDataPrimary.PrimaryField1);

    Assert.IsNotNull(testDataPrimary.ForeignRecords);
    Assert.AreEqual(1, testDataPrimary.ForeignRecords.Count);
    Assert.IsNotNull(testDataPrimary.ForeignRecords[0]);
    Assert.IsNull(testDataPrimary.ForeignRecords[0].ForeignField1);
}

[Test]
public void Test3_WorksOK()
{
    var results = Connection().QuerySql(@"
        CREATE TABLE #TestDataPrimary ([ID] [int] NOT NULL, [PrimaryField1] [int])
        CREATE TABLE #TestDataForeign ([PrimaryID] [int] NOT NULL, [ForeignField1] [int])
        INSERT INTO #TestDataPrimary (ID, PrimaryField1) VALUES (1, NULL)
        INSERT INTO #TestDataForeign (PrimaryID, ForeignField1) VALUES (1, 333)
        SELECT * FROM #TestDataPrimary t;
        SELECT * FROM #TestDataForeign s
", null, Query.Returns(Some<TestDataPrimary>.Records).ThenChildren(Some<TestDataForeign>.Records, id: a => a.ID));

    Assert.IsNotNull(results);
    Assert.AreEqual(1, results.Count);

    var testDataPrimary = results[0];
    Assert.AreEqual(1, testDataPrimary.ID, "ID should not be overwritten by sub-object id");
    Assert.IsNull(testDataPrimary.PrimaryField1);

    Assert.IsNotNull(testDataPrimary.ForeignRecords);
    Assert.AreEqual(1, testDataPrimary.ForeignRecords.Count);
    Assert.IsNotNull(testDataPrimary.ForeignRecords[0]);
    Assert.AreEqual(333, testDataPrimary.ForeignRecords[0].ForeignField1);
}

For the reference, line Assert.IsNotNull(testDataPrimary.ForeignRecords[0]); is breaking the failing unit test.

Furthermore, even if PrimaryID is introduced into TestDataForeign class, it won't be mapped (its value will remain default - zero), and it won't affect the outcome of Test2_Fails unit test.

class TestDataForeign
{
    public int PrimaryID { get; set; }
    public int? ForeignField1 { get; set; }
}

I'm not entirely sure why it shouldn't be mapped, but this is perhaps a subject for a different discussion.

jonwagner commented 4 years ago

Aha - that's the key. When a sub-object is read and the linking ID isn't part of the structure, we actually wrap it in a structure called a Guardian:

class Guardian<...> {
      public <> ID;
      public SubObject Object;
}

So even though there is an ID in the record, we're not using that to determine if the join failed (ID is null) and if it's an empty object (ID is not null). Definitely can be fixed... and then I'll look at why it's not mapping the PrimaryID when you add it. It's probably somewhat related.

jonwagner commented 4 years ago

More details...There are different code paths that get executed depending on the structure of the sub-object.

        {
            public int? ForeignField1 { get; set; }
        }

^ This object doesn't have a parent ID in it, so we wrap it in a Guardian<,>. And there was a bug where the sub-object was skipped when all of the other fields were null.

                public class TestDataForeign
        {
            public int PrimaryID { get; set; }
            public int? ForeignField1 { get; set; }
        }

^ In this case, it looks like we have an ID but Insight can't determine that PrimaryID is the parent's ID. There are rules on the wiki that explain the auto-detection. In this case, we still wrap it in Guardian<,> and the bug still is in effect.

        {
            [ParentRecordId]
            public int PrimaryID { get; set; }
            public int? ForeignField1 { get; set; }
        }

^ Here we've explicitly tagged the field as a parent ID. Since we have an ID field, Insight doesn't wrap the sub-object in a Guardian<,> and the null bug doesn't surface.

I think there are other ways of signalling that PrimaryID is the parent's ID, such as passing a callback.

jonwagner commented 4 years ago

I've pushed the fix to the Guardian<,> cases above into the main branch and all test cases pass.

See if that fixes things for you. I'll do a build sometime in the next few days.

dbakuntsev commented 4 years ago

Thank you for quick investigation and code changes!

Oh, my. Had we just used [ParentRecordId] decorator, we wouldn't have stumbled into this condition in the first place. Well, happy to be of assistance finding a really really really obscure bug :)

With the new change, I see that [ParentRecordId] decorator is still required for the property PrimaryID to receive a value. This is unexpected behavior as well; that is, the expectation is if there's a column in a result set, it should be mapped onto a property of the same name in the target object. Is there a reason why it is not mapped under the circumstances?

jonwagner commented 4 years ago

In the case where we're merging a multi-recordset into parent/child relationship, we have to pick a field in the sub-object to map to the parent record. The logic works well in most cases. It's documented in the wiki somewhere but it's something like:

  1. Explicit mapping via ParentRecordId, a delegate passed to ThenChildren, or specified in the Recordset attribute for auto-interface implementation.
  2. Guess by name like ParentID or <ParentClass>ID - there are a few rules here that cover common scenarios. You can see these in ChildMapperHelper.cs
  3. Assume the first column is the parent ID, but there's no field in the corresponding .net object, so we create a temporary field in the Guardian structure to hold it.

In case #3, the field isn't available for mapping into the sub-object. :thinking: There might be a way to let the field be available for the sub-object as well. Or using forward-only record readers might prevent it. Let me re-open this and see if I can fix that while I'm here.

jonwagner commented 4 years ago

As I expected, using SequentialRead keeps us from re-using the column. I'll play around with some additional tweaking...

jonwagner commented 4 years ago

The bug is fixed in 6.3.2 (released).

I'll keep looking into the SequentialRead/first column issue, but since it's somewhat of an edge case with plenty of options for explicit configurations, I won't rush through a big refactor to make it happen.