DataObjects-NET / dataobjects-net

https://dataobjects.net
MIT License
60 stars 23 forks source link

Bulk Delete() over an EntitySet<T> with heterogeneous causes ReferentialConstraintViolation #292

Open ondrejtucny opened 1 year ago

ondrejtucny commented 1 year ago

The bulk Delete() method fails when deleting members of an EntitySet which contains ancestors of T:

  Expected: No Exception to be thrown
  But was:  <Xtensive.Orm.ReferentialConstraintViolationException: SQL error occured.
Storage error details 'Entity: ServiceLevelOne; Field: Id (FieldInfo);'
SQL error details 'Type: ReferentialConstraintViolation; Database: DO-Tests; Table: ServiceLevelOne; Column: Id; Constraint: FK_ServiceLevelOne_ServiceLevel;'
Query 'DELETE FROM [dbo].[ServiceLevel] WHERE  ([ServiceLevel].[Owner.Id] = @p0_0); [p0_0='1']'
Original message 'The DELETE statement conflicted with the REFERENCE constraint "FK_ServiceLevelOne_ServiceLevel". The conflict occurred in database "DO-Tests", table "dbo.ServiceLevelOne", column 'Id'.
The statement has been terminated.'
 ---> Microsoft.Data.SqlClient.SqlException (0x80131904): The DELETE statement conflicted with the REFERENCE constraint "FK_ServiceLevelOne_ServiceLevel". The conflict occurred in database "DO-Tests", table "dbo.ServiceLevelOne", column 'Id'.
The statement has been terminated.
   at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at Microsoft.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption, Boolean shouldCacheForAlwaysEncrypted)
   at Microsoft.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean isAsync, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest)
   at Microsoft.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry, String method)
   at Microsoft.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry, String methodName)
   at Microsoft.Data.SqlClient.SqlCommand.ExecuteNonQuery()
   at Xtensive.Orm.Providers.StorageDriver.<>c.<ExecuteNonQuery>b__65_0(DbCommand c, CommandBehavior cb) in /_/Orm/Xtensive.Orm/Orm/Providers/StorageDriver.Operations.cs:line 400
   at Xtensive.Orm.Providers.StorageDriver.ExecuteCommand[TResult](Session session, DbCommand command, CommandBehavior commandBehavior, Func`3 action) in /_/Orm/Xtensive.Orm/Orm/Providers/StorageDriver.Operations.cs:line 443
ClientConnectionId:9b2af3d9-c629-4c78-9a81-f3d833a372d4
Error Number:547,State:0,Class:16
   --- End of inner exception stack trace ---
   at Xtensive.Orm.Providers.StorageDriver.ExecuteCommand[TResult](Session session, DbCommand command, CommandBehavior commandBehavior, Func`3 action) in /_/Orm/Xtensive.Orm/Orm/Providers/StorageDriver.Operations.cs:line 448
   at Xtensive.Orm.Providers.StorageDriver.ExecuteNonQuery(Session session, DbCommand command) in /_/Orm/Xtensive.Orm/Orm/Providers/StorageDriver.Operations.cs:line 400
   at Xtensive.Orm.Services.QueryCommand.ExecuteNonQuery() in /_/Orm/Xtensive.Orm/Orm/Services/QueryBuilding/QueryCommand.cs:line 69
   at Xtensive.Orm.BulkOperations.BulkDeleteOperation`1.ExecuteInternal() in /_/Extensions/Xtensive.Orm.BulkOperations/Internals/BulkDeleteOperation.cs:line 34
   at Xtensive.Orm.BulkOperations.Operation`1.Execute() in /_/Extensions/Xtensive.Orm.BulkOperations/Internals/Operation.cs:line 40
   at Xtensive.Orm.BulkOperations.BulkExtensions.Delete[T](IQueryable`1 query) in /_/Extensions/Xtensive.Orm.BulkOperations/BulkExtensions.cs:line 26
   at Xtensive.Orm.BulkOperations.Tests.Issues.DeleteHeterogeneousEntitySet.<>c__DisplayClass0_1.<DeleteTest>b__0() in /_/Extensions/Xtensive.Orm.BulkOperations.Tests/Issues/DeleteHeterogeneousEntitySet.cs:line 74
--- End of stack trace from previous location ---
   at NUnit.Framework.Internal.ExceptionHelper.Rethrow(Exception exception)
   at NUnit.Framework.Internal.Reflect.DynamicInvokeWithTransparentExceptions(Delegate delegate)
   at NUnit.Framework.Internal.ExceptionHelper.RecordException(Delegate parameterlessDelegate, String parameterName)>

This is the unit test to reproduce the issue:

using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using NUnit.Framework;
using Xtensive.Orm.BulkOperations.Tests.Issues.Model;
using Xtensive.Orm.Model;

namespace Xtensive.Orm.BulkOperations.Tests.Issues.Model
{
  [HierarchyRoot(InheritanceSchema.Default)]
  public abstract class ServiceLevel : AbstractBaseForServiceLevel
  {
    [Field, Key]
    public int Id { get; set; }

    [Field]
    public ServiceGroup Owner { get; set; }
  }

  public class ServiceLevelOne : ServiceLevel
  {
    [Field]
    public string FieldOne { get; set; }
  }

  public class ServiceLevelTwo : ServiceLevel
  {
    [Field]
    public string FieldTwo { get; set; }
  }

  [HierarchyRoot]
  public class ServiceGroup : Entity
  {
    [Field, Key]
    public int Id { get; set; }

    [Field]
    [Association(PairTo = "Owner")]
    public EntitySet<ServiceLevel> Services { get; set; }
  }

  public abstract class AbstractBaseForServiceLevel : Entity
  {
    [Field]
    public bool Active { get; set; }
  }
}

namespace Xtensive.Orm.BulkOperations.Tests.Issues
{
  internal class DeleteHeterogeneousEntitySet : BulkOperationBaseTest
  {
    [Test]
    public void DeleteTest()
    {
      int key;
      using (var session = Domain.OpenSession())
        using (session.Activate())
          using (var transaction = session.OpenTransaction()) {
            var owner = new ServiceGroup();
            key = owner.Id;
            _ = new ServiceLevelOne { Active = false, Owner = owner, FieldOne = "x" };
            _ = new ServiceLevelTwo { Active = false, Owner = owner, FieldTwo = "y" };
            transaction.Complete();
          }

      using (var session = Domain.OpenSession())
        using (var transaction = session.OpenTransaction()) {
          Assert.DoesNotThrow(
            () => session.Query.Single<ServiceGroup>(key).Services.Delete());
        }
    }
  }
}
alex-kulakov commented 1 year ago

Hello @ondrejtucny.

Currently, Bulk operations does not support hierarchies that contains more than one type represented in from of table. This is mostly because DELETE statement is for content of one table and this extension keeps it very simple. I believe there are clever ways to perform such operations, but it will no longer be 1 statement and if it is not than it can be performed partially due to some errors.

Then, it is time-consuming task to develop. Say we have three tables which correspond with entities: a Root, a Middle and a Leaf, where the Root is a hierarchy root entity, the Middle is direct ancestor of the Root and Leaf is direct ancestor of the Middle. Like so...

[HierarchyRoot]
public class Root : Entity
{
  [Field, Key]
  public long Id { get; private set; }

  [Field]
  public DateTime CreationDate { get; set; }

  [Field]
  public string Name { get; set; }
}

public class Middle : Root
{
  [Field]
  public string Description { get; set; }

  [Field]
  public double SomeCoefficient {get; set; } 
}

public class Leaf : Middle
{
  [Field]
  public int PriorityIndex{ get; set; }

  [Field]
  public string PriorityIndexName { get; set; }
}

Depending on the Hierarchy's inheritance schema, fields will be spread differently. Lets take the default inheritance schema - ClassTable, and see what happens. In this mode corresponding tables contains fields of PR and the fields that was declared in the type the table represents.

Say the query to delete rows looks like session.Query.All<Root>().Where(r => r.CreationDate >= DateTime.UtcNow.AddDays(-5)).Delete(); Base query includes all roots, middles, and leaves, so the query in reality would affect three tables. We should also delete rows from Leaf then from Middle and then from Root. So, as I see it, we would need to select all the primary keys that were going to be deleted from Root because the field is in Root table. Then we would need to pass this list of PKs to DELETE statement for Leaf table, then one for the Middle table and finally delete them from Root table. This would be 4 queries.

In general case we would work with thousands or millions of rows so possibly we would need a temporary buffer for millions of PK values. Temporary tables might work as a buffer storage but not all of RDBMSs have this feature, so we would need to load PKs to a local collection or deny the operation.

Ok, what about transactions? this is another condition we need to take into account. What if another transaction changed the CreationDate of a row that out of our selection to a value that fits the condition in the query above between SELECT and final DELETE statement? A row within our selection of PKs could also change CreationDate value and became out of selection so we would false-delete data. What if a row with matching value were inserted?

This is only three entities in linear hierarchy structure, and it raised so many problems to solve, some of them may be unsolvable, tree-like hierarchies will make it even more complicated.

This is a lot of time for development. Unfortunately, I have limited time to spend on tasks, so this is not my priority right now to make BulkOperations support multi-table operations.