silverstripe / silverstripe-framework

Silverstripe Framework, the MVC framework that powers Silverstripe CMS
https://www.silverstripe.org
BSD 3-Clause "New" or "Revised" License
719 stars 820 forks source link

cascade_delete deletes end of relationship rather than join record #9612

Open chillu opened 3 years ago

chillu commented 3 years ago

The docs claim that cascade_deletes supports all types of relationships (incl many_many). But in the case of many_many through, it deletes the end of that relationship rather than the join record. In the vast majority of cases, this won't be the correct behaviour.

I'd suggest that we solve this by a new ManyManyThroughList->getJoins() method which returns all the join DataObject records. And then in cascade_deletes, we explicitly define MyManyMany.Joins.

See failing unit test below.

From 14d22c64ad4559b6e0a142a1ace3c6f83a1fb8bf Mon Sep 17 00:00:00 2001
From: Ingo Schommer <me@chillu.com>
Date: Fri, 24 Jul 2020 11:58:24 +1200
Subject: [PATCH] WIP cascade_delete on many_many through

---
 tests/php/ORM/CascadeDeleteTest.php               | 52 +++++++++++++++++++++++
 tests/php/ORM/CascadeDeleteTest/ChildObject.php   |  7 ++-
 tests/php/ORM/CascadeDeleteTest/ParentObject.php  | 11 ++++-
 tests/php/ORM/CascadeDeleteTest/ThroughObject.php | 20 +++++++++
 4 files changed, 88 insertions(+), 2 deletions(-)
 create mode 100644 tests/php/ORM/CascadeDeleteTest/ThroughObject.php

diff --git a/tests/php/ORM/CascadeDeleteTest.php b/tests/php/ORM/CascadeDeleteTest.php
index 2d16411ad..e29835671 100644
--- a/tests/php/ORM/CascadeDeleteTest.php
+++ b/tests/php/ORM/CascadeDeleteTest.php
@@ -3,6 +3,8 @@
 namespace SilverStripe\ORM\Tests;

 use SilverStripe\Dev\SapphireTest;
+use SilverStripe\ORM\Tests\CascadeDeleteTest\ChildObject;
+use SilverStripe\ORM\Tests\CascadeDeleteTest\ParentObject;

 /**
  * Test cascade delete objects
@@ -16,6 +18,7 @@ class CascadeDeleteTest extends SapphireTest
         CascadeDeleteTest\ChildObject::class,
         CascadeDeleteTest\GrandChildObject::class,
         CascadeDeleteTest\RelatedObject::class,
+        CascadeDeleteTest\ThroughObject::class,
     ];

     public function testFindCascadeDeletes()
@@ -123,4 +126,53 @@ class CascadeDeleteTest extends SapphireTest
             $child2->Children()
         );
     }
+
+    public function testFindCascadeDeletesThroughRelationship()
+    {
+        $parent = new ParentObject();
+        $parent->write();
+
+        $childOnThroughRelation = new ChildObject(['Title' => 'on through']);
+        $childOnThroughRelation->write();
+        $parent->ThroughChildren()->add($childOnThroughRelation);
+
+        $childOnHasManyRelation = new ChildObject(['Title' => 'on has_many']);
+        $childOnHasManyRelation->write();
+        $parent->Children()->add($childOnHasManyRelation);
+
+        $deletes = $parent->findCascadeDeletes(true);
+
+        $this->assertListContains(
+            [
+                [
+                    'ID' => $childOnHasManyRelation->ID,
+                    'ClassName' => $childOnHasManyRelation->ClassName
+                ],
+            ],
+            $deletes,
+            'Contains has_many relations'
+        );
+
+        $this->assertListContains(
+            [
+                [
+                    'ID' => $childOnThroughRelation->getJoin()->ID,
+                    'ClassName' => $childOnThroughRelation->getJoin()->ClassName
+                ],
+            ],
+            $deletes,
+            'Contains many_many through object'
+        );
+
+        $this->assertListNotContains(
+            [
+                [
+                    'ID' => $childOnThroughRelation->ID,
+                    'ClassName' => $childOnThroughRelation->ClassName
+                ],
+            ],
+            $deletes,
+            'Does not contain actual many_many object'
+        );
+    }
 }
diff --git a/tests/php/ORM/CascadeDeleteTest/ChildObject.php b/tests/php/ORM/CascadeDeleteTest/ChildObject.php
index b4e899c7e..2ec4f93e1 100644
--- a/tests/php/ORM/CascadeDeleteTest/ChildObject.php
+++ b/tests/php/ORM/CascadeDeleteTest/ChildObject.php
@@ -20,7 +20,8 @@ class ChildObject extends DataObject implements TestOnly
     ];

     private static $cascade_deletes = [
-        'Children'
+        'Children',
+        'Parents',
     ];

     private static $has_one = [
@@ -31,4 +32,8 @@ class ChildObject extends DataObject implements TestOnly
     private static $many_many = [
         'Children' => GrandChildObject::class,
     ];
+
+    private static $belongs_many_many = [
+        'Parents' => ParentObject::class . '.ThroughChildren'
+    ];
 }
diff --git a/tests/php/ORM/CascadeDeleteTest/ParentObject.php b/tests/php/ORM/CascadeDeleteTest/ParentObject.php
index 922318b92..62f7dc5e4 100644
--- a/tests/php/ORM/CascadeDeleteTest/ParentObject.php
+++ b/tests/php/ORM/CascadeDeleteTest/ParentObject.php
@@ -14,10 +14,19 @@ class ParentObject extends DataObject implements TestOnly
     ];

     private static $cascade_deletes = [
-        'Children'
+        'Children',
+        'ThroughChildren',
     ];

     private static $has_many = [
         'Children' => ChildObject::class,
     ];
+
+    private static $many_many = [
+        'ThroughChildren' => [
+            'through' => ThroughObject::class,
+            'from' => 'Parent',
+            'to' => 'Child'
+        ],
+    ];
 }
diff --git a/tests/php/ORM/CascadeDeleteTest/ThroughObject.php b/tests/php/ORM/CascadeDeleteTest/ThroughObject.php
new file mode 100644
index 000000000..58890378e
--- /dev/null
+++ b/tests/php/ORM/CascadeDeleteTest/ThroughObject.php
@@ -0,0 +1,20 @@
+<?php
+
+namespace SilverStripe\ORM\Tests\CascadeDeleteTest;
+
+use SilverStripe\Dev\TestOnly;
+use SilverStripe\ORM\DataObject;
+
+class ThroughObject extends DataObject implements TestOnly
+{
+    private static $table_name = 'CascadeDeleteTest_ThroughObject';
+
+    private static $db = [
+        'Title' => 'Varchar',
+    ];
+
+    private static $has_one = [
+        'Parent' => ParentObject::class,
+        'Child' => ChildObject::class,
+    ];
+}
-- 
2.14.3

PRs

robbieaverill commented 3 years ago

Is this a dup of https://github.com/silverstripe/silverstripe-versioned/issues/200?

sminnee commented 3 years ago

Without any core changes you should be able to create a 2nd has_many relationship on your data object that references the through-object instead of the foreign-object. You can then specify cascade_deletes on that relationship instead.

I'd suggest doing this in the short-term because I feel like we're treading on dangerous territory and wouldn't want to squeeze a quick PR in.

chillu commented 3 years ago

Is this a dup of silverstripe/silverstripe-versioned#200?

It's definitely related, but I'd say this bug is more focused in scope. It only concerns itself with the ORM functionality, not Versioned or GridField.

Without any core changes you should be able to create a 2nd has_many relationship on your data object that references the through-object instead of the foreign-object

I'm not smart enough to think through the ramnifications across the entire ORM surface around adding a pseudo-has_many relationship for this ;) I could do this as a custom getter to avoid any ORM interference though.

Anyway, it doesn't change the fact that this is a bug, right? We could clarify this workaround in the docs as a "known limitation" of course.

sminnee commented 3 years ago

There are no bugs or features in this land, just the code's implications :-P

The current implementation matches what happens in many-many land AFAIK.

I agree it's confusing and potentially dangerous, but not necessarily a bug. Or more, simply "fixing" it isn't necessarily going to cause other problems.

kinglozzer commented 3 years ago

One possible concern here: given that the "through" record is now a DataObject in its own right, can we say with absolute certainty that deleting it automatically is safe? I can’t think of a possible use case where you’d have multiple relations running through the same join record, but I like to throw spanners at things

Would/could it be possible to delete the join records, but not the end record?

sminnee commented 3 years ago

This is why cascade_deletes needs to let developers configure these things.

The question is how to distinguish a relation that returns the foreign objects from a relation that returns the intermediary objects.

I'm not convinced that "(relname).Join" is the most intuitive way to refer to the intermediary relationship, as it makes it seem "further" than the foreign objects when it's actually "closer".

If, for example, you were to traverse a list of the join objects you'd probably not link to the foreign object at all. It would behave exactly like a has-many, sense my suggestion to Ingo.

On option would be to create a pair of relation methods for a manymany-through: "(relname)" and "(relname)Join". There a little bit of risk in such a string concatenation that someone overrides that relation name, but I think that's the kind of edge case that people use as a feature sometimes, so it's okay.

Not that I'm not suggesting a mere hack for cascade_deletes. You could, for example, Chuck "(relname)Join" into a gridfield. I'm not sure what the purpose of that would be - my motivation is more to keep a consistent API.

The other thing you could do is simply force users to manually create a has_many. Ingo was worried about side-effects of this, but I'm not - it's in keeping with the ORM's design to do that.

chillu commented 3 years ago

can we say with absolute certainty that deleting it automatically is safe?

To be clear, I'm not proposing that we change the current API or behaviour, even though I think it's not accounting for the most common scenario (deleting the join record rather than the other end of the relationship).

I'm not convinced that "(relname).Join" is the most intuitive way to refer to the intermediary relationship, as it makes it seem "further" than the foreign objects when it's actually "closer".

We've got (object)->Join() already, I thought (object)->(relname)->Joins() would be consistent? Agree that it appears "further away" than (object)->(relname)Joins(), but it has the advantage of keeping the logic contained on ManyManyThroughList rather than packing yet another thing into our DataObject god class.

The other thing you could do is simply force users to manually create a has_many. Ingo was worried about side-effects of this, but I'm not - it's in keeping with the ORM's design to do that.

OK, I'll let you write the PR for this if you want to push for it as a solution ;) Lazy loading, caching, introspection, schema changes, ownership, snapshots, cascade_duplicates - it's just too much to think through for me, sorry. I'd also argue that it's counter-intuitive to developers - in 99% of cases, you'll want to ensure that join records are cleaned up when one side of the relationship is deleted, and this sounds like a lot of hoops to jump through to achieve that.

Option 1: Built-in (object)->(relname)Joins() as pseudo has_many on cascade_deletes

Use the ORM's built in logic to delete the through objects, with auto-mapped relationships. Wouldn't work on many_many without through.

<?php
class ParentObject extends DataObject
{
    private static $many_many = [
        'Items' => [
            'through' => ThroughObject::class,
            'from' => 'Parent',
            'to' => 'Child',
        ]
    ];

    private static $cascade_deletes = [
        'ItemsJoins'
    ];
}

Option 2: Custom pseudo has_many on cascade_deletes

Like Option 1, but leave it to devs to define this.

<?php
class ParentObject extends DataObject
{
    private static $many_many = [
        'Items' => [
            'through' => ThroughObject::class,
            'from' => 'Parent',
            'to' => 'Child',
        ]
    ];

    private static $has_many = [
        'ThroughObjects' => ThroughObject::class
    ];

    private static $cascade_deletes = [
        'ThroughObjects'
    ];

}

Option 3: (object)->relname()->Joins() on cascade_deletes

Create new logic on the ManyManyThroughList. Wouldn't work on many_many without through.

<?php
class ParentObject extends DataObject
{
    private static $many_many = [
        'Items' => [
            'through' => ThroughObject::class,
            'from' => 'Parent',
            'to' => 'Child',
        ]
    ];

    private static $cascade_deletes = [
        'Items.Joins'
    ];

}

Option 4: New onAfterDelete behaviour in core

Do what Damian suggested on a similar discussion around orphaned many_many relationships, and simply move this logic into DataObject->onAfterDelete(). This would change existing logic, and even though it'd be what devs expect 99% of the time, we'd need to make this opt-in for existing projects. Has the advantage of working on many_many as well.

<?php
class ParentObject extends DataObject
{
    private static $many_many = [
        'Items' => [
            'through' => ThroughObject::class,
            'from' => 'Parent',
            'to' => 'Child',
        ]
    ];

    // no cascade_deletes, feature is built-into onAfterWrite()
}

Option 5: Custom getter on cascade_deletes

Rather than as has_many, you could also just write custom code around MyThroughObject::get()->filter(<foreign-keys>), which is how I've solved this right now in my use case.

<?php
class ParentObject extends DataObject
{
    private static $many_many = [
        'Items' => [
            'through' => ThroughObject::class,
            'from' => 'Parent',
            'to' => 'Child',
        ]
    ];

    private static $cascade_deletes = [
        'ThroughObjects'
    ];

    public function ThroughObjects()
    {
        return ThroughObject::get()->filter('ParentID', $this->ID);
    }
}
kinglozzer commented 3 years ago

Somewhat related to this, I just found today that Option 2 is currently required for versioned many-many-through to work properly. Without the extra has_many and cascade_deletes, it’s impossible to “unlink” a many-many-through relation in the live stage. E.g. document -> assignedtag -> tag (with all 3 models versioned):

More info on why that is the case: https://github.com/silverstripe/silverstripe-versioned/issues/116#issuecomment-367130601

tractorcow commented 3 years ago

I came here to add to the conversation, but @sminnee already said everything I was about to say, to +1 to his evaluation.

GuySartorelli commented 1 month ago

I'd say at a minimum, the current behaviour needs to be clearly documented which I don't think it currently is. I don't think we can make any functional changes outside of a major release.

Going forward though, it's worth looking at our options.

The intuitive thing for me (and this could be done in a major release) would be that $cascade_deletes works the same way for all relations (i.e. it deletes the record at the end of the relationship), and that possibly a new separate config is used (let's call it $cascade_unlink_relation for conversation's sake) to cascade delete the relation.

There should be no reason for this configuration to differentiate between a basic many_many relation and a many_many through relation.