Open StCyr opened 6 months ago
Hi @StCyr
Do you know what version of roadmap you are using where you encountered this bug?
Or if you are not sure of the version, do you know roughly when you did the last fork / took the last update from the DMPRoadmap code base?
Don DMPonline Developer
DMP Assistant last had an occurence of this bug on May 8, 2024, which included DMPRoadmap v4.0.2
:
The following queries were performed on a recent recent data dump from DMP Assistant's production environment:
SELECT COUNT(id)
FROM answers;
---------------------------
67327
*************************************************************
SELECT COUNT(*)
FROM answers
JOIN (
SELECT plan_id, question_id
FROM answers
GROUP BY plan_id, question_id
HAVING COUNT(*) > 1
) AS duplicates
ON answers.plan_id = duplicates.plan_id AND answers.question_id = duplicates.question_id;
----------------------------
22
So there are 11 (22/2) entries with "duplicates" within the answers table. So only ~0.00016% (11/67327) of the answers entries are affected.
This table shows how close the created_at
values are between the answers entries and their duplicates (max = 0.014105 seconds).
id | duplicate_id | plan_id | question_id | created_at | duplicate_created_at | time_difference_seconds |
---|---|---|---|---|---|---|
26035 | 26036 | 6914 | 67418 | 2021-04-02 19:54:36 | 2021-04-02 19:54:36 | 0 |
34772 | 34773 | 8424 | 68834 | 2021-12-17 23:01:46 | 2021-12-17 23:01:46 | 0 |
41114 | 41115 | 9347 | 69714 | 2022-06-04 02:52:42 | 2022-06-04 02:52:42 | 0 |
41124 | 41125 | 9347 | 69723 | 2022-06-04 04:46:48 | 2022-06-04 04:46:48 | 0 |
49172 | 49173 | 10912 | 68316 | 2023-03-22 15:26:56.835211 | 2023-03-22 15:26:56.842863 | 0.007652 |
49711 | 49712 | 11005 | 74820 | 2023-04-03 07:10:32.197305 | 2023-04-03 07:10:32.205013 | 0.007708 |
52035 | 52036 | 11366 | 68127 | 2023-05-23 19:21:09.952598 | 2023-05-23 19:21:09.964106 | 0.011508 |
54484 | 54485 | 11785 | 72244 | 2023-08-21 14:01:49.753142 | 2023-08-21 14:01:49.767247 | 0.014105 |
55605 | 55606 | 12007 | 69720 | 2023-09-20 00:20:59.242693 | 2023-09-20 00:20:59.24843 | 0.005737 |
60793 | 60794 | 12632 | 75680 | 2023-11-27 03:02:59.964372 | 2023-11-27 03:02:59.970709 | 0.006337 |
67389 | 67390 | 13558 | 72638 | 2024-05-08 23:18:26.696402 | 2024-05-08 23:18:26.700717 | 0.004315 |
We can see there is a very small percentage of duplicates, and the created_at
difference is very small between the answers and their corresponding duplicates. Given all of this, it seems quite possible that we're sometimes encountering a race condition which causes the question uniqueness validation check to sometimes fail. When two records are being saved nearly simultaneously, the validation check might pass for both before either is committed to the database, leading to the insertion of duplicate entries.
This seems even more likely when we factor in the 'auto-save' feature being used for answers. If a manual save was executed and an auto-save was triggered at nearly the same moment, then the aforementioned scenario could occur.
Perhaps a migration should be created to enforce this uniqueness constraint at the db level?
somehow one of our user succeeded to create 2 answers for a single question in a plan.
this breaks the plan's copy function:
After some investigation, we've found the offending entries:
deleting one of them solved the issue:
That would be good to put a unicity constraint on the (plan_id,question_id) tulpe on the answers table