Closed SeedDMS closed 11 months ago
Thanks for the report! The empty deleted dedup records are expected. They ensure that when RecordManager is used to update a Solr index, any dedup records marked deleted are also deleted from the index. For the same reason, deleted records are kept in RecordManager's database for a while. The default retention period is 14 days, but you can use the days-to-keep parameter with purge-deleted to control it.
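To illustrate why a purge run right after marking records deleted can leave the tables untouched, here is a minimal Python sketch of a retention check. It is not RecordManager's code: the function name, the example timestamps and the exact cutoff comparison are assumptions made for illustration only. The only facts taken from the comment above are the 14-day default and the existence of a days-to-keep setting.

```python
from datetime import datetime, timedelta

# Default retention period mentioned in the comment above.
DEFAULT_DAYS_TO_KEEP = 14


def is_purgeable(deleted_at, now, days_to_keep=DEFAULT_DAYS_TO_KEEP):
    """Hypothetical check: a record marked deleted becomes purgeable only
    once it has been deleted for at least `days_to_keep` days.
    The inclusive comparison is an assumption, not the tool's documented rule."""
    return now - deleted_at >= timedelta(days=days_to_keep)


# Fixed example timestamps so the sketch is deterministic.
now = datetime(2024, 1, 15, 12, 0)
just_deleted = now - timedelta(minutes=5)
deleted_long_ago = now - timedelta(days=20)

# A record deleted minutes ago is still inside the retention window.
print(is_purgeable(just_deleted, now))
# A record deleted 20 days ago is past the 14-day window.
print(is_purgeable(deleted_long_ago, now))
# With a retention of zero days, even a fresh deletion qualifies.
print(is_purgeable(just_deleted, now, days_to_keep=0))
```

Under these assumptions, records that were marked deleted moments before the purge would all fall inside the retention window, which would match the observation that the purge appeared to do nothing.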
Records not getting deduplicated when you re-imported DS1.xml is not expected, however, so I'll need to investigate that. I'll update when I have more information. In a pinch you could use ./console records:deduplicate --all --source=DS1 to force deduplication.
Thanks for the ./console records:deduplicate --all --source=DS1 hint. That fixes the deduplication problem in my simplified example. It may even help with my original problem, where 20 data sources were imported. The initial import and deduplication always worked, but updating some of the sources led to more and more duplicates. So, I'll try to force deduplication.
I've committed a fix for the example case. If your update procedure involved marking all deleted and then loading a new file, this should fix the case as well. However, if you're seeing trouble getting newly added records to be deduplicated, there must be something else amiss.
Just a final note. We are also running RecordManager version 1.9 and we discovered the same behaviour. Backporting the commit into Base/Controller/StoreRecordTrait.php seems to have fixed it as well.
We had some problems with deduplication after updating a data source. There was just no deduplication for those records anymore. I tried to boil it down to a simple problem with two data sources.
At this point everything is fine. The table dedup contains 5336 records. Next I run […]. This marks all records in record and dedup as deleted. The field ids in the dedup record is also cleared, and there is no reference to the dedup record in record.dedup_id anymore. That looks OK as well. Then I try to actually purge the deleted records […], which doesn't do anything. I would expect the records formerly marked as deleted to be deleted, but both tables dedup and record remain unchanged. That doesn't appear to be a real problem, so I import DS1 again […], and all the records in table record that were formerly marked as deleted aren't marked as deleted anymore. So I tried a […] again, but that doesn't do anything. It doesn't even try to deduplicate. What went wrong and, secondly, why have those records in table dedup never been deleted? They are basically empty shells, marked as deleted and not referencing a record. I'm using MySQL.