osulp / Scholars-Archive

ScholarsArchive@OSU, institutional repository for Oregon State University
https://ir.library.oregonstate.edu/

Files deleted during review are not actually deleted #2572

Open carakey opened 7 months ago

carakey commented 7 months ago

Descriptive summary

Several problematic fileset objects were found after they failed recent fixity checks. They appear to be versions of files that were deleted and replaced during the review process, before approval. By searching on the depositor and comparing filenames and deposit dates, I found a current work that appears to be the original parent of each problem fileset. In each case the comment history suggests that changes were requested and made, and the work has at least one current, functional fileset. This indicates that the original file was supposed to have been deleted but still persists in the system.

It is not clear whether this is the same issue as the ghost filesets in #2530, since these fileset objects have more extensive metadata, including full characterization information. The extent of the problem is not yet known.

One thing that stands out in the metadata is that 12 out of the 14 fixity failures have a timestamp value of 2021-11-14 between 14:44 and 16:58.
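
To check how many failures cluster in that window, the comparison could be scripted roughly as follows. This is only a sketch: the issue does not name the metadata field that holds the timestamp, so the function simply takes ISO 8601 strings (assumed to carry no timezone offset), and `in_window` and `count_in_window` are hypothetical helper names.

```python
from datetime import datetime

# Window reported in this issue, at minute resolution.
WINDOW_START = datetime(2021, 11, 14, 14, 44)
WINDOW_END = datetime(2021, 11, 14, 16, 58)


def in_window(timestamp, start=WINDOW_START, end=WINDOW_END):
    """Return True if an ISO 8601 timestamp falls inside the window.

    Seconds are dropped before comparing, so 16:58:59 still counts as 16:58.
    Assumes timestamps without a timezone offset.
    """
    parsed = datetime.fromisoformat(timestamp).replace(second=0, microsecond=0)
    return start <= parsed <= end


def count_in_window(timestamps):
    """Count how many timestamps fall inside the window."""
    return sum(in_window(ts) for ts in timestamps)
```

Passing in the timestamps from a fixity report would give the number of failures inside the window, which can then be compared against the total.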

Expected behavior

Deleting a fileset object removes it entirely from the system.

Actual behavior

The filesets identified in February and March 2024 fixity check reports remain findable by ID in Solr and continue to be subject to fixity checking after presumed deletion.
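
A minimal sketch of how the "findable by ID in Solr" check could be scripted, using Solr's standard select handler and its JSON response (`response.numFound`). The core URL and the function names are hypothetical placeholders, not taken from the Scholars-Archive configuration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical Solr core URL; substitute the real one for the repository.
SOLR_SELECT_URL = "http://localhost:8983/solr/hypothetical_core/select"


def build_lookup_url(fileset_id, base_url=SOLR_SELECT_URL):
    """Build a Solr select URL that searches for a single document by ID."""
    params = {"q": f'id:"{fileset_id}"', "fl": "id", "rows": 1, "wt": "json"}
    return f"{base_url}?{urlencode(params)}"


def is_indexed(solr_response):
    """Return True if a parsed Solr response reports at least one match."""
    return solr_response["response"]["numFound"] > 0


def still_in_solr(fileset_id, base_url=SOLR_SELECT_URL):
    """Query Solr over HTTP and report whether the ID is still indexed."""
    with urlopen(build_lookup_url(fileset_id, base_url)) as resp:
        return is_indexed(json.load(resp))
```

Running `still_in_solr` over the IDs from a fixity failure report would show which of the presumably deleted filesets are still indexed. It only checks the Solr index, not whether the underlying object still exists in the repository store.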

Related work

#2571 - deleting these specific fileset objects from the fixity failure reports

#2530 - more fileset objects that refused to be deleted

carakey commented 5 months ago

Tested and did not replicate the problem:

Following this test, none of the three PIDs were found in Solr.