Center-for-Research-Libraries / vufind

CRL implementation of the VuFind frontend for FOLIO. A library resource discovery portal designed and developed for libraries by libraries.
GNU General Public License v2.0

Codify/verify process for deletions and suppressions #35

Open ryan-jacobs opened 2 years ago

ryan-jacobs commented 2 years ago

Deletions and suppressions must be handled in a special way to be processed correctly into VuFind. We need to:

mmabrahamson commented 2 years ago

You'll want to make sure you update oai.ini with a find/replace setting that will remove the OAI prefix from your records. For some reason, this doesn't cause issues with import, but can cause deletion to fail. An example would be something like:

idSearch[] = "/^oai:domain.folio.ebsco.com:fs00001111\//"
idReplace[] = ""
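As a rough illustration of what that find/replace pair does to a record identifier, here is a minimal Python sketch. The pattern is copied from the example above with the PHP regex delimiters dropped. The function name and the sample identifier are made up for illustration, and this is not a claim about how VuFind's harvester applies the setting internally.

```python
import re

# Pattern from the oai.ini example above, without the surrounding "/" delimiters.
# The tenant string is the placeholder from that example, not a real tenant.
ID_SEARCH = r"^oai:domain.folio.ebsco.com:fs00001111/"
ID_REPLACE = ""


def strip_oai_prefix(identifier: str) -> str:
    """Remove the OAI prefix from a harvested identifier, if present."""
    return re.sub(ID_SEARCH, ID_REPLACE, identifier)


if __name__ == "__main__":
    # Hypothetical identifier, for illustration only.
    print(strip_oai_prefix("oai:domain.folio.ebsco.com:fs00001111/1234-abcd"))
```

An identifier that does not start with the prefix is returned unchanged, so the rule is safe to leave in place for records that were already stored under the bare ID.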

In addition, you're right, reharvesting every now and then (usually a reharvest once a semester) is just a good idea to baseline things and make sure that nothing was missed through another process.

ryan-jacobs commented 2 years ago

I've implemented our tenant info in the idSearch[] of a dev environment. I'll test a couple new harvests before moving this to prod.

Note that this info is in our private oai.ini file, so there should not be any VC impacts here.

ryan-jacobs commented 2 years ago

Using the suppressed flag on a bib record will trigger a delete request through the OAI-PMH harvest, which will allow a direct test of the deletion process in VuFind. If the test is successful, the record can be restored by unchecking the suppressed flag again.
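One way to confirm whether a harvested response actually carries a delete request is to look for record headers marked status="deleted", which is how the OAI-PMH protocol flags deletions. The sketch below is only a minimal check under that assumption. The function name and the sample XML (including its identifiers) are hypothetical and not taken from our tenant.

```python
import xml.etree.ElementTree as ET
from typing import List

OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def deleted_identifiers(oai_xml: str) -> List[str]:
    """Return identifiers of records whose header has status="deleted"."""
    root = ET.fromstring(oai_xml)
    ids = []
    for header in root.iter(f"{{{OAI_NS}}}header"):
        if header.get("status") == "deleted":
            ident = header.find(f"{{{OAI_NS}}}identifier")
            if ident is not None and ident.text:
                ids.append(ident.text.strip())
    return ids


# Hypothetical response fragment, for illustration only.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header status="deleted">
        <identifier>oai:example.org:tenant/rec-1</identifier>
        <datestamp>2022-09-01T00:00:00Z</datestamp>
      </header>
    </record>
    <record>
      <header>
        <identifier>oai:example.org:tenant/rec-2</identifier>
        <datestamp>2022-09-01T00:00:00Z</datestamp>
      </header>
      <metadata/>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(deleted_identifiers(SAMPLE))
```

If this returns an empty list for a response harvested right after a record was suppressed, then no delete request was sent for that record.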

AndyElliottCRL commented 2 years ago

Here are DLIR titles that can be abused at will: DLIR_excludes.xlsx

ryan-jacobs commented 2 years ago

Steve indicated that there are some DLIR suppressed records that are still not being removed from discovery. This indicates that there is still something missing in the harvest deletion processing.

ryan-jacobs commented 2 years ago

@mmabrahamson, we have done some additional testing and I'm not seeing any signs that FOLIO is sending over any information when a record is suppressed. Suppressing (and unsuppressing) a record does trigger a specific record update over OAI-PMH but it's always the same as any other update. I never see any .delete records nor do I see any form of "discovery" flag on the record sent.

I've tried different variations of the following FOLIO settings, but still do not see any changes to this OAI-PMH behavior:

Image

We value any insights you may be able to provide on the FOLIO handling that's involved here.

mmabrahamson commented 2 years ago

I'm not sure why you're not seeing a delete when you skip suppressed items entirely and then suppress an item and do an incremental update. It should count it as a delete from my experience.

If you're using the Transfer with flag, then what you should see in VuFind for a suppressed bib/instance is a 999t1 value. Unsuppressed bib/instance will have a 999t0.

Assuming that you do see the 999t appearing correctly, you'll need to do a couple of other things.

In marc_local.properties, add this string:

suppressed_str = 999t

In local/config/vufind/searches.ini, you'll need to add this line under the Raw Hidden Filters heading:

[RawHiddenFilters]
0 = "-suppressed_str:1"

This was based on advice I got from the core VuFind team to handle this specific issue.

Unfortunately, to make this work you'll need to do a full reimport once the changes are in place, so that everything is reindexed.
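Before doing the full reimport, it may help to spot-check that the 999t value is really present in harvested records. The following is a minimal sketch that reads the first 999 $t from a MARCXML record using only the Python standard library. The function names and the sample records are hypothetical, and the sketch deliberately ignores the indicators on the 999 field, which is an assumption that may need tightening for real data.

```python
import xml.etree.ElementTree as ET
from typing import Optional

MARC_NS = "http://www.loc.gov/MARC21/slim"


def suppressed_flag(marcxml: str) -> Optional[str]:
    """Return the first 999 $t value in a MARCXML record, or None if absent."""
    root = ET.fromstring(marcxml)
    for field in root.iter(f"{{{MARC_NS}}}datafield"):
        if field.get("tag") != "999":
            continue
        for sub in field.findall(f"{{{MARC_NS}}}subfield"):
            if sub.get("code") == "t":
                return (sub.text or "").strip()
    return None


def is_suppressed(marcxml: str) -> bool:
    """True only when 999 $t is exactly "1"."""
    return suppressed_flag(marcxml) == "1"


def sample_record(flag: str) -> str:
    """Build a hypothetical MARCXML record with the given 999 $t value."""
    return (
        f'<record xmlns="{MARC_NS}">'
        '<datafield tag="245" ind1="0" ind2="0">'
        '<subfield code="a">Example title</subfield></datafield>'
        '<datafield tag="999" ind1="f" ind2="f">'
        f'<subfield code="t">{flag}</subfield></datafield>'
        "</record>"
    )


if __name__ == "__main__":
    print(is_suppressed(sample_record("1")), is_suppressed(sample_record("0")))
```

A record with no 999 $t at all yields None from suppressed_flag, which the hidden filter above would treat the same as an unsuppressed record.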

As a separate note, if you're never seeing .delete files coming from OAI, we may need to check with Kyle about it. I'm not sure why they wouldn't be coming across, but can look into it more.

ryan-jacobs commented 2 years ago

Thanks @mmabrahamson, I'm going to run a few more tests.

It sounds like you chose to use the "transfer records with discovery flag" option for at least one project case. Assuming there is a way to get both options working for suppression handling (and it sounds like there should be) what was the deciding factor to go in that direction as opposed to requesting .delete records?

ryan-jacobs commented 2 years ago

Ok, so using the "Transfer Suppressed Records with Discovery Flag" setting in FOLIO I'm now able to get a 999t value to come over consistently, so that's progress.

When using the "Skip suppressed from discovery records" setting in FOLIO I continue to see no .delete records generated when flipping a record to suppressed. It seems to me that this setting is actually pretty literal in that it just skips over any suppressed records when generating OAI-PMH streams (making it invisible to OAI-PMH). So it will happily keep a suppressed record out of discovery, but as soon as a record gets into the VuFind index, there's no way to get it out based on a change in the suppressed property.

So unless I'm still missing something about that "Skip suppressed from discovery records" setting, or there is something unique about the state of our tenant, it seems like our only option is to leverage that 999t value. @mmabrahamson, thanks for your additional comments on that approach. So it looks like we need to continue indexing those suppressed records in VuFind, but that searches.ini change can keep anything with a 999t of 1 out of all results. I think the only thing missing from your hints is the addition of a Solr field definition for the new suppressed_str field.
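To make the difference between the two settings concrete, here is a toy Python model of the behavior described above. It is not FOLIO or VuFind code: the "skip" and "flag" mode names, the dict-based index, and the record shapes are all invented for illustration, and the model simply encodes what was observed in testing.

```python
def harvest(index: dict, changed: list, mode: str) -> dict:
    """Apply one incremental harvest of changed records to a toy index."""
    for rec in changed:
        if mode == "skip":
            if rec["suppressed"]:
                # Suppressed records are left out of the stream entirely,
                # so the index never learns that anything changed.
                continue
            index[rec["id"]] = {"id": rec["id"]}
        elif mode == "flag":
            # The record is always sent, carrying the 999t value.
            flag = "1" if rec["suppressed"] else "0"
            index[rec["id"]] = {"id": rec["id"], "suppressed_str": flag}
        else:
            raise ValueError(f"unknown mode: {mode}")
    return index


def visible_ids(index: dict) -> list:
    """Emulate the hidden filter -suppressed_str:1 over the toy index."""
    return sorted(
        doc_id for doc_id, doc in index.items()
        if doc.get("suppressed_str") != "1"
    )


def run(mode: str) -> list:
    """Index record "a" unsuppressed, then suppress it and harvest again."""
    index = {}
    harvest(index, [{"id": "a", "suppressed": False}], mode)
    harvest(index, [{"id": "a", "suppressed": True}], mode)
    return visible_ids(index)


if __name__ == "__main__":
    print("skip:", run("skip"))
    print("flag:", run("flag"))
```

In the "skip" run the record stays visible after it is suppressed, while in the "flag" run the hidden filter removes it, which matches the reasoning for relying on the 999t value.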

ryan-jacobs commented 2 years ago

Ok, I think we have a pathway forward on this one now. I have a working proof that uses the 999t value in dev (and PR #157).

@AndyElliottCRL , you previously shared a long list (55+k) of DLIR entries that need to be suppressed, but I see that we only have about 12 inventory records suppressed in FOLIO currently (see screenshot). Now that we have a pathway toward a suppression solution on catalog.crl.edu does that trigger a data effort to update the suppressed property for these DLIR bib records?

[Screenshot: Screen Shot 2022-09-13 at 5 38 05 PM]

AndyElliottCRL commented 2 years ago

Right, all those should have been suppressed when first loaded, but apparently the MilCat suppressed flag didn't work in the migration. Anyway, we can't suppress that many records by hand, so it will probably have to be done with API calls, and we haven't built that yet. That goes under new issue # two from CRL_FOLIO_API. If the records that are already suppressed in FOLIO are properly not found in VuFind, then we can start work on suppressing the rest, but there are numerous things in the queue before it would be ready.

ryan-jacobs commented 2 years ago

Thanks @AndyElliottCRL. My understanding is that the final goal for those DLIR records may be full deletion. I'm still not sure if that's technically feasible though.

From a discovery perspective there should really be no difference, as both suppressions and deletions should not appear in any public interface. The VuFind mechanism for processing a suppression will likely be different to processing a deletion though, as it seems that OAI-PMH must encode both cases differently. Anyway, this issue should allow us to address both. I think we have things under control for VuFind suppression, but have not yet had any deletion cases to formally test.

ryan-jacobs commented 2 years ago

Recent merge addresses suppression needs, but let's leave this issue open as a place to still discuss and verify deletion processing.