datacite / lupo

Build task to clean up GBIF OpenSearch Data #1208

Closed wendelfabianchinsamy closed 4 months ago

wendelfabianchinsamy commented 4 months ago

Build a Rails task that deletes all GBIF events from the OpenSearch events index that meet all of these conditions:

  1. The subj.registrantId is "datacite.gbif.gbif"
  2. The relation_type_id is "references"
  3. The source_doi is not one of the following DOIs: 10.15468/QJGWBA, 10.35035/GDWQ-3V93, 10.15469/3XSWXB, 10.15469/UBP6QO, 10.35000/TEDB-QD70, 10.15469/2YMQOZ
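
As a rough sketch, the three conditions above could be built as a query body in Ruby before being handed to whatever client the task ends up using. The constant `EXCLUDED_SOURCE_DOIS` and the helper `gbif_cleanup_query` are hypothetical names chosen for illustration; they are not taken from the lupo codebase.

```ruby
require "json"

# DOIs whose events must be kept (copied from condition 3 above).
EXCLUDED_SOURCE_DOIS = %w[
  10.15468/QJGWBA
  10.35035/GDWQ-3V93
  10.15469/3XSWXB
  10.15469/UBP6QO
  10.35000/TEDB-QD70
  10.15469/2YMQOZ
].freeze

# Hypothetical helper: returns the query body as a plain Hash.
# Both `must` clauses have to match; any event whose source_doi is
# in the excluded list is filtered out by `must_not`.
def gbif_cleanup_query(excluded_dois = EXCLUDED_SOURCE_DOIS)
  {
    query: {
      bool: {
        must: [
          { match: { "subj.registrantId" => "datacite.gbif.gbif" } },
          { match: { "relation_type_id" => "references" } },
        ],
        must_not: [
          { terms: { "source_doi" => excluded_dois } },
        ],
      },
    },
  }
end

# Print the body as JSON so it can be checked by hand before any delete runs.
puts JSON.pretty_generate(gbif_cleanup_query)
```

Running the same body as a search first and comparing the hit count with expectations is a cheap safety check before deleting anything. Whether the `terms` clause matches the DOIs exactly as written depends on how `source_doi` is mapped in the index, so that is worth confirming as well.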

Below is the query for fetching all the records that will need to be deleted.

  "query": {
    "bool": {
        "must": [
            {
                "match": {
                    "subj.registrantId": "datacite.gbif.gbif"
                }
            },
            {
                "match": {
                    "relation_type_id": "references"
                }
            }
        ],
        "must_not": [
            {
                "terms": {
                    "source_doi": [