datacite / lupo

DataCite REST API
https://api.datacite.org
MIT License

Build task to clean up GBIF OpenSearch Data #1208

Closed wendelfabianchinsamy closed 4 months ago

wendelfabianchinsamy commented 4 months ago

Build a Rails task that deletes all GBIF events from the OpenSearch events index where:

  1. The subj.registrantId is "datacite.gbif.gbif"
  2. The relation_type_id is "references"
  3. The source_doi is not one of the following DOIs: 10.15468/QJGWBA, 10.35035/GDWQ-3V93, 10.15469/3XSWXB, 10.15469/UBP6QO, 10.35000/TEDB-QD70, 10.15469/2YMQOZ

The query below fetches all the records that need to be deleted.

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "subj.registrantId": "datacite.gbif.gbif"
          }
        },
        {
          "match": {
            "relation_type_id": "references"
          }
        }
      ],
      "must_not": [
        {
          "terms": {
            "source_doi": [
              "10.15468/QJGWBA",
              "10.35035/GDWQ-3V93",
              "10.15469/3XSWXB",
              "10.15469/UBP6QO",
              "10.35000/TEDB-QD70",
              "10.15469/2YMQOZ"
            ]
          }
        }
      ]
    }
  }
}
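
As a starting point, the query above could be turned into a delete along these lines. This is only a rough sketch, not the final task: the helper names (`gbif_cleanup_query`, `delete_gbif_events`), the `dry_run` flag and the index name `"events"` are assumptions for illustration, and `client` is assumed to be an OpenSearch client that responds to `count(index:, body:)` and `delete_by_query(index:, body:)` and returns hash-like responses.

```ruby
require "json"

# DOIs whose events must be kept (taken from the list in this issue).
EXCLUDED_SOURCE_DOIS = %w[
  10.15468/QJGWBA
  10.35035/GDWQ-3V93
  10.15469/3XSWXB
  10.15469/UBP6QO
  10.35000/TEDB-QD70
  10.15469/2YMQOZ
].freeze

# Builds the same query body as the JSON above, as a Ruby hash.
def gbif_cleanup_query(excluded_dois = EXCLUDED_SOURCE_DOIS)
  {
    query: {
      bool: {
        must: [
          { match: { "subj.registrantId" => "datacite.gbif.gbif" } },
          { match: { relation_type_id: "references" } }
        ],
        must_not: [
          { terms: { source_doi: excluded_dois } }
        ]
      }
    }
  }
end

# Counts the matching events and, unless dry_run is true, deletes them.
# ASSUMPTIONS: `client` is an OpenSearch client exposing `count` and
# `delete_by_query`; "events" is a placeholder index name.
def delete_gbif_events(client, index: "events", dry_run: true)
  body = gbif_cleanup_query
  matched = client.count(index: index, body: body)["count"]
  return { matched: matched, deleted: 0 } if dry_run

  response = client.delete_by_query(index: index, body: body)
  { matched: matched, deleted: response["deleted"] }
end

# Print the generated query so it can be compared with the JSON above.
puts JSON.pretty_generate(gbif_cleanup_query) if __FILE__ == $PROGRAM_NAME
```

Defaulting to a dry run means the task reports how many events match before anything is removed. The real delete would need `dry_run: false`, and the two helpers could be called from a rake task that loads the Rails environment.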