inspirehep / inspire-next

The INSPIRE repo.
https://inspirehep.net
GNU General Public License v3.0

Deleted records appear in search results #2933

Open chris-asl opened 6 years ago

chris-asl commented 6 years ago

Deleted records appear in the normal search results.

Expected Behavior

They shouldn't, unless explicitly specified.

Steps to Reproduce (for bugs)

  1. Migrate this record (https://labs.inspirehep.net/api/literature/1297774):

    inspirehep migrator one --recid=1297774

  2. Run an empty search; the deleted record appears in the results.

Notes:

  1. If I search for deleted:true, I see only that record, but a search for deleted:false returns nothing.
    The same happens with the equivalent Elasticsearch query:

    GET records-hep/_search
    {
      "query": {
        "match": {
          "deleted": true
        }
      }
    }

    (either true or false, same behaviour as described above).

  2. This code seems related and it's being used here.
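For the Elasticsearch query in note 1, a minimal sketch of a search body that hides deleted records could look like the following. It relies on the fact that a `must_not` clause also matches documents where the field is absent, so it would still work if live records simply do not carry a `deleted` field (an assumption, not something confirmed in this thread). The function name `hide_deleted_body` is made up for illustration.

```python
import json


def hide_deleted_body():
    """Return a search body that matches every record not flagged as deleted.

    Assumption: deleted records carry "deleted": true; records where the
    field is false or missing are treated as live.
    """
    return {
        "query": {
            "bool": {
                # must_not keeps documents where the term does NOT match,
                # including documents that have no "deleted" field at all.
                "must_not": {"term": {"deleted": True}}
            }
        }
    }


# Body to send to GET records-hep/_search
print(json.dumps(hide_deleted_body(), indent=2))
```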

david-caro commented 6 years ago

This is closely related to the fact that the deleted records are considered as 'existing' by the workflows here: https://github.com/inspirehep/inspire-next/blob/master/inspirehep/modules/workflows/tasks/matching.py#L98

supposedly because the matcher does not filter the deleted ones out.

jacquerie commented 6 years ago

> supposedly because the matcher does not filter the deleted ones out.

All you need to do is implement a validator that filters them out. It's already supported by the config API!
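A rough sketch of such a validator, assuming it is called with the incoming record and one raw Elasticsearch hit and returns a boolean (the exact signature expected by the matcher should be checked against its documentation). The name `not_deleted_validator` and the second sample hit are made up for illustration.

```python
def not_deleted_validator(record, result):
    """Return True only if the matched record is not flagged as deleted.

    Assumption: `result` is a raw Elasticsearch hit, so the stored record
    lives under the "_source" key. `record` (the incoming record) is unused.
    """
    matched = result.get("_source", {})
    return not matched.get("deleted", False)


# Sample hits: the deleted record from this issue plus a made-up live one.
hits = [
    {"_source": {"control_number": 1297774, "deleted": True}},
    {"_source": {"control_number": 1}},
]
kept = [hit for hit in hits if not_deleted_validator({}, hit)]
```

Note that this runs in Python after the search has returned, so deleted records are still fetched from the index and only dropped afterwards.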

david-caro commented 6 years ago

But the validator is not using the Elasticsearch filters, right? It's just a Python function that is run afterwards, right?

jacquerie commented 6 years ago

> It's just a Python function that is run afterwards, right?

Yes, that's right. We could easily add a boolean flag to the configuration that expands to a filter clause, though.
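Something along these lines, as a sketch only: the `exclude_deleted` key and the `build_match_query` helper are hypothetical and do not exist in the current configuration API.

```python
import copy


def build_match_query(config):
    """Expand a matcher-style config into an Elasticsearch query body.

    `config` is a hypothetical dict with two keys:
      - "query": the inner Elasticsearch query clause
      - "exclude_deleted": optional boolean flag (defaults to False)
    """
    inner = copy.deepcopy(config["query"])
    if not config.get("exclude_deleted", False):
        return {"query": inner}
    return {
        "query": {
            "bool": {
                "must": inner,
                # Filter context: no scoring, and records without a
                # "deleted" field pass through as well.
                "filter": {"bool": {"must_not": {"term": {"deleted": True}}}},
            }
        }
    }


body = build_match_query({"query": {"match_all": {}}, "exclude_deleted": True})
```

With the flag set, the deleted records would be filtered inside Elasticsearch instead of in a Python validator run on the results.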