geonetwork / core-geonetwork

GeoNetwork is a catalog application to manage spatially referenced resources. It provides powerful metadata editing and search functions as well as an interactive web map viewer. It is currently used in numerous Spatial Data Infrastructure initiatives across the world.
http://geonetwork-opensource.org/
GNU General Public License v2.0

LuceneSearcher - getAllMetadataFromIndexFor method processes even deleted documents #5369

Open josegar74 opened 3 years ago

josegar74 commented 3 years ago

LuceneSearcher.getMetadataFromIndex is used to retrieve information directly from the index, given a metadata id or uuid. The method uses LuceneSearcher.getMetadataFromIndex with a filter of type NoFilterFilter, but that filter seems to return even the deleted versions of a metadata record's documents (when the index is not optimised), which can cause outdated information to be retrieved.

https://github.com/geonetwork/core-geonetwork/blob/cd80df109c3ee815eae6f3b2f1fe06cb884bd6de/core/src/main/java/org/fao/geonet/kernel/search/LuceneSearcher.java#L957-L958

Changing the filter to the following seems to work fine and no longer retrieves outdated versions of the documents:

Filter filter = new DuplicateDocFilter(new MatchAllDocsQuery());
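To illustrate the intent of that change, here is a minimal plain-Java sketch of keeping one hit per metadata id. It is not GeoNetwork's DuplicateDocFilter and does not use the Lucene API. The class DuplicateHitSketch, the Hit type, and the firstHitPerId method are hypothetical names made up for this example. The sketch also assumes the current version of a record is returned before any stale one, which the real search ordering may not guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DuplicateHitSketch {

    /** Hypothetical stand-in for an index hit: a metadata id and the record version it holds. */
    public static final class Hit {
        public final String metadataId;
        public final int version;

        public Hit(String metadataId, int version) {
            this.metadataId = metadataId;
            this.version = version;
        }
    }

    /**
     * Keeps only the first hit seen for each metadata id and drops later hits
     * with the same id. Assumption: the current version comes first in the list.
     */
    public static List<Hit> firstHitPerId(List<Hit> hits) {
        Set<String> seenIds = new HashSet<>();
        List<Hit> result = new ArrayList<>();
        for (Hit hit : hits) {
            // Set.add returns false when the id was already seen.
            if (seenIds.add(hit.metadataId)) {
                result.add(hit);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Record "42" appears twice: a current version and a stale one
        // that has not yet been purged from the (simulated) index.
        List<Hit> hits = Arrays.asList(
            new Hit("42", 2),
            new Hit("42", 1),
            new Hit("7", 1));

        for (Hit hit : firstHitPerId(hits)) {
            System.out.println(hit.metadataId + " v" + hit.version);
        }
    }
}
```

With an unfiltered search, both entries for record "42" would reach the caller. With the de-duplication step, only one entry per id does.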

@fxprunayre are you familiar with this code? Before opening a pull request, I want to confirm I'm not missing something about the usage of NoFilterFilter.

fxprunayre commented 3 years ago

No suggestion, Jose.