4teamwork / ftw.solr

Solr integration for Plone

Metadata doesn't get updated when idxs is specified in reindexObject call #136

Open lukasgraf opened 5 years ago

lukasgraf commented 5 years ago

If any code calls context.reindexObject(idxs=['...']) with a list of indexes, ftw.solr only updates those fields, but not ones that correspond to catalog metadata.

This contradicts the catalog's behavior, which always updates metadata on any context.reindexObject() call, whether or not idxs is given.

This becomes a problem when code relies on that catalog behavior and passes a cheap index like getId in order to reindex essentially just the metadata. This "use a cheap index" trick is used by ftw.bumblebee, for example.
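To make the mismatch concrete, here is a minimal toy model of the two behaviors described above. It is only a sketch: `ToyCatalog`, `ToySolr`, and the metadata column name `preview_state` are hypothetical stand-ins, not actual Plone or ftw.solr code.

```python
# Toy model only: hypothetical stand-ins, not the real catalog or ftw.solr.

class ToyCatalog:
    """Mimics the catalog: metadata columns are refreshed on every reindex."""

    def __init__(self, indexes, metadata_columns):
        self.indexes = set(indexes)
        self.metadata_columns = set(metadata_columns)

    def fields_updated(self, idxs=None):
        # An empty or missing idxs means "reindex all indexes".
        requested = self.indexes if not idxs else self.indexes & set(idxs)
        # Metadata is always included, regardless of idxs.
        return requested | self.metadata_columns


class ToySolr:
    """Mimics the behavior reported in this issue: only the listed fields."""

    def __init__(self, schema_fields):
        self.schema_fields = set(schema_fields)

    def fields_updated(self, idxs=None):
        if not idxs:
            return set(self.schema_fields)
        return self.schema_fields & set(idxs)


catalog = ToyCatalog(
    indexes=["getId", "Title", "SearchableText"],
    metadata_columns=["Title", "preview_state"],  # "preview_state" is made up
)
solr = ToySolr(
    schema_fields=["getId", "Title", "SearchableText", "preview_state"],
)

# The "cheap index" trick: reindex only getId, expecting metadata to follow.
catalog_fields = catalog.fields_updated(idxs=["getId"])
solr_fields = solr.fields_updated(idxs=["getId"])

# Fields the catalog refreshed but Solr did not.
stale_in_solr = catalog_fields - solr_fields
```

In this model, `stale_in_solr` contains exactly the metadata columns, which is the inconsistency reported here.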

It's unfortunate that the catalog behaves this way, and doesn't provide a clean API to control updating of metadata. But because of that, it's a fact that there's plenty of code around that either deliberately abuses that side-effect of reindexObject(), or at least needs it to work properly.

In my opinion, the default out-of-the-box behavior in ftw.solr needs to prioritize consistency.

Unfortunately, fixing this behavior in ftw.solr means that we will probably incur quite a performance penalty. Atomic updates would be much less "atomic" than before, because they would always need to include all metadata fields as well (i.e. fields in the Solr schema for which a Plone catalog metadata column with the same name exists).
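As a rough illustration of that cost, the sketch below builds a Solr atomic update document that always includes the metadata fields. The `{"set": value}` modifier is standard Solr atomic update syntax; everything else is an assumption for illustration: the helper `build_atomic_update`, the unique key name `UID`, and the column name `preview_state` are hypothetical, and this is not how ftw.solr is actually implemented.

```python
def build_atomic_update(uid, values, idxs, metadata_columns, schema_fields):
    """Build one atomic update document (hypothetical helper).

    The requested idxs are extended with every metadata column, then
    restricted to fields that exist in the Solr schema and have a value.
    """
    wanted = (set(idxs) | set(metadata_columns)) & set(schema_fields)
    doc = {"UID": uid}  # assumed unique key field
    for name in sorted(wanted):
        if name in values:
            doc[name] = {"set": values[name]}
    return doc


values = {
    "getId": "doc-1",
    "Title": "Doc",
    "preview_state": "ready",  # made-up metadata column
    "SearchableText": "long body text",
}

update = build_atomic_update(
    uid="abc123",
    values=values,
    idxs=["getId"],
    metadata_columns=["Title", "preview_state", "not_in_schema"],
    schema_fields=["getId", "Title", "preview_state", "SearchableText"],
)
```

Even though only `getId` was requested, the document carries every metadata field that is also in the schema, so the payload grows with the number of metadata columns.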

lukasgraf commented 5 years ago

As discussed: For now, we will address this by explicitly adding the names of the metadata columns that need updating to the idxs list of the respective reindexObject() calls. The ZCatalog will quietly filter out the index names it doesn't know about, while the metadata column names still get passed on to ftw.solr (which doesn't distinguish between metadata and indexes anyway).
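A minimal sketch of why this workaround works, again as a toy model rather than real Plone or ftw.solr code. The functions `catalog_filter` and `solr_fields`, and the metadata column name `preview_state`, are hypothetical; in real code the call would look like `context.reindexObject(idxs=['getId', 'preview_state'])`.

```python
def catalog_filter(idxs, catalog_indexes):
    """Toy version of the catalog dropping index names it doesn't know."""
    return [name for name in idxs if name in catalog_indexes]


def solr_fields(idxs, schema_fields):
    """Toy version of Solr-side handling: any name in the schema is updated,
    with no distinction between indexes and metadata."""
    return [name for name in idxs if name in schema_fields]


CATALOG_INDEXES = {"getId", "Title"}
SCHEMA_FIELDS = {"getId", "Title", "preview_state"}

# The cheap index plus the metadata column that actually needs updating.
idxs = ["getId", "preview_state"]

seen_by_catalog = catalog_filter(idxs, CATALOG_INDEXES)
seen_by_solr = solr_fields(idxs, SCHEMA_FIELDS)
```

The catalog only sees `getId` (and updates its metadata as a side effect, as usual), while the Solr side also receives `preview_state` and updates that field.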