4teamwork / ftw.solr

Solr integration for Plone

SolrDocument returns None for existing fields with missing value. #182

Closed njohner closed 3 years ago

njohner commented 3 years ago

SolrDocument now supports a list of fields, which allows SolrContentListing to make it schema-aware. (It already accepted fields in its __init__ method, but did nothing with them.) With this, accessing an attribute on a SolrDocument returns None when the field is present in the schema but has no value in the document's data. This is correct because Solr omits fields without a value from its response.
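To illustrate the behaviour described above, here is a minimal sketch. It is not the ftw.solr code: the class name SchemaAwareDocument and its attributes are hypothetical stand-ins, and the real SolrDocument may be structured differently.

```python
class SchemaAwareDocument:
    """Hypothetical stand-in for a schema-aware Solr document."""

    def __init__(self, data, fields=None):
        # `data` holds what the Solr response contained for this document.
        self.data = dict(data)
        # `fields` is the set of field names known from the schema.
        self.fields = frozenset(fields or ())

    def __getattr__(self, name):
        # __getattr__ only runs when normal attribute lookup fails.
        # Reading through __dict__ avoids recursing into __getattr__
        # if `data` or `fields` are not set yet.
        state = self.__dict__
        data = state.get("data", {})
        if name in data:
            return data[name]
        if name in state.get("fields", ()):
            # Known field that Solr left out because it has no value.
            return None
        raise AttributeError(name)


doc = SchemaAwareDocument({"Title": "Report"}, fields=["Title", "author"])
```

Here doc.Title comes from the response data, doc.author is None because the field is in the schema but was not returned, and an attribute outside the schema still raises AttributeError.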

What is a bit unfortunate, IMO, is that SolrContentListingObject does not override __getattr__, so getting an attribute on a SolrContentListingObject does not check for that attribute on its SolrDocument. I did not add that here because it would break the implementation in Gever, which relies on the __getattr__ of OpengeverCatalogContentListingObject, the second base class in the inheritance hierarchy of OGSolrContentListingObject. This would of course be solvable, but it seems to have little benefit right now.
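The inheritance concern can be sketched with plain Python classes. All class names below are hypothetical placeholders for the roles described above, not the actual ftw.solr or Gever classes.

```python
class ListingBase:
    """Plays the role of the first base class (no __getattr__)."""

    def __init__(self, doc):
        self.doc = doc


class CatalogFallback:
    """Plays the role of the second base class, which defines __getattr__."""

    def __getattr__(self, name):
        return "catalog fallback: " + name


class Combined(ListingBase, CatalogFallback):
    """Today: the second base's __getattr__ is the one that is found."""


class ListingBaseWithGetattr(ListingBase):
    """What adding __getattr__ to the first base class would look like."""

    def __getattr__(self, name):
        return "solr document: " + name


class CombinedShadowed(ListingBaseWithGetattr, CatalogFallback):
    """The first base now wins in the MRO and shadows the fallback."""


current = Combined(None).some_field
shadowed = CombinedShadowed(None).some_field
```

Because Python resolves __getattr__ along the method resolution order, defining it on the first base class means the second base's version is no longer reached unless the new method explicitly delegates to it.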

Note that this change improved performance in Gever roughly 20-fold for the document listing with all columns displayed: the request duration for 200 documents dropped from 2700 ms to 120 ms. This is because the __getattr__ of OpengeverCatalogContentListingObject fetched the object each time a field was not found on the SolrDocument, which happened all the time for fields that are usually not set (for example author on documents, or receipt date and delivery_date). Because these fields were missing from the Solr response (being empty), getting the attribute from the SolrDocument raised AttributeError, and so we fell back to the object. Now that SolrDocument returns None in such cases, we avoid fetching the object.
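The effect on object fetches can be sketched as follows. This is a toy model under stated assumptions: PlainDoc, SchemaDoc, Listing and count_fetches are hypothetical names, and the counter only stands in for the expensive object lookup in Gever. It does not reproduce the timings quoted above.

```python
class PlainDoc:
    """Raises AttributeError for every field missing from the response."""

    def __init__(self, data):
        self.data = data

    def __getattr__(self, name):
        data = self.__dict__.get("data", {})
        if name in data:
            return data[name]
        raise AttributeError(name)


class SchemaDoc(PlainDoc):
    """Returns None for schema fields that are missing from the response."""

    def __init__(self, data, fields):
        super().__init__(data)
        self.fields = frozenset(fields)

    def __getattr__(self, name):
        try:
            return super().__getattr__(name)
        except AttributeError:
            if name in self.__dict__.get("fields", ()):
                return None
            raise


class Listing:
    """Falls back to a (counted) object fetch when the document fails."""

    def __init__(self, doc):
        self.doc = doc
        self.fetches = 0

    def get(self, name):
        try:
            return getattr(self.doc, name)
        except AttributeError:
            self.fetches += 1  # stands in for fetching the full object
            return None


def count_fetches(docs, columns):
    total = 0
    for doc in docs:
        listing = Listing(doc)
        for column in columns:
            listing.get(column)
        total += listing.fetches
    return total


columns = ["Title", "author", "delivery_date"]
rows = [{"Title": "Doc %d" % i} for i in range(200)]
plain_fetches = count_fetches([PlainDoc(r) for r in rows], columns)
schema_fetches = count_fetches([SchemaDoc(r, columns) for r in rows], columns)
```

In this model, 200 rows with two empty columns each trigger 400 fallback fetches with the plain document and none with the schema-aware one.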

For https://4teamwork.atlassian.net/browse/CA-2654