lugensa / scorched

Sunburnt offspring solr client
MIT License

Multi-valued date fields cannot be indexed #38

Closed by mlissner 7 years ago

mlissner commented 7 years ago

When you try to index an item with a multi-valued date field, you run into this error:

In [14]: sun.add(judy.as_search_dict())
---------------------------------------------------------------------------
SolrError                                 Traceback (most recent call last)
<ipython-input-14-c11bdcf59b84> in <module>()
----> 1 sun.add(judy.as_search_dict())

/home/mlissner/.virtualenvs/courtlistener/local/lib/python2.7/site-packages/scorched/connection.py in add(self, docs, chunk, **kwargs)
    343         ret = []
    344         for doc_chunk in grouper(docs, chunk):
--> 345             update_message = json.dumps(self._prepare_docs(doc_chunk))
    346             ret.append(scorched.response.SolrUpdateResponse.from_json(
    347                 self.conn.update(update_message, **kwargs)))

/home/mlissner/.virtualenvs/courtlistener/local/lib/python2.7/site-packages/scorched/connection.py in _prepare_docs(self, docs)
    319                     continue
    320                 if scorched.dates.is_datetime_field(name, self._datefields):
--> 321                     value = str(scorched.dates.solr_date(value))
    322                 new_doc[name] = value
    323             prepared_docs.append(new_doc)

/home/mlissner/.virtualenvs/courtlistener/local/lib/python2.7/site-packages/scorched/dates.py in __init__(self, v)
     93         else:
     94             raise scorched.exc.SolrError(
---> 95                 "Cannot initialize solr_date from %s object" % type(v))
     96 
     97     @staticmethod

SolrError: Cannot initialize solr_date from <type 'list'> object

This appears to come from _prepare_docs in connection.py, which passes the whole field value to solr_date and so assumes that date fields are never multi-valued:

def _prepare_docs(self, docs):
    prepared_docs = []
    for doc in docs:
        new_doc = {}
        for name, value in list(doc.items()):
            # XXX remove all None fields this is needed for adding date
            # fields
            if value is None:
                continue
            if scorched.dates.is_datetime_field(name, self._datefields):
                # This is where the code needs a tweak, I'd say:
                value = str(scorched.dates.solr_date(value))
            new_doc[name] = value
        prepared_docs.append(new_doc)
    return prepared_docs

I can think of two solutions here. We can either interrogate the schema to see whether the field is multi-valued and expect a list in that case, or check whether we were given a list and treat that as meaning the field is multi-valued.

I'd be happy to implement either solution, if desired.

mlissner commented 7 years ago

I went ahead and implemented the "check-if-iterable" approach in #39. Is it possible to get a small release with this fix?
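
For illustration, here is a minimal, self-contained sketch of what the check-if-iterable idea could look like. It is not the actual change in #39. The names format_solr_date and prepare_docs are hypothetical stand-ins for scorched.dates.solr_date and Connection._prepare_docs, the date-field check is simplified to a set lookup, and naive datetimes are assumed to already be in UTC.

```python
import datetime


def format_solr_date(value):
    """Hypothetical stand-in for scorched.dates.solr_date.

    Assumes naive datetimes are already UTC and renders them in
    Solr's ISO-8601 form, e.g. 2015-01-02T03:04:05Z.
    """
    if not isinstance(value, datetime.date):
        raise TypeError("Cannot format %r as a Solr date" % type(value))
    if not isinstance(value, datetime.datetime):
        # Promote a plain date to midnight on that day.
        value = datetime.datetime(value.year, value.month, value.day)
    return value.strftime("%Y-%m-%dT%H:%M:%SZ")


def prepare_docs(docs, datefields):
    """Simplified sketch of _prepare_docs that tolerates multi-valued dates.

    `datefields` is assumed to be a plain set of field names; the real
    library resolves date fields through is_datetime_field instead.
    """
    prepared_docs = []
    for doc in docs:
        new_doc = {}
        for name, value in doc.items():
            # Skip None values, as the original code does.
            if value is None:
                continue
            if name in datefields:
                if isinstance(value, (list, tuple)):
                    # Multi-valued field: convert each element.
                    value = [format_solr_date(v) for v in value]
                else:
                    value = format_solr_date(value)
            new_doc[name] = value
        prepared_docs.append(new_doc)
    return prepared_docs
```

Checking the value rather than the schema keeps the change local to the document-preparation step, at the cost of trusting the caller to pass a list only for fields that are actually multi-valued.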

delijati commented 7 years ago

Done