whoosh-community / whoosh

Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python.

<type 'exceptions.AttributeError'>: 'DATETIME' object has no attribute 'numtype' #359

Open fortable1999 opened 11 years ago

fortable1999 commented 11 years ago

Original report by Carlos Sanchez (Bitbucket: papachoco, GitHub: papachoco).


In the NUMERIC class, the __setstate__ method calls _min_max, which refers to the numtype attribute, and that attribute has not been loaded at that point. This prevents loading indices created with version 2.5.2 or earlier.

I am attaching an index created with version 2.5.2, along with the definition of a custom field (VIDEO_TIMESTAMP) from the nti.contentsearch namespace.

def __setstate__(self, d):
    self.__dict__.update(d)
    self._struct = struct.Struct(">" + self.sortable_typecode)
    if "min_value" not in d:
        d["min_value"], d["max_value"] = self._min_max()

fortable1999 commented 11 years ago

Original comment by Carlos Sanchez (Bitbucket: papachoco, GitHub: papachoco).


I am surprised that those two fields are not pickled in the schema. Well, I have a workaround. Thanks for all the effort. I am going to recreate my indices with 2.5.3.

fortable1999 commented 11 years ago

Original comment by Matt Chaput (Bitbucket: mchaput, GitHub: mchaput).


I got the index to load (using whoosh.util.loading.RenamingUnpickler to redirect nti.contentsearch._whoosh_schemas.VIDEO_TIMESTAMP to a copy of the code you gave above) and it worked for me :(
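For readers without the original module, here is a minimal sketch of the same redirection idea using only the standard library's pickle.Unpickler.find_class hook. It is not Whoosh's RenamingUnpickler; the names RedirectingUnpickler, LocalTimestamp and the module name legacy_schemas are hypothetical, and the code is written for Python 3.

```python
import io
import pickle
import sys
import types


class RedirectingUnpickler(pickle.Unpickler):
    """Unpickler that maps (module, name) pairs to replacement classes."""

    def __init__(self, file, redirects):
        super().__init__(file)
        self._redirects = redirects

    def find_class(self, module, name):
        # Use the replacement if one is registered, else the normal lookup.
        replacement = self._redirects.get((module, name))
        if replacement is not None:
            return replacement
        return super().find_class(module, name)


class LocalTimestamp:
    """Local copy of a class whose original module is unavailable."""


def make_legacy_pickle():
    # Simulate a pickle written where the hypothetical module
    # "legacy_schemas" was importable, then make it unavailable again.
    module = types.ModuleType("legacy_schemas")
    cls = type("VIDEO_TIMESTAMP", (), {"__module__": "legacy_schemas"})
    module.VIDEO_TIMESTAMP = cls
    sys.modules["legacy_schemas"] = module
    try:
        obj = cls()
        obj.stored = True
        return pickle.dumps(obj, protocol=2)
    finally:
        del sys.modules["legacy_schemas"]


data = make_legacy_pickle()
redirects = {("legacy_schemas", "VIDEO_TIMESTAMP"): LocalTimestamp}
restored = RedirectingUnpickler(io.BytesIO(data), redirects).load()
print(type(restored).__name__, restored.stored)
```

A plain pickle.loads(data) fails here with an ImportError because legacy_schemas cannot be imported, while the redirecting loader builds a LocalTimestamp with the pickled state.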

fortable1999 commented 11 years ago

Original comment by Matt Chaput (Bitbucket: mchaput, GitHub: mchaput).


The weird thing is I can see the numtype in the pickle string. I'm not sure what's going on here. I'll work on faking your environment to try to make the pickle loadable so I can reproduce the error.
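One way to check which names a pickle carries without loading it is to walk its opcodes with the standard pickletools module. This is a rough sketch under assumptions: the state dictionary below is a hypothetical stand-in, not the schema from this issue, and finding a name in the stream does not by itself show that the attribute is set at the moment __setstate__ runs.

```python
import pickle
import pickletools


def strings_in_pickle(data):
    """Collect every string argument in a pickle stream without executing it."""
    found = set()
    for opcode, arg, pos in pickletools.genops(data):
        if isinstance(arg, str):
            found.add(arg)
    return found


# Hypothetical state, standing in for a pickled field's attribute dictionary.
state = {"numtype": int, "bits": 64}
data = pickle.dumps(state, protocol=2)

names = strings_in_pickle(data)
print("numtype" in names)
```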

fortable1999 commented 11 years ago

Original comment by Carlos Sanchez (Bitbucket: papachoco, GitHub: papachoco).


The class in question is defined as

class VIDEO_TIMESTAMP(fields.DATETIME):

    def _parse_datestring(self, qstring):
        result = videotimestamp_to_datetime(qstring)
        return result

    def __setstate__(self, d):
        # d['bits'] = 64
        # d['numtype'] = int
        super(VIDEO_TIMESTAMP, self).__setstate__(d)

and the schema is

fields.Schema(containerId=fields.ID(stored=True, unique=False),
              videoId=fields.ID(stored=True, unique=False),
              language=fields.ID(stored=True, unique=False),
              keywords=fields.KEYWORD(stored=True),
              start_timestamp=VIDEO_TIMESTAMP(stored=True),
              end_timestamp=VIDEO_TIMESTAMP(stored=True),
              last_modified=fields.DATETIME(stored=True))

As you can see, I worked around the issue by simply setting the bits and numtype fields. The index was created using 2.5.2 and read in by 2.5.3.
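The failure and the workaround can be reproduced in isolation. The sketch below uses stand-in classes, not Whoosh's own code: NumericLike, PatchedField, load_legacy and the contents of legacy_state are all hypothetical, and the defaults of 64 bits and int are taken from the commented-out lines above. It uses setdefault rather than unconditional assignment so that values already present in the state are kept.

```python
class NumericLike:
    """Stand-in for a numeric field type; NOT Whoosh's actual NUMERIC class."""

    def _min_max(self):
        # Reads attributes that must already be present on the instance.
        if self.numtype is int:
            half = 2 ** (self.bits - 1)
            return -half, half - 1
        return float("-inf"), float("inf")

    def __setstate__(self, d):
        self.__dict__.update(d)
        if "min_value" not in d:
            # Older state lacks the range, so derive it while loading.
            self.min_value, self.max_value = self._min_max()


class PatchedField(NumericLike):
    def __setstate__(self, d):
        # Supply attributes the old state may lack before the base class
        # needs them (assumed defaults: 64-bit int).
        d.setdefault("bits", 64)
        d.setdefault("numtype", int)
        super().__setstate__(d)


def load_legacy(cls, state):
    # Mimic what pickle does on load: create the object without calling
    # __init__, then hand the saved state to __setstate__.
    obj = cls.__new__(cls)
    obj.__setstate__(dict(state))
    return obj


# Hypothetical old state that carries neither numtype nor bits.
legacy_state = {"stored": True}

try:
    load_legacy(NumericLike, legacy_state)
except AttributeError as exc:
    print("unpatched load failed:", exc)

field = load_legacy(PatchedField, legacy_state)
print(field.numtype, field.bits)
```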

fortable1999 commented 11 years ago

Original comment by Carlos Sanchez (Bitbucket: papachoco, GitHub: papachoco).


This is the exception I am getting:

  File "/Users/csanchez/VirtualEnvs/nti.dataserver/lib/python2.7/site-packages/Whoosh-2.5.3-py2.7.egg/whoosh/index.py", line 136, in exists_in
    ix = open_dir(dirname, indexname=indexname)
  File "/Users/csanchez/VirtualEnvs/nti.dataserver/lib/python2.7/site-packages/Whoosh-2.5.3-py2.7.egg/whoosh/index.py", line 123, in open_dir
    return FileIndex(storage, schema=schema, indexname=indexname)
  File "/Users/csanchez/VirtualEnvs/nti.dataserver/lib/python2.7/site-packages/Whoosh-2.5.3-py2.7.egg/whoosh/index.py", line 421, in __init__
    TOC.read(self.storage, self.indexname, schema=self._schema)
  File "/Users/csanchez/VirtualEnvs/nti.dataserver/lib/python2.7/site-packages/Whoosh-2.5.3-py2.7.egg/whoosh/index.py", line 646, in read
    schema, segments = loader(stream, gen, schema, version)
  File "/Users/csanchez/VirtualEnvs/nti.dataserver/lib/python2.7/site-packages/Whoosh-2.5.3-py2.7.egg/whoosh/legacy.py", line 61, in load_110_toc
    schema = ru.load()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1217, in load_build
    setstate(state)
  File "/Users/csanchez/Documents/workspace/NextThoughtPlatform/nti.dataserver/src/nti/contentsearch/_whoosh_schemas.py", line 149, in __setstate__
    super(VIDEO_TIMESTAMP, self).__setstate__(d)
  File "/Users/csanchez/VirtualEnvs/nti.dataserver/lib/python2.7/site-packages/Whoosh-2.5.3-py2.7.egg/whoosh/fields.py", line 514, in __setstate__
    d["min_value"], d["max_value"] = self._min_max()
  File "/Users/csanchez/VirtualEnvs/nti.dataserver/lib/python2.7/site-packages/Whoosh-2.5.3-py2.7.egg/whoosh/fields.py", line 517, in _min_max
    numtype = self.numtype
ConfigurationExecutionError: <type 'exceptions.AttributeError'>: 'VIDEO_TIMESTAMP' object has no attribute 'numtype'

fortable1999 commented 11 years ago

Original comment by Matt Chaput (Bitbucket: mchaput, GitHub: mchaput).


Unfortunately I can't load your index because of references to external modules. However, I can't figure out how it could be numtype, since numtype should always be in the pickled object's dictionary. Could you please paste in the error and traceback you're getting?

As a workaround, since you have the schema available as a function, you can pass it in when you open the index:

ix = index.open_dir("index", "vtrans_CLC3403_LawAndJustice", schema=schema)

(Or in Storage.open_index() if that's how you're opening the index.) The code will skip the pickled schema and use the one you passed in.