frictionlessdata / datapackage-pipelines

Framework for processing data packages in pipelines of modular components.
https://frictionlessdata.io/
MIT License

Using cachetools v3.x breaks lib/join (dpp 1.7.1) #155

Open brew opened 5 years ago

brew commented 5 years ago

This is with datapackage-pipelines v1.7.1.

cachetools is declared as an unpinned dependency in setup and is used by utilities/kvstore.py. If cachetools==3.0.0 (the latest version) is installed, my pipeline fails with this error:

[./denormalized_flow:T_0] >>> INFO    :join: Traceback (most recent call last):
[./denormalized_flow:T_0] >>> INFO    :join:   File "/Users/brew/virtualenvs/os-api/lib/python3.6/site-packages/datapackage_pipelines/specs/../lib/join.py", line 297, in <module>
[./denormalized_flow:T_0] >>> INFO    :join:     new_resource_iterator(resource_iterator))
[./denormalized_flow:T_0] >>> INFO    :join:   File "/Users/brew/virtualenvs/os-api/lib/python3.6/site-packages/datapackage_pipelines/wrapper/wrapper.py", line 64, in spew
[./denormalized_flow:T_0] >>> INFO    :join:     for res in resources_iterator:
[./denormalized_flow:T_0] >>> INFO    :join:   File "/Users/brew/virtualenvs/os-api/lib/python3.6/site-packages/datapackage_pipelines/specs/../lib/join.py", line 213, in new_resource_iterator
[./denormalized_flow:T_0] >>> INFO    :join:     collections.deque(indexer(resource), maxlen=0)
[./denormalized_flow:T_0] >>> INFO    :join:   File "/Users/brew/virtualenvs/os-api/lib/python3.6/site-packages/datapackage_pipelines/specs/../lib/join.py", line 168, in indexer
[./denormalized_flow:T_0] >>> INFO    :join:     db[key] = current
[./denormalized_flow:T_0] >>> INFO    :join:   File "/Users/brew/virtualenvs/os-api/lib/python3.6/site-packages/cachetools/lru.py", line 21, in __setitem__
[./denormalized_flow:T_0] >>> INFO    :join:     cache_setitem(self, key, value)
[./denormalized_flow:T_0] >>> INFO    :join:   File "/Users/brew/virtualenvs/os-api/lib/python3.6/site-packages/cachetools/cache.py", line 47, in __setitem__
[./denormalized_flow:T_0] >>> INFO    :join:     size = self.getsizeof(value)
[./denormalized_flow:T_0] >>> INFO    :join:   File "/Users/brew/virtualenvs/os-api/lib/python3.6/site-packages/datapackage_pipelines/utilities/kvstore.py", line 100, in _dbget
[./denormalized_flow:T_0] >>> INFO    :join:     value = self.db.get(key)
[./denormalized_flow:T_0] >>> INFO    :join:   File "/Users/brew/virtualenvs/os-api/lib/python3.6/site-packages/datapackage_pipelines/utilities/kvstore.py", line 65, in get
[./denormalized_flow:T_0] >>> INFO    :join:     ret = self.db.get(key.encode('utf8'))
[./denormalized_flow:T_0] >>> INFO    :join: AttributeError: 'dict' object has no attribute 'encode'
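
Reading the traceback, cachetools calls `self.getsizeof(value)` and that call lands in kvstore.py's `_dbget`, which then tries to `.encode` a dict value. One possible explanation, as far as I can tell, is that cachetools 3.0.0 removed the deprecated `missing` constructor parameter, so a lookup function passed as the second positional argument is now taken as `getsizeof`. I have not confirmed how kvstore.py constructs its cache, so treat that as an assumption. A minimal sketch of the mechanism, using hypothetical stand-in classes (`OldStyleCache`, `NewStyleCache`) and a hypothetical `db_get`, not the real cachetools or kvstore code:

```python
# Hypothetical stand-ins: they only mimic the call pattern seen in the
# traceback and are NOT the cachetools or kvstore.py source.


class OldStyleCache(dict):
    """Mimics a 2.x-style constructor: (maxsize, missing, getsizeof)."""

    def __init__(self, maxsize, missing=None, getsizeof=None):
        super().__init__()
        self.maxsize = maxsize
        self.missing = missing
        self.getsizeof = getsizeof or (lambda value: 1)

    def __setitem__(self, key, value):
        self.getsizeof(value)  # the size hook receives the *value*
        super().__setitem__(key, value)


class NewStyleCache(dict):
    """Mimics a 3.x-style constructor without `missing`: (maxsize, getsizeof)."""

    def __init__(self, maxsize, getsizeof=None):
        super().__init__()
        self.maxsize = maxsize
        self.getsizeof = getsizeof or (lambda value: 1)

    def __setitem__(self, key, value):
        self.getsizeof(value)  # the size hook receives the *value*
        super().__setitem__(key, value)


def db_get(key):
    # Like the lookup in the traceback, this expects a str key.
    return key.encode('utf8')


# Old signature: db_get is bound to `missing` and is never called on store.
old = OldStyleCache(1024, db_get)
old['a'] = {'x': 1}

# New signature: the same positional argument is now bound to `getsizeof`,
# so it is called with the dict value and fails.
new = NewStyleCache(1024, db_get)
try:
    new['a'] = {'x': 1}
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)
```

If this reading is right, passing the lookup by keyword, or overriding `__missing__` in a subclass, would avoid the positional shift, but I have not tried either against dpp.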

If I explicitly install an older version (cachetools==2.1.0), the pipeline runs successfully.
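
Until the dependency is pinned, a project could fail fast with a small guard like the one below. This is only a sketch: `cachetools_is_compatible` is a hypothetical helper, and the `< 3` boundary is an assumption, since I have only confirmed that 2.1.0 works and 3.0.0 fails.

```python
def cachetools_is_compatible(version):
    """Return True for releases before 3.0 (assumed safe range, see above)."""
    major = int(version.split('.')[0])
    return major < 3


try:
    import cachetools
except ImportError:
    cachetools = None  # nothing to check if the package is not installed

if cachetools is not None:
    # Fall back to None if the module does not expose a version string.
    installed = getattr(cachetools, '__version__', None)
    if installed is not None and not cachetools_is_compatible(installed):
        print('cachetools %s detected; lib/join in dpp 1.7.1 is reported '
              'to fail with 3.0.0' % installed)
```

Pinning the requirement itself (for example `cachetools<3` in the dependency list) would make a runtime check like this unnecessary.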