Closed wouterbeek closed 2 years ago
This must be due to the use of attr
with cachetools
. Removing attr.s
from the SPARQLConnector
and Connector
classes should fix the issue.
This seems to be indeed related to caching... (the error lowest on the stacktrace comes from site-packages/cachetools/keys.py
.
This error is typically thrown in Python when trying to use an unhashable object as a key of a dictionary.
>>> x = [5,3,2]
>>> d = {}
>>> d[x] = 5
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
However, SPARQLConnector
should never be a key in a cache AFAIK? This is something we should perhaps further investigate...
EDIT: I will try to reproduce here locally later, but the fact that the PyPI install works but git clone doesn't probably means we introduced a bug in one of the latest pushes.
Cachemethod is probably defined on a function with a self attribute, so self (here the sparql connector) will be cached as well.
Which throws an error because it doesn't now how to hash it
Ok indeed: https://github.com/IBCNServices/pyRDF2Vec/blob/master/pyrdf2vec/connectors.py#L123
But this isn't newly introduced? How come this suddenly bugs out? Is this due to the upgrade from cachetools 4.2.2 to 5.0.0 (#74)? A solution would be to remove cachetools and just keep our own cache (a simple dictionary should suffice).
The original idea was to let the user define the cache policy...
Maybe simply defining how to hash the connector object could also resolve this issue.
Then we will need to fix the breaking change that is mentioned in #74 (see my screenshot). All unit tests are actually failing as well with this type of errors, so it's definitely because of this.
https://github.com/IBCNServices/pyRDF2Vec/runs/6229466856?check_suite_focus=true
The issue is also present with the KG
class as we cache the hops with the _get_hops()
function.
I guess an easy fix for now is to revert the upgrade and put cache tools back at version 4.2.2.
I did a reset and push -f to reset our repo back to the latest working commits. All commits after that point in time did get removed however so best store a copy locally if you want to salvage anything from that.
I tried to set the fetch
and _get_hops
functions to global (to avoid the object hash conflict), while using cached, but the error persists. As soon as I delete the cache, then everything works again.
Perhaps try the fix suggested by cachetools themselves?
The default key function has been changed to ignore its first argument, so this should only affect applications using custom key functions with the @cachedmethod decorator.
fetch and _get_hops both use a custom key function, we should pass None or something as first argument.
I've already tried, but the error is always the same. I will look into fixing this as soon as possible and also update the attr
syntax which has changed since then and is easier to use.
Last comment for this issue, cachetools
5.0.0 works if we simply remove the key:
@cachedmethod(operator.attrgetter("cache"), key=partial(hashkey, "fetch")))
def foo(self, bar):
pass
by:
@cachedmethod(operator.attrgetter("cache"))
def foo(self, bar):
pass
I double check that there are no more problems with cachetools
and I push this fix.
@rememberYou any update on this fix perhaps (no urgency though, so take your time!)
The commit d039a702c9b072983b5f60ba7892b8d7a2cf1993 on the develop
branch has fixed the issue with cachetool
. If you clone to the master
, since you did a git reset
with an older version of cachetool
, it should work too.
@wouterbeek Does it work on your side?
Using the develop
branch, I'm able to extract the walk from the TriplyDB endpoint without issues. I haven't test with the master
branch.
@rememberYou Running pip install .
on the last commit in master indeed works for me. Thanks for fixing this!
🐛 Bug
When I build this project locally (with
pip install .
), I get the following error message:I do not get this error message when I install this project remotely (with
pip install pyRDF2Vec
).Expected Behavior
No error, regardless of whether I build locally or install from a remote release.
Current Behavior
The above shared error message.
Steps to Reproduce
pip install .
python main.py
with the following content in filemain.py
:Environment
Additional remarks
I see the same error message in #64, but the problem description there is very different (use of a specific IDE IIUC).