jeremymanning / hypertools

A python toolbox for gaining geometric insights into high-dimensional data
http://hypertools.readthedocs.io/en/latest/
MIT License

fix caching #3

Closed jeremymanning closed 3 years ago

jeremymanning commented 4 years ago

Caching is really useful when the goal is to re-run the same line of code multiple times (it makes subsequent runs faster). But the current implementation of caching seems to be ignoring some parameters. For example, calling plot with the reduce flag returns a DataGeometry object (as expected). However, subsequent calls to plot using the same dataset (but a different reduce argument) seem to return the same object, rather than calling reduce again with the new argument.

Ideas:

Plan:

paxtonfitzpatrick commented 4 years ago

hey @jeremymanning can you post some code to reproduce this? I can't get it to happen on my end with either the main plot function or the DataGeometry.plot method. There's definitely something fishy going on if you're getting duplicate outputs... especially because the plot function is never actually cached.

jeremymanning commented 4 years ago

the issue seems to be related to our caching not properly handling keyword arguments. we "stringify" data and some arguments, but keyword arguments passed to functions aren't properly converted into strings, nor are complex data structures like nested dictionaries. therefore multiple calls to the same function with the same data (but different arguments) can result in the cached results being returned, effectively ignoring changed arguments.
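
a minimal sketch of the failure mode described above, not the actual hypertools code. everything here is hypothetical: `naive_cache` stands in for a memoizer that builds its key from positional arguments only, `kwarg_aware_cache` shows one possible way to fold keyword arguments (including nested dictionaries) into the key, and `reduce_data` is a toy function invented for illustration. it also assumes every leaf value is hashable, which would not hold for numpy arrays without extra handling.

```python
import functools


def freeze(obj):
    """Recursively turn obj into a hashable form that ignores dict ordering."""
    if isinstance(obj, dict):
        # Sort by repr of the key so insertion order does not change the result.
        return ("dict", tuple(sorted((repr(k), freeze(v)) for k, v in obj.items())))
    if isinstance(obj, (list, tuple)):
        return (type(obj).__name__, tuple(freeze(v) for v in obj))
    return obj  # assumption: anything else is already hashable


def naive_cache(func):
    """Hypothetical memoizer whose key ignores keyword arguments."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key = str(args)  # kwargs never make it into the key
        if key not in store:
            store[key] = func(*args, **kwargs)
        return store[key]

    return wrapper


def kwarg_aware_cache(func):
    """Hypothetical memoizer whose key covers both args and kwargs."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key = (freeze(args), freeze(kwargs))
        if key not in store:
            store[key] = func(*args, **kwargs)
        return store[key]

    return wrapper


def reduce_data(data, reduce="PCA", ndims=3):
    # Toy stand-in for an expensive reduction; it only reports its inputs.
    return f"{reduce}-{ndims}d-{len(data)}pts"


naive = naive_cache(reduce_data)
fixed = kwarg_aware_cache(reduce_data)
data = [[1, 2], [3, 4], [5, 6]]

naive(data, reduce="PCA")
stale = naive(data, reduce="UMAP")  # same data, new kwarg: cached PCA result comes back

fixed(data, reduce="PCA")
fresh = fixed(data, reduce="UMAP")  # new kwarg produces a new key, so the function reruns

# Nested dictionaries are also distinguished by the kwarg-aware key.
nested_a = fixed(data, reduce={"model": "PCA", "params": {"whiten": True}})
nested_b = fixed(data, reduce={"model": "PCA", "params": {"whiten": False}})
```

in this sketch the second `naive` call returns the result computed for the first one, which matches the symptom reported above, while the kwarg-aware version recomputes whenever any argument changes.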

jeremymanning commented 3 years ago

let's get rid of caching...too many problems to justify keeping this feature