Closed hagenw closed 1 month ago
@ChristianGeng this is funny.
I try here to test for all the errors we have seen when loading older dependency tables from cache (ModuleNotFoundError under Windows, KeyError under Python 3.8). I created PKL files from the emodb dependency table with different Python and different audb
versions and stored them under tests/assests/dependency-table/
. But funnily, I'm not able to raise any of the observed errors.
@ChristianGeng this is funny. I try here to test for all the errors we have seen when loading older dependency tables from cache (ModuleNotFoundError under Windows, KeyError under Python 3.8). I created PKL files from the emodb dependency table with different Python and different
audb
versions and stored them undertests/assests/dependency-table/
. But funnily, I'm not able to raise any of the observed errors.
I tried to investigate the error that you are getting on windows via google.
I find this thread interesting (despite both come trhough duckdb):
https://github.com/duckdb/duckdb/issues/8735 https://github.com/chroma-core/chroma/issues/1069
It basically says that
None of these however relate this to the windows platform. Still, I could imagine that this is simply some weird incompatibility thing that in the windows case that we have strikes us. I also cannot find any code locations where iaudb would import dtype directly. Also when type hinting I cannot find suspicious code locations.
So I am also very startled what is going on here.
Regarding the code location, I'm very confident that this is the affected code:
So if we simply would catch Exception
we would avoid the error.
Maybe we can change the code to not load the dependency table if a user Interrupts:
try:
deps = Dependencies()
deps.load(cached_deps_file)
except KeyboardInterrupt:
raise
except Exception:
# If loading cached file fails, load again from backend.
# As loading of not compatible pickle files
# results in a variety of possible errors,
# we except all besides keyboard interruption
backend_interface = utils.lookup_backend(name, version)
deps = download_dependencies(backend_interface, name, version, verbose)
# Store as pickle in cache
deps.save(cached_deps_file)
So if we simply would catch
Exception
we would avoid the error. Maybe we can change the code to not load the dependency table if a user Interrupts:
Probably then it will be hard to avoid a "catch-all". It is always a bit worrying ...
At least if there is really another depedency file related error, it should be raised by the other code lines as well.
I think, I'm more worried that audb
fails loading something, and the user has no clue what to do.
At least if there is really another depedency file related error, it should be raised by the other code lines as well.
I think, I'm more worried that
audb
fails loading something, and the user has no clue what to do.
Not sure whether I get it. So you are talking about the situtation when the exception is raised? But then the other code lines themselves would raise as you mention, saying tht this should be raised by the other code lines as well.
Or is your worry about many problems after a release?
But then the other code lines themselves would raise as you mention
No, they would not if the exception is raised due to a broken/non-compatible cache file, as the cache file is overwritten. The error would only be raised then, if it is related to the content of the dependency table, but not due to a broken/non-compatible cache.
As stated in https://github.com/audeering/audinterface/pull/172#issuecomment-2117517305, the actual problem regarding the failed Windows test might not be a backward compatibility issue, but an across platform compatibility issue when using the newest version of audb
on all platforms.
We should try to extend the test here with a cross-platform test using the newest version of audb
to create the cache file.
I ran into both the ModuleNotFoundError
and KeyError
when trying to load a database that already had an existing version stored, and then found this PR.
I think it isn't a cross-platform issue, but a compatibility issue with different pandas
versions.
I tried creating different versions of the db.pkl
from the db.csv
similar to what was added in https://github.com/audeering/audb/pull/413/commits/d5402d6bf814702d7d55a2fc3f416da18f7e007f , using always audb==1.7.2
but alternating the pandas
version. Then I loaded the db.pkl
with pd.read_pickle()
.
This replicated the error:
db.pkl
created with pandas==2.0.3
. Loading with pandas==2.2.2
leads to a ModuleNotFoundError
db.pkl
created with pandas==2.2.1
or pandas==2.2.2
. Loading with pandas==2.0.3
leads to a KeyError
Maybe the problem just manifested for Windows because it happened to install a different pandas
version between tests, but for the other platforms the same pandas
version was used.
Thanks for reporting this.
Would be nice if it is not a cross-platform issue. As then we can solve it by reloading the database and storing again when we experience an error.
I will try to update the tests here to replicate your findings.
Closing in favor of #417 and #418
...