I have the same problem with Python 3.8 and audb versions 1.7.1 and 1.7.2:
Traceback (most recent call last):
File "/home/user/projects/mydb/3.3.0/publish.py", line 11, in <module>
audb.publish(
File "/home/user/projects/mydb/.venv/lib/python3.8/site-packages/audb/core/publish.py", line 698, in publish
raise RuntimeError(
RuntimeError: You want to depend on '3.2.0' of mydb, but the dependency file 'db.parquet' in ./build does not match the dependency file for the requested version in the repository. Did you forgot to call 'audb.load_to(./build, mydb, version='3.2.0') or modified the file manually?
When I try using audb==1.6.5, I get the backend error:
File "/home/user/projects/mydb/.venv/lib/python3.8/site-packages/audbackend/core/backend/base.py", line 407, in exists
raise RuntimeError(backend_not_opened_error)
RuntimeError: Call 'Backend.open()' to establish a connection to the repository first.
Thanks for reporting, I will have a look at it.
If you want to use audb==1.6.5 as a workaround, you need to ensure you use audbackend<2.0.0 as well:
$ pip install "audb==1.6.5"
$ pip install "audbackend<2.0.0"
A minimal example to reproduce this error:
create.py
import audb
import audeer

# Start from an empty build directory
build_dir = "./build"
audeer.rmdir(build_dir)
audeer.mkdir(build_dir)
# Load the metadata of the published version and modify it
db = audb.load_to(build_dir, "emodb", version="1.4.1", only_metadata=True)
db.description = "new"
db.save(build_dir)
publish.py
import audb
import audeer

build_dir = audeer.path("./build")
# Create an empty local file-system repository
repo = "repo"
host = audeer.path("./host")
audeer.rmdir(host)
audeer.mkdir(host, repo)
repository = audb.Repository(repo, host, "file-system")
# Publish the modified database as a new version that depends on 1.4.1
audb.publish(build_dir, "1.5.0", repository, previous_version="1.4.1")
Then we get:
$ python create.py
$ python publish.py
Traceback (most recent call last):
File "/home/hwierstorf/tmp/audb-update-bug/publish.py", line 13, in <module>
audb.publish(build_dir, "1.5.0", repository, previous_version="1.4.1")
File "/home/hwierstorf/.envs/audb-update-bug/lib/python3.10/site-packages/audb/core/publish.py", line 698, in publish
raise RuntimeError(
RuntimeError: You want to depend on '1.4.1' of emodb, but the dependency file 'db.parquet' in /home/hwierstorf/tmp/audb-update-bug/build does not match the dependency file for the requested version in the repository. Did you forgot to call 'audb.load_to(/home/hwierstorf/tmp/audb-update-bug/build, emodb, version='1.4.1') or modified the file manually?
If I delete the cache before trying to publish the new version, it works:
$ rm -rf ~/audb/emodb/1.4.1/
$ python create.py
$ python publish.py
$ tree host/repo/
host/repo/
└── emodb
    └── 1.5.0
        ├── db.parquet
        └── db.yaml
2 directories, 2 files
So, the error seems to be related to https://github.com/audeering/audb/issues/402
The problem arises from our implementation of audb.Dependencies.__eq__(), which compares the dataframes of the dependency tables.
When loading the original dependency table of version 1.4.1, which was stored with audb<1.7, its string dtype is string[python], whereas the new dependency table has string[pyarrow]. This leads to the following result when asserting that the dataframes are equal:
AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="archive") are different
Attribute "dtype" are different
[left]: string[python]
[right]: string[pyarrow]
So, I guess we should update the implementation of audb.Dependencies.__eq__() to ensure backward compatibility.
One solution might be to ignore the dtypes of the dataframes, e.g. changing
return self._df.equals(other._df)
to
return self._df.equals(other._df.astype(self._df.dtypes))
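The effect of this change can be tried out without audb. The following is only a minimal sketch under a few assumptions: the helper name equal_ignoring_dtypes, the one-column table, and the archive names are made up for illustration, and the column check is an addition that is not part of the proposal above (astype() with a dtype mapping raises an error for unknown columns). If pyarrow is not installed, the sketch falls back to the object dtype, which triggers the same kind of dtype mismatch.

```python
import pandas as pd

try:
    import pyarrow  # noqa: F401  (only needed for the string[pyarrow] dtype)

    NEW_DTYPE = "string[pyarrow]"
except ImportError:
    # Stand-in so the sketch still runs without pyarrow
    NEW_DTYPE = "object"


def equal_ignoring_dtypes(left: pd.DataFrame, right: pd.DataFrame) -> bool:
    """Compare two dataframes by content, ignoring column dtypes."""
    # astype() with a dtype mapping raises on unknown columns,
    # so compare the column labels first
    if not left.columns.equals(right.columns):
        return False
    return left.equals(right.astype(left.dtypes))


# Hypothetical one-column stand-ins for the cached and the new dependency table
files = ["db.files-0.zip", "db.files-1.zip"]
old = pd.DataFrame({"archive": files}, dtype="string[python]")
new = pd.DataFrame({"archive": files}, dtype=NEW_DTYPE)

strict = old.equals(new)  # False, the dtypes differ
relaxed = equal_ignoring_dtypes(old, new)  # True, the values match
print(strict, relaxed)
```

Casting only one side to the dtypes of the other keeps the comparison strict about values, index, and column labels, while tolerating the storage backend of the string columns.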
Maybe a warning could be printed saying that the dataframes do not have the same dtypes, probably because the previous version was published with audb<1.7, and then the code ignoring dtypes is used as a fallback.
As the goal is that we should have backward compatibility with existing cache files, I think we don't need to show a warning.
The user cannot directly change the dtypes of the entries in the dependency table anyway, as the file is always created by audb, so the user shouldn't have to worry about it.
I am using Python 3.10 and audb==1.7.2. Switching back to audb==1.6.5, where no db.parquet file is generated for the updated database, made it work. The corresponding database has only misc_table (in case this is relevant).