audeering / audb

Manage audio and video databases
https://audeering.github.io/audb/

Error message misleading for combined media + tables usage #209

Open hagenw opened 2 years ago

hagenw commented 2 years ago

If you request a media file in audb.load() that is not part of one of the selected tables, you get a misleading error message:

Traceback (most recent call last):
  File "audb-bug.py", line 39, in <module>
    audb.load(
  File "/home/audeering.local/hwierstorf/git/audeering/audb/audb/core/load.py", line 913, in load
    requested_media = filter_media(db.files, media, name, version)
  File "/home/audeering.local/hwierstorf/git/audeering/audb/audb/core/load.py", line 705, in filter_media
    raise ValueError(msg)
ValueError: Could not find the media file 'b.wav' in mydb v1.0.0

The error message states that the media file is not part of the database mydb, but the file is in the database. It is just not referenced by the selected table.

There are two possible solutions:


Minimal working example to reproduce the error:

import numpy as np

import audb
import audeer
import audformat
import audiofile

DB_ROOT = audeer.mkdir('~/tmp/db')
REPO_DIR = audeer.mkdir('~/tmp/repo')
NAME = 'mydb'
VERSION = '1.0.0'

# Create database with 2 tables and 2 files
db = audformat.Database(NAME)
db.schemes['column'] = audformat.Scheme('str')
db['a'] = audformat.Table(audformat.filewise_index(['a.wav']))
db['a']['column'] = audformat.Column(scheme_id='column')
db['a']['column'].set(['a'])
db['b'] = audformat.Table(audformat.filewise_index(['b.wav']))
db['b']['column'] = audformat.Column(scheme_id='column')
db['b']['column'].set(['b'])
sampling_rate = 8000
for table in list(db.tables):
    for file in db[table].files:
        audiofile.write(
            audeer.path(DB_ROOT, file),
            np.zeros((1, sampling_rate)),
            sampling_rate,
        )
db.save(DB_ROOT)

# Publish database
repository = audb.Repository('tmp-repo', '.', 'file-system')
audb.config.REPOSITORIES = [repository]
audb.publish(DB_ROOT, VERSION, repository)

# Load database with non-overlapping table and media files
audb.load(
    NAME,
    version=VERSION,
    tables=['a'],
    media=['b.wav'],
    verbose=False,
)

frankenjoe commented 2 years ago

Extending the error message sounds like the proper solution to me.
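
As a rough illustration of what an extended error message could look like, here is a minimal sketch in plain Python. It is not the audb implementation: the function `check_requested_media`, its parameters, and the wording of the new message are all hypothetical and only meant to show how the two cases (file missing from the database vs. file missing from the selected tables) might be told apart.

```python
def check_requested_media(
    requested,
    files_in_tables,
    files_in_db,
    name,
    version,
    tables=None,
):
    """Hypothetical helper, not part of audb.

    requested: media files asked for by the user
    files_in_tables: files referenced by the selected tables
    files_in_db: all files of the database
    """
    selected = set(files_in_tables)
    everything = set(files_in_db)
    for file in requested:
        if file in selected:
            continue
        if file in everything and tables is not None:
            # File exists in the database, but the table selection excludes it
            msg = (
                f"The media file '{file}' is part of {name} v{version}, "
                f"but not of the requested tables {tables}"
            )
        else:
            # File is not in the database at all (message as in the traceback)
            msg = (
                f"Could not find the media file '{file}' "
                f"in {name} v{version}"
            )
        raise ValueError(msg)
    return list(requested)
```

With the values from the example above (tables=['a'], media=['b.wav']), the sketch would raise a ValueError that mentions the requested tables instead of claiming the file is missing from mydb.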