marl / jams

A JSON Annotated Music Specification for Reproducible MIR Research
ISC License

Slow evaluation for beat tracking #175

Closed simondurand closed 7 years ago

simondurand commented 7 years ago

The beat evaluation built into jams seems quite slow compared to more standard techniques. When I run example_eval.py from the jams example documentation, it takes 10 times longer than even a simple baseline. The results are, of course, identical.

example_eval.py:

#!/usr/bin/env python

import sys
import jams

from pprint import pprint

def compare_beats(f_ref, f_est):

    # f_ref contains the reference annotations
    j_ref = jams.load(f_ref, validate=False)

    # f_est contains the estimated annotations
    j_est = jams.load(f_est, validate=False)

    # Get the first reference beats
    beat_ref = j_ref.search(namespace='beat')[0]
    beat_est = j_est.search(namespace='beat')[0]

    # Get the scores
    return jams.eval.beat(beat_ref, beat_est)

if __name__ == '__main__':

    f_ref, f_est = sys.argv[1:]
    scores = compare_beats(f_ref, f_est)

    # Print them out
    pprint(dict(scores))

I used validate=False because the process is even slower without it. The code is then called as example_eval.compare_beats(infile_1, infile_2), where infile_1 and infile_2 are two jams files.

Simple baseline:

import numpy as np
import mir_eval


def read_beat_times(path):
    # Keep the value of every line that starts with the indented "time" key
    prefix = '          "time": '
    with open(path, 'r') as f:
        times = [line[len(prefix):-1] for line in f if line.startswith(prefix)]
    return np.array(times, dtype=float)


beats_pos_1 = read_beat_times(infile_1)
beats_pos_2 = read_beat_times(infile_2)
mir_eval.beat.evaluate(beats_pos_1, beats_pos_2)

I used the same jams files in both cases for comparison purposes. In the second case, though, I could use a more efficient format that would make the evaluation even faster.
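For anyone who wants to reproduce the comparison, here is a minimal timing sketch. The helper name time_call is hypothetical and not part of jams or mir_eval; it only assumes that each approach can be wrapped in a callable. It reports the best wall-clock time over a few repeats, which reduces noise from disk caching and other processes.

```python
import time


def time_call(fn, *args, repeats=5, **kwargs):
    """Return the best wall-clock time in seconds over `repeats` calls of fn.

    `time_call` is a hypothetical helper for illustration only.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical usage, assuming compare_beats from example_eval.py and the
# baseline wrapped in a function named baseline_eval (name is an assumption):
#   t_jams = time_call(compare_beats, infile_1, infile_2)
#   t_base = time_call(baseline_eval, infile_1, infile_2)
#   print("slowdown factor:", t_jams / t_base)
```

Taking the minimum rather than the mean is a common choice for this kind of micro-benchmark, since slower runs usually reflect interference rather than the code itself.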

bmcfee commented 7 years ago

What jams version are you using?

I expect that the added latency is due to forced validation after namespace conversion. We recently merged an update to the validation code that should resolve this as much as possible. That fix is in master now, and will be included in the 0.3.1 release.
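One way to check whether an environment already has the release that carries the fix is to query the installed package metadata. The sketch below uses only the standard library (importlib.metadata, Python 3.8+). The function names are hypothetical, and the version parsing assumes plain dotted numeric versions such as 0.3.0 or 0.3.1.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_tuple(text):
    """Turn '0.3.1' into (0, 3, 1); assumes plain dotted numeric versions."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def has_validation_fix(package="jams", minimum=(0, 3, 1)):
    """True if `package` is installed at `minimum` or newer (hypothetical helper)."""
    found = installed_version(package)
    return found is not None and version_tuple(found) >= minimum


if __name__ == "__main__":
    print("jams version:", installed_version("jams"))
    print("has 0.3.1 validation fix:", has_validation_fix())
```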

simondurand commented 7 years ago

OK. I'm using version 0.3.0.

bmcfee commented 7 years ago

@simondurand 0.3.1 is now up on pypi, please give it a try and let me know if it's still slow.

simondurand commented 7 years ago

It is faster now. Thanks for the push.

bmcfee commented 7 years ago

Thanks for checking. I'll close this one out then.