Yikai-Liao / symusic

A cross platform note level midi decoding library with lightening speed, based on minimidi.
https://yikai-liao.github.io/symusic/
MIT License
108 stars 8 forks source link

Conversion between ticks and seconds #17

Closed ilya16 closed 5 months ago

ilya16 commented 6 months ago

Hi! Many thanks for developing Symusic, a really great addition to the community.

I've been using miditoolkit for MIDI related processing but plan to switch to Symusic as it's really much faster. The only thing I miss now is the conversion between ticks and seconds. I need this for music transcription, alignment and synchronization tasks.

What are your plans for implementing the conversion of event times between ticks/quarters and seconds? https://github.com/Natooz/MidiTok/issues/112#issuecomment-1831297606

In miditoolkit there are get_tick_to_time_mapping and _get_tick_to_second_mapping methods that create an array with indices providing a map between tick positions in MIDI to their time positions in seconds. It's very memory inefficient, but allows you to convert any tick to seconds.

I propose to have in Symusic a function that accepts a time in ticks/quarters and returns the time in seconds. And the reverse function (second -> tick/quarter). Converting the whole score to/from seconds is nice, but being able to map any point in time between time domains is also a needed feature.

Possible implementation: having precomputed times in ticks and seconds for all Score tempos, for any arbitrary tick/second we can find the closest tempo and compute the delta shift using the tempo.

Yikai-Liao commented 6 months ago

Well, we have the plan to support second. As you can see, compared to other liraries, we defined ttype (time unit type) for supporting tick, quarter and second

But to support second, I need to dive into prettymidi's parsing algorithm, because I‘m not quite familiar with such conversion.

And now, our highest-priority goal is to merge the librarify branch. In this branch:

As you can see, we just get lots of things to do. I will work on the support for second after all the above things.

ilya16 commented 6 months ago

@Yikai-Liao no pressure, focus on the things you are already working on first.

Just to add on the performance, the update from v0.2.3 to 0.2.5 made the read operation on my test set of 10,000 MIDIs 20% faster!

Yikai-Liao commented 6 months ago

@Yikai-Liao no pressure, focus on the things you are already working on first.

Just to add on the performance, the update from v0.2.3 to 0.2.5 made the read operation on my test set of 10,000 MIDIs 20% faster!

What kind of test are you doing? Maybe midi parsing is now not the bottle neck for your test, becuase in our own test, midi parsing part got a nearly 3 times speed up (the time cost here includes reading file from disk).

ilya16 commented 6 months ago

Just reading 10,000 different piano-only single-track MIDIs in a for loop:

for midi_path in midi_paths:
    score = Score(midi_path)

Parsing times:

Also tested on mahler.mid using timeit:

Which is indeed 3 times faster. The platform is Linux.

Yikai-Liao commented 6 months ago

I get the case.

I presume that most of your test examples are small files (we basically haven't found any files larger than mahler.mid, which is an order of magnitude larger than the average midi). The performance bottleneck should be in disk reads for small files (this is actually pretty close to 4k random reads and writes)

One possible way to further speed up this kind of situation is to use pickle to create caches. We use zpp_bits as pickle's backend. There are generaly 2 kinds choice.

Here is an example: image

Note that the file size would be a bit larger than midi, here is size the list of 1000 mahler (641KB per midi file) image

Also, it occurs to me that I didn't add a constructor in python that could accept python bytes which exists in our c++ code. So it would be also possible to let the database direclty manage your raw midi data.

But in my opinion, we just don't get that large midi dataset. lmd_full.tar.gz is only 1.65GB, which means it would be easy for a server to load all data in memory.

BTW, the support for pickle also makes it possible to let symusic support python's multiprocessing. All the objects defined in symusic is picklable, which means they can be exchanged among threads and processes. We also make sure that symusic is a thread-safe library.

Yikai-Liao commented 5 months ago

@ilya16 I have started to implement second conversion. Although I met some bugs, I think things would be done soon.

Yikai-Liao commented 5 months ago

@ilya16 I have fixed the bugs I met. Now loading midi with second as time unit is support in the librarify branch. (Convert them back to tick is not implemented now). You could try it by the following steps:

git clone --recursive https://github.com/Yikai-Liao/symusic --branch librarify
pip install ./symusic

make sure you have a cmake env.

Yikai-Liao commented 5 months ago

All the 3 kinds time unit is not fully supported in librarify branch! 🎉

ilya16 commented 5 months ago

@Yikai-Liao thanks! trying it out right now

ilya16 commented 5 months ago

@Yikai-Liao found an inconsistency with times returned by miditoolkit's midi.get_tick_to_time_mapping()

Can't catch the full case, but it seems that for some notes the start time in seconds is calculated using the tempo that comes after the note's start tick.

Here are the plotted differences between the note seconds from miditoolkit and symusic on one MIDI: image

The tempo seconds are all the same in both cases.

ilya16 commented 5 months ago

Also, here's a numpy-based time mapper I wrote to convert between ticks and seconds, adapted to symusic's Score. It returns the same times as miditoolkit's midi.get_tick_to_time_mapping().

import numpy as np
from typing import Union
from symusic import Score

class ScoreTimeMapper:
    def __init__(self, score: Score):
        self.score = score
        self.tempos = self.compute_tempo_times(score)

    @staticmethod
    def compute_tempo_times(score: Score) -> np.ndarray:
        prev_tempo_tick, prev_tempo_time = 0, 0
        scale_factor = 60 / float(score.ticks_per_quarter)
        seconds_per_tick = scale_factor / 120.

        tempo_data = []
        for tempo in score.tempos:
            tempo_time = prev_tempo_time + seconds_per_tick * (tempo.time - prev_tempo_tick)

            seconds_per_tick = scale_factor / tempo.qpm
            tempo_data.append([tempo.qpm, tempo.time, tempo_time, seconds_per_tick])
            prev_tempo_tick, prev_tempo_time = tempo.time, tempo_time

        tempo_data = np.stack(tempo_data, axis=1)
        return tempo_data

    def t2s(self, ticks: Union[int, np.ndarray]) -> Union[float, np.ndarray]:
        ids = np.searchsorted(self.tempos[1], ticks, side="right") - 1
        tempo_ticks, tempo_times, seconds_per_tick = self.tempos[1:4, ids]
        return tempo_times + seconds_per_tick * (ticks - tempo_ticks)

    def s2t(self, seconds: Union[float, np.ndarray]) -> Union[int, np.ndarray]:
        ids = np.searchsorted(self.tempos[2], seconds, side="right") - 1
        tempo_ticks, tempo_times, seconds_per_tick = self.tempos[1:4, ids]
        return (tempo_ticks + (seconds - tempo_times) / seconds_per_tick).astype(int)

score = Score(...)
stm = ScoreTimeMapper(score)
score_note_times = stm.t2s([n.time for n in score.tracks[0].notes])
Yikai-Liao commented 5 months ago

Thanks, I will test it and try to find the bug. After fixing it, we will get the 0.3.0 version

Yikai-Liao commented 5 months ago

Ok, things are fixed. The bug is that I didn't take into account the case where the two tempo's are close to each other. Changing an if condition in the code to a while loop results in the correct answer.

There's also some precision loss from floating point, but I think that's acceptable

[5.60742188e-06 1.78144531e-05 1.78144531e-05 5.60742188e-06
 1.17109375e-05 1.70078125e-05 7.97070311e-06 2.17343750e-05
 1.26972656e-05 1.06718750e-05 5.45507811e-06 4.69843744e-05
 8.03828120e-05 2.49140619e-05 9.15156245e-05 5.98359370e-05
 9.62421871e-05 2.09296870e-05 6.76874995e-05 3.04687433e-06
 4.68906244e-05 1.33437493e-05]
Yikai-Liao commented 5 months ago

19 v0.3.0 is now released, and we add full support for second in this version.

If the bug for second conversion is fixed, I'll close this issue.

ilya16 commented 5 months ago

The times are now correct for the same tested MIDI, but yes, some error is accumulated: image

Yikai-Liao commented 5 months ago

Is it only occured on midi files with dynamic tempos? I think there should be nothing wrong to be accumulated in my code except for the time (in sescond, float32) of tempo events.

ilya16 commented 5 months ago

Yes, it is a MIDI file with tempo changes at beat boundaries. For a single tempo, the differences are of the order of 1e-5

Yikai-Liao commented 5 months ago

Here is the core of the conversion code. I guess the bug could possibly be in 2 places:

    template<template<class> class T>
    [[nodiscard]] vec<T<Second>> time_vec(const vec<T<Tick>> & data) const {
        vec<T<Tick>> origin(data);
        ops::sort_by_time(origin);  // sort it
        vec<T<Second>> ans;
        ans.reserve(origin.size());
        auto t_iter = tempos.begin() + 1;
        i32 pivot_tick = 0;
        f32 pivot_time = 0;
        double cur_factor = static_cast<double>(tempos[0].mspq) / 1000000. / static_cast<double>(tpq);
        for(const auto & event : origin) {
            while(event.time > t_iter->time) {
                pivot_time += static_cast<float>(cur_factor * (t_iter->time - pivot_tick));
                pivot_tick = t_iter->time;
                cur_factor = static_cast<double>(t_iter->mspq) / 1000000. / static_cast<double>(tpq);
                ++t_iter;
            }
            ans.emplace_back(
                pivot_time + static_cast<float>(cur_factor * (event.time - pivot_tick)),
                event
            );
        }
        return ans;
    }
ilya16 commented 5 months ago

Maybe it's the casting from double to float for `pivot_time' and its further accumulation. Anyway, the error is very small (less than ms) to be considered an overall conversion error.

Yikai-Liao commented 5 months ago

I further checked the consistency of the conversion results with pretty midi, but I found them to be quite different.

The maximum error between symusic and ScoreTimeMapper's results is on the order of 1e-5 in all test examples. But in the comparison with pretty midi I could observe even a difference of several seconds.

I don't know whose results are the GROUND TRUTH!

I use the midi files in symusic's repository.

BTW, don't use symusic with python 3.12. There are some strange bugs currently.

And here is the code:

import os

import pretty_midi as pm
import symusic as sm
import numpy as np
from symusic import Score
from typing import Union

class ScoreTimeMapper:
    def __init__(self, score: Score):
        self.score = score
        self.tempos = self.compute_tempo_times(score)

    @staticmethod
    def compute_tempo_times(score: Score) -> np.ndarray:
        prev_tempo_tick, prev_tempo_time = 0, 0
        scale_factor = 60 / float(score.ticks_per_quarter)
        seconds_per_tick = scale_factor / 120.

        tempo_data = []
        for tempo in score.tempos:
            tempo_time = prev_tempo_time + seconds_per_tick * (tempo.time - prev_tempo_tick)

            seconds_per_tick = scale_factor / tempo.qpm
            tempo_data.append([tempo.qpm, tempo.time, tempo_time, seconds_per_tick])
            prev_tempo_tick, prev_tempo_time = tempo.time, tempo_time

        tempo_data = np.stack(tempo_data, axis=1)
        return tempo_data

    def t2s(self, ticks):
        ids = np.searchsorted(self.tempos[1], ticks, side="right") - 1
        tempo_ticks, tempo_times, seconds_per_tick = self.tempos[1:4, ids]
        return tempo_times + seconds_per_tick * (ticks - tempo_ticks)

    def s2t(self, seconds: Union[float, np.ndarray]) -> Union[int, np.ndarray]:
        ids = np.searchsorted(self.tempos[2], seconds, side="right") - 1
        tempo_ticks, tempo_times, seconds_per_tick = self.tempos[1:4, ids]
        return (tempo_ticks + (seconds - tempo_times) / seconds_per_tick).astype(int)

def check(p: str):
    score_t = sm.Score(p)
    score_t.tracks.sort(key=lambda x: (x.program, x.name, x.note_num()), inplace=True)
    score_s = score_t.to("second")
    midi = pm.PrettyMIDI(p)
    midi.instruments.sort(key=lambda x: (x.program, x.name, len(x.notes)))
    assert len(score_t.tracks) == len(midi.instruments)
    delta = []
    m_delta = []
    mapper = ScoreTimeMapper(score_t)
    for s_track, t_track in zip(score_s.tracks, score_t.tracks):
        m_delta.append(s_track.notes.numpy()['time'] - mapper.t2s(t_track.notes.numpy()["time"]))
    m_ans = np.concatenate(m_delta)

    for i, (s_track, p_track) in enumerate(zip(score_s.tracks, midi.instruments)):
        assert s_track.note_num() == len(p_track.notes), f"{p} track {i} note equal"
        for j, (s_note, p_note) in enumerate(zip(s_track.notes, p_track.notes)):
            delta.append(s_note.start - p_note.start)

    delta = np.array(delta)
    print(p)
    print(f"Tempos Num: {len(score_t.tempos)}")
    print("Pretty Midi Delta", delta[delta > 1e-5])
    print("TimeMapper", m_ans[m_ans > 1e-5])

if __name__ == "__main__":
    root = "C:/Users/nhy/code/symusic-libra/tests/testcases/One_track_MIDIs"
    for name in os.listdir(root):
        if name.endswith(".mid"):
            check(os.path.join(root, name))
Yikai-Liao commented 5 months ago

Oh sorry, it's my fault. I forgot to sort the notes. After sorting them, pretty midi could get the same answer as ScoreTimeMapper

Yikai-Liao commented 5 months ago

Changing f32 to f64 works well. If I use f64, there will be no difference with ScoreTimeMapper. But I don't think it's quite worth it. This change would make the size of a Note grow from 12 bytes to 24 bytes。