Closed ilya16 closed 5 months ago
Well, we have the plan to support second. As you can see, compared to other liraries, we defined ttype
(time unit type) for supporting tick
, quarter
and second
But to support second, I need to dive into prettymidi's parsing algorithm, because I‘m not quite familiar with such conversion.
And now, our highest-priority goal is to merge the librarify branch. In this branch:
Score.hpp
into multiple .cpp and .h files, making it more readable and easier to be maintained and developedpybind11
to nanobind
. This could largely reduce the overhead of accessing small attributes, like Note.pitch
struct of array
while track.notes is a array of struct
, which means, note_arr.pitch would be a numpy array. This is great for preprocessing in python utilizing numpy.As you can see, we just get lots of things to do. I will work on the support for second
after all the above things.
@Yikai-Liao no pressure, focus on the things you are already working on first.
Just to add on the performance, the update from v0.2.3 to 0.2.5 made the read operation on my test set of 10,000 MIDIs 20% faster!
@Yikai-Liao no pressure, focus on the things you are already working on first.
Just to add on the performance, the update from v0.2.3 to 0.2.5 made the read operation on my test set of 10,000 MIDIs 20% faster!
What kind of test are you doing? Maybe midi parsing is now not the bottle neck for your test, becuase in our own test, midi parsing part got a nearly 3 times speed up (the time cost here includes reading file from disk).
Just reading 10,000 different piano-only single-track MIDIs in a for loop:
for midi_path in midi_paths:
score = Score(midi_path)
Parsing times:
Also tested on mahler.mid using timeit
:
Which is indeed 3 times faster. The platform is Linux.
I get the case.
I presume that most of your test examples are small files (we basically haven't found any files larger than mahler.mid, which is an order of magnitude larger than the average midi). The performance bottleneck should be in disk reads for small files (this is actually pretty close to 4k random reads and writes)
One possible way to further speed up this kind of situation is to use pickle
to create caches. We use zpp_bits as pickle's backend. There are generaly 2 kinds choice.
Here is an example:
Note that the file size would be a bit larger than midi, here is size the list of 1000 mahler (641KB per midi file)
Also, it occurs to me that I didn't add a constructor in python that could accept python bytes which exists in our c++ code. So it would be also possible to let the database direclty manage your raw midi data.
But in my opinion, we just don't get that large midi dataset. lmd_full.tar.gz is only 1.65GB, which means it would be easy for a server to load all data in memory.
BTW, the support for pickle also makes it possible to let symusic
support python's multiprocessing. All the objects defined in symusic is picklable, which means they can be exchanged among threads and processes. We also make sure that symusic
is a thread-safe
library.
@ilya16 I have started to implement second conversion. Although I met some bugs, I think things would be done soon.
@ilya16 I have fixed the bugs I met. Now loading midi with second as time unit is support in the librarify branch. (Convert them back to tick is not implemented now). You could try it by the following steps:
git clone --recursive https://github.com/Yikai-Liao/symusic --branch librarify
pip install ./symusic
make sure you have a cmake env.
All the 3 kinds time unit is not fully supported in librarify branch! 🎉
@Yikai-Liao thanks! trying it out right now
@Yikai-Liao found an inconsistency with times returned by miditoolkit
's midi.get_tick_to_time_mapping()
Can't catch the full case, but it seems that for some notes the start time in seconds is calculated using the tempo that comes after the note's start tick.
Here are the plotted differences between the note seconds from miditoolkit
and symusic
on one MIDI:
The tempo seconds are all the same in both cases.
Also, here's a numpy
-based time mapper I wrote to convert between ticks and seconds, adapted to symusic
's Score
. It returns the same times as miditoolkit
's midi.get_tick_to_time_mapping()
.
import numpy as np
from typing import Union
from symusic import Score
class ScoreTimeMapper:
def __init__(self, score: Score):
self.score = score
self.tempos = self.compute_tempo_times(score)
@staticmethod
def compute_tempo_times(score: Score) -> np.ndarray:
prev_tempo_tick, prev_tempo_time = 0, 0
scale_factor = 60 / float(score.ticks_per_quarter)
seconds_per_tick = scale_factor / 120.
tempo_data = []
for tempo in score.tempos:
tempo_time = prev_tempo_time + seconds_per_tick * (tempo.time - prev_tempo_tick)
seconds_per_tick = scale_factor / tempo.qpm
tempo_data.append([tempo.qpm, tempo.time, tempo_time, seconds_per_tick])
prev_tempo_tick, prev_tempo_time = tempo.time, tempo_time
tempo_data = np.stack(tempo_data, axis=1)
return tempo_data
def t2s(self, ticks: Union[int, np.ndarray]) -> Union[float, np.ndarray]:
ids = np.searchsorted(self.tempos[1], ticks, side="right") - 1
tempo_ticks, tempo_times, seconds_per_tick = self.tempos[1:4, ids]
return tempo_times + seconds_per_tick * (ticks - tempo_ticks)
def s2t(self, seconds: Union[float, np.ndarray]) -> Union[int, np.ndarray]:
ids = np.searchsorted(self.tempos[2], seconds, side="right") - 1
tempo_ticks, tempo_times, seconds_per_tick = self.tempos[1:4, ids]
return (tempo_ticks + (seconds - tempo_times) / seconds_per_tick).astype(int)
score = Score(...)
stm = ScoreTimeMapper(score)
score_note_times = stm.t2s([n.time for n in score.tracks[0].notes])
Thanks, I will test it and try to find the bug. After fixing it, we will get the 0.3.0 version
Ok, things are fixed. The bug is that I didn't take into account the case where the two tempo's are close to each other. Changing an if condition in the code to a while loop results in the correct answer.
There's also some precision loss from floating point, but I think that's acceptable
[5.60742188e-06 1.78144531e-05 1.78144531e-05 5.60742188e-06
1.17109375e-05 1.70078125e-05 7.97070311e-06 2.17343750e-05
1.26972656e-05 1.06718750e-05 5.45507811e-06 4.69843744e-05
8.03828120e-05 2.49140619e-05 9.15156245e-05 5.98359370e-05
9.62421871e-05 2.09296870e-05 6.76874995e-05 3.04687433e-06
4.68906244e-05 1.33437493e-05]
If the bug for second conversion is fixed, I'll close this issue.
The times are now correct for the same tested MIDI, but yes, some error is accumulated:
Is it only occured on midi files with dynamic tempos? I think there should be nothing wrong to be accumulated in my code except for the time (in sescond, float32) of tempo events.
Yes, it is a MIDI file with tempo changes at beat boundaries. For a single tempo, the differences are of the order of 1e-5
Here is the core of the conversion code. I guess the bug could possibly be in 2 places:
cur_factor
cause a precission loss template<template<class> class T>
[[nodiscard]] vec<T<Second>> time_vec(const vec<T<Tick>> & data) const {
vec<T<Tick>> origin(data);
ops::sort_by_time(origin); // sort it
vec<T<Second>> ans;
ans.reserve(origin.size());
auto t_iter = tempos.begin() + 1;
i32 pivot_tick = 0;
f32 pivot_time = 0;
double cur_factor = static_cast<double>(tempos[0].mspq) / 1000000. / static_cast<double>(tpq);
for(const auto & event : origin) {
while(event.time > t_iter->time) {
pivot_time += static_cast<float>(cur_factor * (t_iter->time - pivot_tick));
pivot_tick = t_iter->time;
cur_factor = static_cast<double>(t_iter->mspq) / 1000000. / static_cast<double>(tpq);
++t_iter;
}
ans.emplace_back(
pivot_time + static_cast<float>(cur_factor * (event.time - pivot_tick)),
event
);
}
return ans;
}
Maybe it's the casting from double to float for `pivot_time' and its further accumulation. Anyway, the error is very small (less than ms) to be considered an overall conversion error.
I further checked the consistency of the conversion results with pretty midi, but I found them to be quite different.
The maximum error between symusic and ScoreTimeMapper's results is on the order of 1e-5 in all test examples. But in the comparison with pretty midi I could observe even a difference of several seconds.
I don't know whose results are the GROUND TRUTH!
I use the midi files in symusic's repository.
BTW, don't use symusic with python 3.12. There are some strange bugs currently.
And here is the code:
import os
import pretty_midi as pm
import symusic as sm
import numpy as np
from symusic import Score
from typing import Union
class ScoreTimeMapper:
def __init__(self, score: Score):
self.score = score
self.tempos = self.compute_tempo_times(score)
@staticmethod
def compute_tempo_times(score: Score) -> np.ndarray:
prev_tempo_tick, prev_tempo_time = 0, 0
scale_factor = 60 / float(score.ticks_per_quarter)
seconds_per_tick = scale_factor / 120.
tempo_data = []
for tempo in score.tempos:
tempo_time = prev_tempo_time + seconds_per_tick * (tempo.time - prev_tempo_tick)
seconds_per_tick = scale_factor / tempo.qpm
tempo_data.append([tempo.qpm, tempo.time, tempo_time, seconds_per_tick])
prev_tempo_tick, prev_tempo_time = tempo.time, tempo_time
tempo_data = np.stack(tempo_data, axis=1)
return tempo_data
def t2s(self, ticks):
ids = np.searchsorted(self.tempos[1], ticks, side="right") - 1
tempo_ticks, tempo_times, seconds_per_tick = self.tempos[1:4, ids]
return tempo_times + seconds_per_tick * (ticks - tempo_ticks)
def s2t(self, seconds: Union[float, np.ndarray]) -> Union[int, np.ndarray]:
ids = np.searchsorted(self.tempos[2], seconds, side="right") - 1
tempo_ticks, tempo_times, seconds_per_tick = self.tempos[1:4, ids]
return (tempo_ticks + (seconds - tempo_times) / seconds_per_tick).astype(int)
def check(p: str):
score_t = sm.Score(p)
score_t.tracks.sort(key=lambda x: (x.program, x.name, x.note_num()), inplace=True)
score_s = score_t.to("second")
midi = pm.PrettyMIDI(p)
midi.instruments.sort(key=lambda x: (x.program, x.name, len(x.notes)))
assert len(score_t.tracks) == len(midi.instruments)
delta = []
m_delta = []
mapper = ScoreTimeMapper(score_t)
for s_track, t_track in zip(score_s.tracks, score_t.tracks):
m_delta.append(s_track.notes.numpy()['time'] - mapper.t2s(t_track.notes.numpy()["time"]))
m_ans = np.concatenate(m_delta)
for i, (s_track, p_track) in enumerate(zip(score_s.tracks, midi.instruments)):
assert s_track.note_num() == len(p_track.notes), f"{p} track {i} note equal"
for j, (s_note, p_note) in enumerate(zip(s_track.notes, p_track.notes)):
delta.append(s_note.start - p_note.start)
delta = np.array(delta)
print(p)
print(f"Tempos Num: {len(score_t.tempos)}")
print("Pretty Midi Delta", delta[delta > 1e-5])
print("TimeMapper", m_ans[m_ans > 1e-5])
if __name__ == "__main__":
root = "C:/Users/nhy/code/symusic-libra/tests/testcases/One_track_MIDIs"
for name in os.listdir(root):
if name.endswith(".mid"):
check(os.path.join(root, name))
Oh sorry, it's my fault. I forgot to sort the notes. After sorting them, pretty midi could get the same answer as ScoreTimeMapper
Changing f32 to f64 works well. If I use f64, there will be no difference with ScoreTimeMapper. But I don't think it's quite worth it. This change would make the size of a Note grow from 12 bytes to 24 bytes。
Hi! Many thanks for developing Symusic, a really great addition to the community.
I've been using miditoolkit for MIDI related processing but plan to switch to Symusic as it's really much faster. The only thing I miss now is the conversion between ticks and seconds. I need this for music transcription, alignment and synchronization tasks.
What are your plans for implementing the conversion of event times between ticks/quarters and seconds? https://github.com/Natooz/MidiTok/issues/112#issuecomment-1831297606
In miditoolkit there are
get_tick_to_time_mapping
and_get_tick_to_second_mapping
methods that create an array with indices providing a map between tick positions in MIDI to their time positions in seconds. It's very memory inefficient, but allows you to convert any tick to seconds.I propose to have in Symusic a function that accepts a time in ticks/quarters and returns the time in seconds. And the reverse function (second -> tick/quarter). Converting the whole score to/from seconds is nice, but being able to map any point in time between time domains is also a needed feature.
Possible implementation: having precomputed times in ticks and seconds for all
Score
tempos, for any arbitrary tick/second we can find the closest tempo and compute the delta shift using the tempo.