Closed rualark closed 5 years ago
Strings, woodwinds and brass instruments can start and end a note smoothly or abruptly. Percussion instruments usually start abruptly, but in some situations can start smoothly (for example, a tremolo crescendo).
With a smooth start, it is acceptable if only part of the note is audible while the rest is covered by other voices. An abrupt start, however, must always be audible when it is the loudest part of the note.
From these rules it follows that the loudest part of each note should be audible. In other words, the loudest part of a note should not be covered by other voices: no other voice should be significantly louder at that moment.
A possible approach is to find the loudest peak of each note, average these peaks within each voice, and compare the average peaks between voices.
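The averaging step could be sketched roughly as follows. This is only an illustration, not the project's actual code: the function name `average_note_peak`, the voice names, and the representation of a note as a list of linear amplitude samples are all assumptions made for the example.

```python
from statistics import mean

def average_note_peak(notes):
    """Return the mean of the loudest absolute sample of each note.

    `notes` is a hypothetical representation: a list of notes, each note
    being a list of linear amplitude samples. Empty notes are skipped.
    """
    peaks = [max(abs(sample) for sample in note) for note in notes if note]
    return mean(peaks)

# Made-up sample data for two voices (assumption, for illustration only).
voices = {
    "violin": [[0.1, 0.4, 0.2], [0.3, 0.6, 0.1]],
    "cello": [[0.2, 0.2], [0.1, 0.4]],
}

# One average peak per voice; these values are what gets compared.
avg = {name: average_note_peak(notes) for name, notes in voices.items()}
```

Here the violin's per-note peaks are 0.4 and 0.6 and the cello's are 0.2 and 0.4, so the violin's average peak comes out higher than the cello's.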
This approach does not consider the placement of peaks in time or their interaction. Their absolute placement is not important, but comparing peaks that are very far apart in time does not make sense. On the other hand, we do not plan to change volume dynamically, because that would correspond to a musician moving around on stage. We need only one static volume, in which case the average peak makes sense.
Each instrument has a perceived loudness. In an ideal mix, each instrument should sit at a particular loudness level. For example, all string instruments should have their average peak at the same level, but a snare drum should have its average peak at -6 dB relative to the strings.
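As a rough sketch of how such target levels could turn into one static gain per instrument: the snippet below assumes average peaks are given as linear amplitudes, converts them to dB with 20*log10, and uses a hypothetical `TARGET_OFFSET_DB` table (only the -6 dB snare offset comes from the text above; the instrument names, reference choice, and input values are made up).

```python
import math

def to_db(amplitude):
    """Convert a positive linear amplitude to decibels."""
    return 20.0 * math.log10(amplitude)

# Hypothetical target table: offset of each instrument's average peak
# relative to the reference instrument, in dB.
TARGET_OFFSET_DB = {"violin": 0.0, "cello": 0.0, "snare": -6.0}

def static_gains_db(avg_peaks, reference="violin"):
    """Return the static gain (dB) that moves each average peak to its target."""
    ref_db = to_db(avg_peaks[reference])
    return {
        name: (ref_db + TARGET_OFFSET_DB[name]) - to_db(peak)
        for name, peak in avg_peaks.items()
    }

# Made-up average peaks (linear amplitude), for illustration only.
gains = static_gains_db({"violin": 0.5, "cello": 0.25, "snare": 0.5})
```

With these inputs the violin needs no change, the cello needs a boost of about 6 dB to match the strings level, and the snare needs a 6 dB cut.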
Detecting only one peak per note risks picking up overshoots that have little musical meaning. Good virtual instruments, however, do not produce such overshoots.
Solution:
This solution is not viable, because it does not take into account that some instruments may play mostly in quiet sections.
That is why the initial ProcessDyn solution seems more accurate. It could probably be fixed by migrating to dB, but this does not seem very important, because the manual solution is more stable and natural.