Legato/Portamento transitions are imperfect, causing slight popping

Yoshimi / yoshimi

A sophisticated soft-synth originally forked from ZynAddSubFX V2.4.0 in 2009 by Alan Calvert, and still in continuous development - This is also mirrored at http://sourceforge.net/projects/yoshimi/ : Current news is at http://sourceforge.net/p/yoshimi/news/ : Our email discussion list is: http://www.freelists.org/list/yoshimi and here is our website

http://yoshimi.github.io/

Other

238 stars 39 forks source link

Legato/Portamento transitions are imperfect, causing slight popping #51

Closed Diomendius closed 4 years ago

Diomendius commented 5 years ago

Transitions between notes in Legato mode do not reliably happen smoothly. Depending on the type of synths used, the number of legato transitions chained together, and in the case of ADDSynth, oscillator phase randomness, transients are sometimes introduced at the exact point of the next note's NoteOn.

It seems to me that some aspects of note state aren't preserved when transitioning to the next note. Inconsistencies in phase at the end of one note and the start of the next are the most obvious culprit, although more complex instruments may have other problems, too.

legato_1 Looking at the waveform of a test sequence of octave jumps using "Simple Sound" set to legato mode, the first transition looks perfect. Some popping can be heard, but only due to the instant transition between frequencies. ADDSynth consistently does well on the first transition (with some exceptions when oscillator phase randomness is used).

legato_2 The second transition (back down an octave) looks nearly as good. The transition appears to happen when one note is at 0° phase, but the amplitude drops slightly for the next two peaks, so the phase is likely slightly mismatched, probably by chance.

legato_3 The next transition up an octave shows the problem clearly. The next note appears to start at nearly opposite phase, the amplitude drops for the next several peaks and a pop is audible, although masked by the sudden change in pitch.

Yoshimi appears to crossfade (or otherwise smooth) one note into the next in legato mode, but this only masks the artifacts slightly.

The problem is more audible with portamento enabled (as the popping is obvious without a sudden change in pitch), but portamento itself does not appear to be the cause of the problem.

portamento_1 The first transition in the same sequence, but with portamento enabled. Completely seamless.

portamento_2 The second transition has an obvious artifact.

portamento_3 This time, the third transition is nearly seamless, although the bottom peak just before the 1.390 marker is slightly shallower.

ADDSynth without any "phase randomness" settings enabled consistently generates smooth transitions, but only for the first consecutive transition. Yoshimi clearly works some magic to make sure the ends line up, but forgets something after the first transition. As soon as the last note is stopped, the next legato transition will be seamless again.

With the phase randomness slider in the oscillator waveform dialog set to the far right (63, maximum randomness, per harmonic) ADDSynth no longer reliably generates smooth transitions even for the first in a sequence. With the phase randomness slider set to the far left (-64, maximum randomness, shared across all harmonics), ADDSynth still generates a perfect transition for the first in a sequence, every time.

SUBSynth and PADSynth do not reliably generate smooth transitions at all. It looks like only the crossfading is done to mask the transition.

I think this could prove to be a very hairy problem to solve across the board, considering how complex instruments can be, but it should be possible. The next note should inherit every time-variant aspect of the previous note's state, including phase/position in the waveform/wavetable, modulator and LFO position.

In fact, it seems the most sensible to me to reuse the existing note object entirely rather than trying to copy over every relevant part of one note into a new one, but maybe I'm missing something due to not being familiar with the codebase.

As a side note, I hope this is the most appropriate place to report this bug. Between here, the Sourceforge tracker and the mailing list, it isn't obvious what the preferred location for bug reports is.

abrolag commented 5 years ago

Indeed a complex issue! There is no copying of note parameters when doing portamento apart from the attack part of the note's envelope. The same note is otherwise just pitch shifted, but with new envelopes. In legato mode the attack part of the envelope is bypassed which is why it sounds much smoother. However, without the attack you can't effectively use legato mode on instruments with a decaying envelope!

It appears to be this attack part that causes the problem as it resets all amplitude, frequency and filter envelopes. However if you increase this to around 10mS it masks the issue as this results in a bit of a fade between notes. Also, it seems the more complex and rich the instrument sound the less noticeable the discontinuity becomes.

There is probably even more going on that I've missed!

Either here or Sourceforge (which seems arrange bug reports a bit better) is fine. If I see something that I think might be relevant for a lot of people I'll often mention it on the mailing list myself.

abrolag commented 5 years ago

Hmm, just realised I was prioritising portamento rather than legato - which is rather different! Incidentally, I've also checked right back to Zyn. 2.2.1, and this has always been the same - which is not encouraging :(

Diomendius commented 5 years ago

Yes, I'm fairly certain the problem is not with portamento; it's just more obvious to the ear that there is a problem when the transition between pitches is gradual.

I've just done some testing and legato transitions perfectly preserve envelope and LFO position, so the problem isn't related to those, nor does anything new need to change to that behaviour.

More interestingly, I tried enabling microtonality with two different scale degrees defined to have the same pitch, and while polyphonic mode lets me overlap the two tones on top of each other and monophonic mode overlaps the release of the first with the attack of the second (so Yoshimi is definitely treating them as separate notes despite being functionally identical), legato mode doesn't have any issues smoothly transitioning from one to the next. It's as if only one note is played.

Notably, this only applies to ADDSynth. PADSynth is not clever enough and a slight pop is audible during the transition. It should be theoretically possible for PADSynth to preserve the current position in its wavetable across legato transitions (assuming that is what is at fault here) which I think would solve the problem.

SUBSynth also exhibits the problem, in addition to the stereo panning/spatialization effect jumping slightly.

I strongly suspect that a discrepancy in phase between notes is the cause. The fact that the problem doesn't occur when "transitioning" between two notes defined to be the same pitch when using ADDSynth suggests to me that the problem is not simply that the next note starts from the beginning of its oscillator(s) regardless of when the previous note ended, but rather that something is preserved across the transition whose meaning changes because the frequency of the new note is different.

It can't be purely in terms of the actual sound frequency or there would be no artifacts when portamento is enabled, but I think the change in the "true", un-bent pitch of the new note has something to do with this.

This is further supported by the way the artifacts "beat" when two notes defined in a custom scale to be a fraction of a semitone apart are toggled between rapidly with legato turned on: rapid_toggle_microtonal_legato.zip

My best guess so far is that some part of the code preserves the sample offset of the playing note but doesn't account for the difference in length of oscillators at different pitches. I'm not sure how that interacts with portamento, but it could be that portamento pitch shifts the new note back to where the old note was (relatively) at some point after the offset has already been calculated. If this is the case I think fixing the legato issue will magically fix the portamento issue as well.

Diomendius commented 5 years ago

Actually, I realise now that it's more likely it's not the oscillator offset that's preserved but rather the rolling count of samples since note-start. So, oscillator offsets and LFO offsets are just calculated modulo the period of the oscillator in question. Envelope position is simply preserved as is because it's relative to the same point in time as the previous note(s).

So, after a lot of rubber-ducking myself, I guess the answer is that currently, aside from some crossfading or something similar, legato basically does just change the base pitch of a note, keeping existing data as is.

Hopefully I'm not way off the mark here, but there seems to be a lot of evidence that at least points to where a likely fix should be aimed.

dsheeler commented 5 years ago

A little while back, in zynaddsubfx, I worked on issues with legato. You might want to take a look at some changes I made that might fix your issues. First was on Sept. 9, 2018: https://github.com/zynaddsubfx/zynaddsubfx/commit/10274e8ad053153f84bfd495c049d5fe53d3d098 to this one on Sept. 28, 2018: https://github.com/zynaddsubfx/zynaddsubfx/commit/a102e9921444ee56a6fa078164f6691265a183dd . @Diomendius, would you be willing to try your tests with a version of zyn with those changes?

abrolag commented 5 years ago

Thanks for this info. The changes look interesting.

dsheeler commented 5 years ago

No problem! Lemme know if you have any questions. I was originally trying to fix pops happening with polyphonic after touch, and then I found several other sources of pops in legato events. There are a sequence of commits between those two above dates you can see in this list: https://github.com/zynaddsubfx/zynaddsubfx/commits?author=dsheeler

Diomendius commented 5 years ago

Just tested with Zyn 3.0.5 and the artifacts are gone when toggling between microtonal frequencies, but there is still an audible pop when transitioning between octaves with portamento enabled.

As before, with ADDSynth only consecutive transitions after the first cause artifacts, for some reason.

Diomendius commented 5 years ago

I've built Yoshimi with this line modified: https://github.com/Yoshimi/yoshimi/blob/c498b357c2cb3487100d15a4fe3e5965bf516e36/src/Synth/ADnote.cpp#L92

This seems to control the length of the crossfade/smoothing between legato notes. Increasing the multiplier to 0.020f hides the popping quite well in all cases (at least with ADDSynth), but looking at the waveform, it just results in more spread out dips in amplitude where artifacts would already occur. It might be a good quick fix, but not otherwise useful or surprising.

Setting it to 0 effectively removes the smoothing entirely, making it obvious that consecutive notes are shifted in phase at the point of transition. There's a hard jump from one waveform to the next, as if a section of audio had been cut out and the ends spliced together.

With this build, it's even more obvious that the first transition (when using ADDSynth) is always seamless. The join is visibly more angular, but the second waveform is spliced in with the correct phase and the audio sounds perfect.

Moreover, I've realised that the strength of the pops when transitioning between two slightly-detuned notes isn't dependent on the time elapsed since the first note; each successive transition ratchets the phase error forward slightly.

Sending MIDI pitch wheel events also changes the note frequency immediately, but unlike with legato there is no phase error at all.

abrolag commented 5 years ago

Yes, I'm aware of that as a possibility, but it's not so straightforward. Apart from anything else, with it is designed to work quickly well within a single buffer. Slow it down too much and you'll run out of time. These days I typically work at 48k/64 frames, and can often work at 32 frames. Also, the reason pitch wheel doesn't have issues is that it averages the frequency changes over multiple buffers, is changing the overall frequency reference (not making note changes), and is comparatively slow.

Finally, as I find time, I'll be slowly working though Daniel's material first, but then I'll look more closely at this. It's on my to-do list, but there are a lot of other things going on with Yoshimi right now, in readiness for September.

Diomendius commented 5 years ago

I'm not so sure pitchwheel events average anything, actually. When I set up a single MIDI pitchwheel event to instantly shift pitch up by the maximum amount, and set that maximum in Yoshimi to be a full octave, the transitions look like this:

pitchwheel_1 pitchwheel_2 pitchwheel_3 pitchwheel_4

The change in pitch sure seems instant to me. Do you mean perhaps that using a physical MIDI controller or an automation curve in a DAW smooths out changes in pitch by virtue of sending a large sequence of MIDI pitchwheel events over time? I don't think Yoshimi or Zyn do any smoothing themselves.

However, I've done some poking around and I think I've narrowed down the problem with ADDSynth:

https://github.com/Yoshimi/yoshimi/blob/c498b357c2cb3487100d15a4fe3e5965bf516e36/src/Synth/ADnote.cpp#L2354-L2371

Changing Legato.param.freq * (Legato.param.freq / Legato.lastfreq) to just Legato.param.freq presumably makes the "CatchUp" legato stage do more or less nothing, or at least less than it did before, but more importantly the phase shifting disappears completely!

I assume this part of the code has or at least had some purpose, but I haven't noticed any obvious glitches with legato with this change. Voices with panning set to random still re-randomize when transitioning to a new note, but that isn't new.

As it turns out, that line hasn't changed (other than whitespace) since it first entered the codebase in this commit back in November 2006 in the Zyn repo:

https://github.com/zynaddsubfx/zynaddsubfx/commit/d1513405b098b4223fe56f3d4d8885efda7f306c#diff-93c405dfda034a2a3206856879bf7b0cR1216

Hopefully someone knows what "CatchUp" actually does, but it apparently has some side-effects.

abrolag commented 5 years ago

Ah! Yes, I thought you were referring to using the pitch wheel control. Interesting, however the other points stand.

I've also been puzzled myself about the catchup function, but never pursued the matter.

abrolag commented 5 years ago

On Fri, 17 May 2019 06:08:36 -0700 Daniel Sheeler notifications@github.com wrote:

No problem! Lemme know if you have any questions. I was originally trying to fix pops happening with polyphonic after touch, and then I found several other sources of pops in legato events. There are a sequence of commits between those two above dates you can see in this list: https://github.com/zynaddsubfx/zynaddsubfx/commits?author=dsheeler

Hi Daniel,

Well I finally had time to look through those entries, and grab a copy of zyn. While I knew there had been plenty of changes, I didn't realise the entire structure is now totally different! The rationale behind the changes might be useful, but the code itself not so much.

My concern about legato generally is that the transition time is far too short. It is a handful of samples, but for a smooth transition it needs to be over a fair number of waveform cycles.

To that end I did some experimenting in Audacity recording first two sinewaves an octave apart, then two complex voices (mastersynth) also an octave apart, then overlapped these as pairs and used the fade in and fade out effect within Audacity to simulate a legato fade.

I was a bit surprised that while the sinewave transition time and overlap point was quite critical, the complex one was far less so.

If you are interested I've saved the project so you could have a look at it - however, even compressed it's 19Meg so I don't know what's the best way to get it to you!

Will.

-- Will J Godfrey http://www.musically.me.uk Say you have a poem and I have a tune. Exchange them and we can both have a poem, a tune, and a song.

Diomendius commented 5 years ago

I agree that the current 5ms transition time isn't long enough to be smooth in all cases. It seems to me like the sort of thing that should have a dial somewhere, since the necessity of a longer transition time seems to depend on the sound in question. It seems a lot like the legato equivalent of attack time, so it makes sense to me that it should be customizable.

As far as the overlap point is concerned, ADDSynth seems to (initially) choose it perfectly, joining X% of the way through one waveform with X% of the way through the next, so for instruments without randomization, filters, etc. the end result is as if a single continuous waveform was simply stretched or compressed at the points where the played note changes.

The "catch-up" feature definitely causes something to go wrong with this when subsequent notes are transitioned to in legato mode.

To my ear, transitions between sine wave pitches only sound "wrong" when the phase of the waves doesn't match at the point of transition (and thus a sharp "cliff" gets introduced before any crossfading is done), even when no transition period is used at all. It could be desirable to have an even smoother transition than this, though, and using a long(er) transition time would achieve this.

There are still reasons an instrument in Zyn/Yoshimi might need a (longer) transition time even when the phase is matched/preserved though:

When a subvoice has its panning set to "random"
When individual harmonics have randomized phase
When "unison" is turned on
When SUBSynth or PADSynth are used

Many of these features could be altered to impact legato transitions less, for instance panning and harmonic phase could either be preserved or interpolated across legato transitions. In the case of interpolation the end result would likely be nearly indistinguishable from a simple crossfade, though.

abrolag commented 5 years ago

From the experimenting I've done so far, phase is only a manageable quantity for the simplest of sounds. Once you get to any sort of complexity the interactions become so confused that even zero crossing is no longer a definable quantity and can change dramatically from one cycle to the next. Also, as I said, I was quite surprised that you don't seem to hear any problems with these sounds.

abrolag commented 4 years ago

Recent updates have possibly changed the behavior of this. Maybe recheck?

Diomendius commented 4 years ago

I've done a quick recompile and test of 113be5601f6b6eda9f58a2a8bb5ff4238dbcb63a and the problem is still present. At a glance, I think 1b3a6742755581f3b14f46ca0f6e91583928eefd might be a necessary component of a complete fix, but it doesn't fix the problem on its own.

Diomendius commented 4 years ago

Alright, I've spent some more time digging at this and now that the requisite Swearing at the Code is done (for now), I bring news of a victory. I now know why this problem occurs, and I even have a fixed reimplementation of legato which, while not the most elegant, is certainly much more straightforward than the current method. I've yet to implement it for SUBnote and PADnote, but the same reasoning applies to them, too.

But first, let's look at why the existing code causes the problem. When legato mode is on, notes are created in pairs, one of which is silent, although the internal logic still runs, so the two notes stay in sync.

When a new note is played while the previous one is held, ADlegatonote() or one of its counterparts is called on both the silent and non-silent note with the same parameters; the new note's base frequency and a few others. This function checks the "silent" flag and sets the silent note to fade in with the new parameters and the non-silent note to fade out, saving the new parameters for later. This way, each time a new note is played before the old note is released, the silent and non-silent notes swap.

When a note finishes fading in, the silent flag and fading status are cleared, and nothing special happens. When a note finishes fading out, it goes into a "catch-up" state where it's meant to run at a frequency which "overshoots" that of its counterpart note, so that all the frequency-dependent parts of the note's persistent state which drifted out of sync while the note was fading out at the "old" frequency end up consistent with those of the other note once the "catch-up" period is over. After the catch-up period, the note is set to the saved parameters and remains silent until the next legato transition, if any.

It's all rather overcomplicated, but the idea is sound. The implementation, however, doesn't work:

https://github.com/Yoshimi/yoshimi/blob/8cc1a9ebc537992fb7a2e91e1f913e13f3bf6f1d/src/Synth/ADnote.cpp#L413-L424

Firstly, this code runs unconditionally, so both the silent and non-silent notes immediately get set to the new parameters. Putting the block of assignments in an if (silent) block fixes this, and makes the assumption that the two notes will drift out of sync valid in the first place.

https://github.com/Yoshimi/yoshimi/blob/8cc1a9ebc537992fb7a2e91e1f913e13f3bf6f1d/src/Synth/ADnote.cpp#L2556-L2584

Here, catchupfreq uses the wrong math. If one wave oscillates 10Hz faster or slower than another, their relative phase makes a complete rotation every 1/10 seconds, regardless of what the absolute frequencies are. This is what acoustic beating is. So, the correct math is catchupfreq = Legato.param.freq + (Legato.param.freq - Legato.lastfreq).

However, this doesn't work either for another reason:

https://github.com/Yoshimi/yoshimi/blob/8cc1a9ebc537992fb7a2e91e1f913e13f3bf6f1d/src/Synth/ADnote.cpp#L2514-L2535

As the comment suggests, the loop is pointless, although compilers can and will optimize it out of existence. The loop obscures a bigger problem though; nothing happens between iterations, so the only time the note's frequency gets used is after the entire loop finishes. If Legato.fade.length were an exact multiple of synth->sent_buffersize this wouldn't be a problem, but 5ms worth of samples is basically guaranteed not to be.

So, because this loop runs in its entirety before the synth engine runs, the actual number of samples the catch-up frequency is used for is rounded down to the nearest multiple of synth->sent_buffersize.

If these three bugs are fixed, legato should finally work as intended. But I couldn't help but think how pointlessly overengineered the whole thing is. The whole problem hinges on the need to have two note objects exactly the same as each other. So why not just copy the relevant state from one note to the other?

As it turns out, there's no reason why not. It took a lot of time and frustration to piece together the unfortunately necessary boilerplate, but once it's possible to copy-construct and/or copy-assign notes, implementing legato is much simpler. Much of the boilerplate could be avoided by replacing the C With Classes style pointer soup with RAII idioms, so instead of manually implementing copy constructors for massive classes like ADnote, the implicit copy constructor/assignment operator will work fine, rather than leading to double-frees all over the place.

Anyway, what I have locally works great for ADnote, and I plan on submitting a pull request once I've got the new method implemented for SUBnote and PADnote, however long that takes.

kramlie commented 4 years ago

On 13.03.2020 16:07, Iain King-Speir wrote:

As it turns out, there's no reason why not. It took a lot of time and frustration to piece together the unfortunately necessary boilerplate, but once it's possible to copy-construct and/or copy-assign notes, implementing legato is much simpler. Much of the boilerplate could be avoided by replacing the C With Classes style pointer soup with RAII idioms, so instead of manually implementing copy constructors for massive classes like |ADnote|, the implicit copy constructor/assignment operator will work fine, rather than leading to double-frees all over the place.

This was a very interesting read, and I'm very much in favor of this! I was actually having the same idea just recently, although for a slightly different reason: When changing parameters while playing a note (with MIDI learn), most changes will cause a rather sudden change in the sound. Some parameters, such as filters, take it into account and interpolate between the two settings, but most do not.

If we had a clean way to copy a note instance, then we could play both notes from that exact point, with the two different settings, and crossfade between them. In a sense, legato is exactly the same, only triggered by notes instead of CC messages.

But it is likely quite a bit of work, and due to many components of an ADnote being conditional, a true RAII copy constructor isn't entirely trivial.

Anyway, what I have locally works great for |ADnote|, and I plan on submitting a pull request once I've got the new method implemented for |SUBnote| and |PADnote|, however long that takes.

Looking forward to seeing how you solved it!

-- Kristian

abrolag commented 4 years ago

Yes. Some very impressive sleuthing here, and I'm keen to see this in action. If I understand the code correctly PAD note is effectively running wavetables with the envelopes/filters sitting on top, so doesn't have the same dynamic complexity. No idae of the implications for SUBnote. Incidentally, interpolation appears to be a major issue at the moment - I'm also hitting it with my experiments with Poly aftertouch. P.S. I had to read up on RAII - my C++ knowledge leaves quite a bit to be desired :(