hydrogen-music / hydrogen

The advanced drum machine for Linux, macOS, and Windows
http://www.hydrogen-music.org
GNU General Public License v2.0
1.03k stars 172 forks source link

Song exported from hydrogen has glitches / does not maintain a consistent tempo #857

Closed lukesanantonio closed 4 years ago

lukesanantonio commented 4 years ago

Hydrogen version * : Hydrogen 1.0.0-'dc269b98' (just built from master earlier today) Operating system + version : Ubuntu 20.04 Audio driver + version : Jack 1.9.12, but also tried pulseaudio, alsa drivers


Hello! Just wanted to say thank you for an amazing piece of software!!

I'm having an issue where a song exported from hydrogen does not maintain a consistent tempo, I've been having this issue since I upgraded to 1.0.0-beta2, but the issue still seems present in master (which I compiled earlier today).

Here's how I reproduce it:

Open a new hydrogen project, use bpm 120, add beats on every count, 1,2,3,4, etc. Set this pattern to go for 32 bars in the song. Export song as 44800 Hz wav, linear interpolation. (I've also tried ogg, flac, etc, but the issue persists)

Open audacity, import the hydrogen WAV, hit generate -> rhythm track. I'm using tempo 120 bpm, 4 beats per bar, 32 bars, which sould match the recording exactly.

Okay, so then I play the two files back. They start perfectly in sync, but every 10 bars or so the hydrogen-exported track has a strange, audible glitch which puts a slight delay in the track, resulting in out of sync beats!

I've been reading through some other issues, but can't find any that only apply to exporting the song. And I've used a temporary workaround where I click play in hydrogen at just the right time to sync with LMMS. When I'm playing the pattern directly inside hydrogen it never gets out of sync.

I also noticed these messages in the console when I run hydrogen with -Vdebug, they happen for every filetype (ogg, flac, wav) that I've tried exporting!

(I) ::void* H2Core::diskWriterDriver_thread(void*) DiskWriterDriver thread start
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 768, new size = 1024
(I) ::int H2Core::audioEngine_process(uint32_t, void*) Buffer size changed. Old size = 1024, new size = 768
(I) ::void* H2Core::diskWriterDriver_thread(void*) DiskWriterDriver thread end
(I) SongWriter::writeSong Saving song autosave.h2song
(I) SongWriter::writeSong Save was successful.

trebmuh commented 4 years ago

Related (and possibly duplicate #852).

SMLiberator commented 4 years ago

Also having the exact same problem after upgrade to 1.0.0-beta2, wasn't there in any previous versions. Running Ubuntu 18.04. There isn't any unusual clipping or distortion to the exported tracks other than very short pieces of silence were they shouldn't be, and waveforms continue as expected after the glitch.

mauser commented 4 years ago

Hi!

I can confirm that there are some general issues with the song export in the current master branch. Beside the problems which are mentioned in this issue, it seems that the UI does not behave properly when a template is selected (the current sample rate / bit depth is not getting updated).

mauser commented 4 years ago

The UI problems are now fixed.

About the behavior which is described in this issue: I wrote a quick test track, similar to your pattern, but with alternating kick/snare beats and exported it with 1.0-beta1 and 1.0-beta2. Then, both files were imported into a DAW. Up to 9.4 seconds, everything works fine. Then, my kick is beeing delayed by what seems to be one eigth note, so i get a "kick kick snare snare" pattern where there was "kick snare" before, when playing both tracks at the same time. This is the relevant part as seen in a DAW:

HydroExportDebug

I will have a deeper look at this..

lukesanantonio commented 4 years ago

Thanks @mauser let me know if I can provide any additional debugging information!

cme commented 4 years ago

Just in case it's worth anything @mauser , I had a look into this earlier and believe I have at least one potential cause, but didn't get as far as a fix. The routine audioEngine_process() tries to acquire the audioengine lock, and if it fails, it will silently return without filling the buffer or advancing the song position. The IO diskwriter will then repeat a buffer until audioEngine_process() successfully acquires the lock and continues, effectively delaying the output by a buffer's duration each time.

That behaviour was introduced as a fix for a deadlock issue and should be vanishingly rare in playback, but frequent in exporting to disk. I didn't look any further than that.

mauser commented 4 years ago

@cme : Great, that saved me a lot of work :) I just had some minutes today to look into the issue, but it seems that the lock is being aquired by PatternEditorRuler::updateEditor. I did a quick fix locally and let the audio engine try for three times to aquire the lock (when an export session is active..), which seemed to solve the export problem for me. It seems that in this situation, the PatternEditorRuler and the DiskWriterDriver are both polling the lock at a quite high frequency, which makes it problematic.

At the moment i do not see an option to remove the lock from the PatternEditorRuler. Throttling the DiskWriterDriver could also help, but it this will be unreliable, since we can't guarantee the stability. Maybe there could be a new return value for the audioEngine_process Method which signals the caller that the processing has been canceled before the data has been processed. If the caller (in this case only the DiskWriterDriver) sees this return value, the audioEngine_process can be called again with the same buffer.

cme commented 4 years ago

A return value from audioEngine_process and retry was one of my first thoughts (or an extra parameter to allow the caller to decide whether to block for the lock or return on failure).

Does the same possibility of failure to acquire the lock exist during playback to an audio device? It will be lower probability but surely possible given the GUI updates during playback also?

elpescado commented 4 years ago

I gave a quick glance over Hydrogen sources and I noticed that AudioEngine_process tries to acquire lock and gives up instantly if that's not possible. Maybe some timeout would be sufficient? Hydrogen uses pthreads under the hood, so maybe using pthread_mutex_timedlock might help?

elpescado commented 4 years ago

BTW. I have gut feeling that this mutex is acquired in too many places, but I have no science to back that up.

cme commented 4 years ago

I gave a quick glance over Hydrogen sources and I noticed that AudioEngine_process tries to acquire lock and gives up instantly if that's not possible. Maybe some timeout would be sufficient?

This seems a good idea as it allows more graceful behaviour in the case of live playback. The timeout should probably be a parameter to the callback though as it has different constraints for its different uses: for audio driver playback, an acceptable time would be proportional to the buffer length, but for the disk writer an arbitrarily long time would be acceptable.

It might be worth looking again at the original deadlock that needed this fix to see if it's really the right fix?

theGreatWhiteShark commented 4 years ago

I gave a quick glance over Hydrogen sources and I noticed that AudioEngine_process tries to acquire lock and gives up instantly if that's not possible. Maybe some timeout would be sufficient?

the audioEngine_process function only has a very limited amount of time available to perform its actions. Otherwise an XRUN will be produced. Having a timeout might be smoothly working on modern computers but, on the other hand, cause trouble on older or embedded devices. But just a thought and no measurements backing this up.

But also a general remark. As far as I understand the problem occurs only when exporting a song to audio file. And it only occurs because there is both the DiskWriterDriver and the GUI accessing the patterns. Why not dropping the fancy animation-like behavior of the GUI during export? The progress bar does already do fine job displaying the progress. A check for the driver in PatternEditorRuler::updateEditor and deciding whether or not to update the GUI might be a more resilient fix.

elpescado commented 4 years ago

That's correct. But don't we precisely know how much time is available (buffer size / sample rate), so that timeout can be adjusted accordingly? Critical section in PatternEditorRuler is not doing that much work:

    AudioEngine::get_instance()->lock( RIGHT_HERE );

    PatternList *pList = pEngine->getCurrentPatternList();
    for (uint i = 0; i < pList->size(); i++) {
        if ( m_pPattern == pList->get(i) ) {
            bActive = true;
            break;
        }
    }

    AudioEngine::get_instance()->unlock();

Why not dropping the fancy animation-like behavior of the GUI during export?

Are we sure that this might not happen outside export session?

But anyway, I think root cause is too much locking going on. Maybe ruler should be driven by events pushed to event queue instead of interacting with AudioEngine directly?

elpescado commented 4 years ago

BTW: Here's issue that triggered this issue: Deadlock in ALSA driver. It seems it was reported by me;)

cme commented 4 years ago

That's correct. But don't we precisely know how much time is available (buffer size / sample rate), so that timeout can be adjusted accordingly? Critical section in PatternEditorRuler is not doing that much work:

Absolutely agree here. Failure to acquire that lock should be only momentary. Also it's worth considering the failure modes: missing a buffer due to waiting on a lock is no worse than missing a buffer due to abandoning any attempt to get the lock. In either case, the song's position fails to move on and there will be an audible glitch. What good is a drum machine if it can't keep time? ;)

Why not dropping the fancy animation-like behavior of the GUI during export?

Are we sure that this might not happen outside export session?

I'm pretty sure it can, as the position indicator is still updated in playback. It's unlikely but it's possible so should be considered, especially since adding a time to lock acquisition will almost entirely solve the issue for the playback case.

theGreatWhiteShark commented 4 years ago

That's correct. But don't we precisely know how much time is available (buffer size / sample rate), so that timeout can be adjusted accordingly? Critical section in PatternEditorRuler is not doing that much work:

Sure, but we don't know how much time is required to complete a process cycle. When e.g. adding LADSPA effects this time increases quite a lot. Also, because there is just one central lock, there might be other parts of the GUI trying to acquire this lock as well. But don't get me wrong, I like the idea of a short timeout.

Are we sure that this might not happen outside export session?

I'm pretty sure it can, as the position indicator is still updated in playback. It's unlikely but it's possible so should be considered, especially since adding a time to lock acquisition will almost entirely solve the issue for the playback case.

I think so too.

Just for reference: the lock in PatternEditorRuler was introduced about a year ago. https://github.com/hydrogen-music/hydrogen/issues/714

Also it's worth considering the failure modes: missing a buffer due to waiting on a lock is no worse than missing a buffer due to abandoning any attempt to get the lock. In either case, the song's position fails to move on and there will be an audible glitch. What good is a drum machine if it can't keep time? ;)

In Linux the most common audio driver is JACK, which also includes transport control via a centralized server. Using this driver Hydrogen would still be properly synchronized with all other clients. Also the core concept is that you have a buffer containing audio data, which you move from one application to another. This linkage allows you to build quite powerful pipelines from rather simple UNIX-like programs. But you assume that they will all be done in a certain amount of time. Haven't read too much about this topic but it may have worse effects to not deliver a buffer in time than to deliver a glitch.

cme commented 4 years ago

That's correct. But don't we precisely know how much time is available (buffer size / sample rate), so that timeout can be adjusted accordingly? Critical section in PatternEditorRuler is not doing that much work:

Sure, but we don't know how much time is required to complete a process cycle. When e.g. adding LADSPA effects this time increases quite a lot.

The real time taken to process a buffer's worth of samples is already tracked in m_fProcessTime. At the start of each buffer's processing, it will contain the time taken to process the previous buffer. Granted this is going to change if the buffer size changes, effects and samples change etc, and will be subject to CPU scheduling noise, but it seems a reasonable first approximation to use, with some appropriate pessimistic factor, such as

double fSlackTime = ( m_fMaxProcessTime - m_fProcessTime * 2.0 ); // allow 2x last process time int nLockTimeout = (int) fSlackTime / 2;

(also hoisting the calculation of nLockTimeout to the start of the function).

Also, because there is just one central lock, there might be other parts of the GUI trying to acquire this lock as well. But don't get me wrong, I like the idea of a short timeout.

Yes, other parts update based on display and user interaction (selecting different patterns, editing) that can happen while playback is running.

If I have time in the next few days I'll implement and test this as well as profiling how long locks are held by the GUI components on a few machines (including Raspi and linux VMs, which should be interesting!)

mauser commented 4 years ago

Wow, that's definitely a constructive discussion :) Let's sum it up!

We have three major proposals:

  1. Use a timeout in audioEngine_process() when aquiring the lock

    • How to choose a correct/useful timeout?
    • Would also mitigate problems which occur during normal playback
  2. Let audioEngine_process()communicate the problem to DiskwriterDriver, so that it can re-run audioEngine_process() until it is successful

    • Solves the Export problem completely
    • Does not help with lock aquisition during normal playback
  3. Deactivate the UI code which tries do aquire the lock at a higher frequency

    • PatternEditorRuler::updateEditor and SongEditorPatternList::createBackground have to be touched
    • Does not help with lock aquisition during normal playback
mauser commented 4 years ago

In my opinion, all of those parts make sense in their combination. If (for example) only 1. is implemented, there is no guarantee that Export problem won't occur again, if we get the timing wrong. It may also occur that aquiring the lock might fail again after the timeout. Combining 1. and 3. can minimize the probability that the exported file has problems. But i feel that this might be still to weak, since maybe someone else writes a new UI functionality next year which also aquires the lock and the problem will be there again.

The combination with (2) makes the Export use case fail-safe (in my opinion, if i don't have missed a situation).

I would propose the following: Implementing (2) and (3) immediately, since this fixes the Export problem. (3) is not really necessary, but i see no risk there.. I could care for the implementation, as i have done this already locally. This would also give us the change to release rc1 or beta3 shortly (i.e. in the next days).

(1) gives me more headaches :) I have an uneasy feeling about the determination of the timeout. Wouldn't we make things even worse when the timeout is too long? A deeper look at this would be necessary, as @cme already mentioned. But it is also clear that this solutions is the most "powerful".

mauser commented 4 years ago

BTW. I have gut feeling that this mutex is acquired in too many places, but I have no science to back that up.

:smiley:

Honestly, i don't know. But doing a code review of all places where the lock is aquired should be possible. At least for the two occurances in this issue, i do not see a way without the lock (at least without touching the core code).

elpescado commented 4 years ago

Wouldn't we make things even worse when the timeout is too long?

I think it couldn't be worse. When timeout is too long and it times out, that means that the mutex was locked by someone else all the time, so there would be xrun anyway. Or am I missing something?

I have to think about this, but it might mean that timeout length is not really relevant, so we might just pick some arbitrary value. Whether we wait loo long or we time out early, we're gonna get missed frame. The real issue is that mutex is held by other parts too long.

elpescado commented 4 years ago

I might be answering my own question;) Suppose that UI locks AudioEngine mutex so that current audio frame couldn't be processed in time. There's glitch and we can't do anything about it. But what happens with the next frame? If we time out early enough, will the next frame be played correctly? If we time out for too long wouldn't we "clog" audio thread so that it misses following frames and/or looses timing?

mauser commented 4 years ago

@lukesanantonio :

The fix is now available on the following branch: https://github.com/hydrogen-music/hydrogen/tree/ExportGlitchFixes . Could try to build this branch and see if the problem is gone?

cme commented 4 years ago

@mauser I'm also working on timed locks (and have some profiling data to share) which will help the potential playback cases. I'll share and pull request shortly if you don't mind.

lukesanantonio commented 4 years ago

@mauser I'm able to export without issues!! I did a side by side comparison of 1.0.0-beta1, -beta2 and this branch, and beta1 stayed perfectly in sync with the ExportGlitchFixes branch, while -beta2 got out of sync quickly, as expected.

cme commented 4 years ago

Some data at https://github.com/cme/hydrogen/blob/timed_locks_profiling/profiling/report.md and a pull request at #878

elpescado commented 4 years ago

Hi, good job with profiling! I'm curious, what would be results for smaller buffer sizes, like 128 samples. 1024 samples seems high setting.

cme commented 4 years ago

Hi, good job with profiling! I'm curious, what would be results for smaller buffer sizes, like 128 samples. 1024 samples seems high setting.

Funnily enough, that's exactly what I was trying to do when I found #879 and #881. The Raspi with ALSA only seems to support a minimum period of 444 frames, where things are much the same. Smaller period/buffer sizes mean it's more likely to hit the wait case, but also the wait times are still very small so even reducing the allowable wait time in line with the smaller available CPU time for each period/buffer seems to allow plenty of time.

Would also be interesting to profile lock instances independently of this to see where the worst cases are.

mauser commented 4 years ago

Hi!

I'm closing this issue now, since the core of the problem is fixed and the pull requests for the locks are also already there.