Closed: larsoner closed this 3 years ago
Merging #419 (8baf53a) into main (d971681) will increase coverage by 1.35%. The diff coverage is 78.78%.
```diff
@@            Coverage Diff             @@
##             main     #419      +/-   ##
==========================================
+ Coverage   86.85%   88.20%   +1.35%
==========================================
  Files          50       50
  Lines        6673     6683      +10
  Branches     1097     1099       +2
==========================================
+ Hits         5796     5895      +99
+ Misses        621      530      -91
- Partials      256      258       +2
```
We've used this in a couple of experiments now. It seems to be working very well.

We are having a couple of weird audio issues that started around the time we began using this, but we don't think they are related:

- While waiting for a trial to end using `wait_secs`, it is usually fine, but every now and then there is a very long delay (up to 10 s) after playing a 1 s stimulus.
- Rarely, the audio quality will start to sound stuttery (including trigger dropouts), in a way that feels like missed buffers. But the log file shows no missed buffers, and the RME drivers seem to think there haven't been any either. We are not using a particularly aggressive buffer size (96 samples at 48 kHz).

Both of these problems are very intermittent and probably not related to this branch, but they seem worth mentioning since they are happening in scripts using it.
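One cheap way to catch those long delays in situ is to wrap whatever waiter is under suspicion and log any overshoot past the requested duration. This is a hypothetical helper, not part of expyfun; `warn_ms` is an arbitrary threshold:

```python
import time


def timed_wait(wait_fn, secs, warn_ms=50.0):
    """Call wait_fn(secs) and report how far past the deadline it ran.

    wait_fn is the waiter under suspicion (e.g., ec.wait_secs or
    time.sleep); warn_ms is an arbitrary threshold for flagging.
    Returns the overshoot in milliseconds.
    """
    t0 = time.perf_counter()
    wait_fn(secs)
    overshoot_ms = (time.perf_counter() - t0 - secs) * 1000.0
    if overshoot_ms > warn_ms:
        print(f'wait of {secs} s overshot by {overshoot_ms:0.1f} ms')
    return overshoot_ms
```

Logging the overshoot on every trial would at least tell you which trials hit the 10 s stall and whether it correlates with anything else in the log.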
The `wait_secs` issue sounds like the Python interpreter is getting caught in some other thread and cannot return. The stuttery sound is weird, since I think `python-rtmixer` does not need the GIL, so its thread should have no problem pumping the NumPy data into the sound card buffers on time.
I don't know where we would start to debug these, which is not great since they seem like pressing problems. Maybe we can make a minimal example using plain `python-rtmixer` and open an issue there -- but if it only happens sometimes, and only after a long time, that's not a great test case.

Maybe a first attempt at an MWE is to absolutely kill your CPU with some other processes (is SETI@home still a thing?) while running an experiment, to see if it replicates easily; then we can work on whittling it down...
Pegging the CPU with other stuff is a good idea. It just doesn't make much sense: we give our process high priority in Windows, and CPU and RAM are both not even close to fully utilized when this happens, though it could be a single core getting hit for some reason.
Using `ec.wait_secs`. We are going to try it with the `time.sleep` commented out. It sure doesn't seem like that would matter, but if the problems we're seeing are both from `wait_secs` -- which would make sense, since that is what we are using to wait while a stimulus plays -- then the addition of that `sleep` call is the only thing that changed.
@rkmaddox and I are pretty confident the issues we mentioned previously are not from changes in this branch. We've reverted the changes to `wait_secs` and the issues persisted. The stuttering issue typically happens at most once per experiment (experiments are 2.5 hours long).

Bombing the CPU didn't seem to do anything, although I'm not sure of the best way to test it. I ran the batch file below a bunch of times simultaneously (at least as many times as there are cores), but the CPU usage always hovered around 90%. Setting the Python script to below-normal priority made no difference.
```bat
@echo off
:loop
goto loop
```
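The same stress test can also be done cross-platform from Python. A rough sketch (worker count and duration are arbitrary choices, not anything from expyfun) that pins every logical core with a busy loop:

```python
import multiprocessing as mp
import os
import time


def burn(seconds):
    """Spin in a tight loop for roughly `seconds` to keep one core busy."""
    end = time.perf_counter() + seconds
    count = 0
    while time.perf_counter() < end:
        count += 1  # pointless work to occupy the core
    return count


def peg_all_cores(seconds=60.0):
    """Run one busy-loop process per logical core for `seconds`."""
    procs = [mp.Process(target=burn, args=(seconds,))
             for _ in range(os.cpu_count() or 1)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

Calling `peg_all_cores()` from a separate terminal while the experiment runs would stress all cores at once, which a single-threaded batch file started fewer times than there are cores cannot guarantee.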
One more thing I can think of: at some point where you're going to wait anyway (the ISI, maybe?), try adding:
```python
import gc, time

t0 = time.time()
gc.collect()
print(f'Garbage collected in {(time.time() - t0) * 1000:0.1f} ms')
```
This will force garbage collection. Python garbage collects at unpredictable times, and maybe at some point you're getting hit by a collection that takes a really long time for some reason. I have no idea why this would hang -- and `python-rtmixer` shouldn't break when this happens anyway -- but it's worth a shot.
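If GC pauses do turn out to be the culprit, one standard workaround (plain Python, nothing expyfun-specific) is to disable automatic collection around the time-critical part of the trial and collect manually during the ISI, when a pause is harmless:

```python
import gc

gc.disable()  # suppress automatic collections during time-critical code
try:
    pass  # ... present stimulus, wait for response, etc. ...
finally:
    gc.enable()

# Later, during the ISI, when a pause is harmless:
n = gc.collect()
print(f'collected {n} objects during ISI')
```

Note that `gc.disable()` only stops the cyclic collector; reference-counted objects are still freed immediately, so memory use should stay bounded as long as `gc.collect()` runs periodically.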
Also, have you checked the logs to see when the hangup occurs? In particular, I would use `read_tab` and look at the info there to figure out when there was a problem, and once you have that, manually look at the `.tab` and `.log` files to see where the slowdown is. If it's not clear how to do this, upload a `.tab`/`.log` pair from a problematic experiment and I can look and see if I can find anything.
Ultimately, though, I think this is probably some `python-rtmixer` bug. I wonder if we're hitting some variant of https://github.com/spatialaudio/python-rtmixer/issues/9#issuecomment-489175489. But the problem is that it's not deterministic and takes a long time to hit, so it's terrible to debug!

... in the meantime I'm inclined to merge this as-is. I'll close and reopen to make sure the CIs are still happy.
Thanks @larsoner
- Adds a `SOUND_CARD_TRIGGER_ID_AFTER_ONSET` config var to allow stamping the TTL IDs after the 1 trigger so that a single sound can be played. Should help with rtmixer/portaudio fragility when sounds are stopped and started in close proximity.
- Changes `stamp_triggers` for the sound controller to return as quickly as possible when `wait_for_last=False`. We had made it wait until all but the last trigger was stamped to be consistent with TDT and parallel triggers. It seems better not to do this if we don't need to -- it decreases the uniformity of our approach across platforms, but makes the sound card platform (probably the way of the future) work better.
- Fixes `wait_secs` when not used in an `ec` context. Probably doesn't matter, but I noticed this bug was introduced in #399.