Closed: larsoner closed this 3 years ago
Merging #419 (8baf53a) into main (d971681) will increase coverage by 1.35%. The diff coverage is 78.78%.
```diff
@@            Coverage Diff             @@
##             main     #419      +/-   ##
==========================================
+ Coverage   86.85%   88.20%   +1.35%
==========================================
  Files          50       50
  Lines        6673     6683      +10
  Branches     1097     1099       +2
==========================================
+ Hits         5796     5895      +99
+ Misses        621      530      -91
- Partials      256      258       +2
```
We've used this in a couple of experiments now. It seems to be working very well.

We are having a couple of weird audio issues that started around the time we began using this, but we don't think they are related:

- While waiting for a trial to end using `wait_secs`, it is usually fine, but every now and then there is a very long delay (up to 10 s) after playing a 1 s stimulus.
- Rarely, the audio quality will start to sound stuttery (including trigger dropouts), in a way that feels like missed buffers. But the log file shows no missed buffers, and the RME drivers seem to think there haven't been any either. We are not using a particularly aggressive buffer size (96 samples at 48 kHz).

Both of these problems are very intermittent and probably not related to this branch, but they seem worth mentioning since they are happening in scripts using it.
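One cheap way to catch those long delays in situ is to wrap whatever waiter is under suspicion and log any overshoot past the requested duration. This is a hypothetical helper, not part of expyfun; `warn_ms` is an arbitrary threshold:

```python
import time


def timed_wait(wait_fn, secs, warn_ms=50.0):
    """Call wait_fn(secs) and report how far past the deadline it ran.

    wait_fn is the waiter under suspicion (e.g., ec.wait_secs or
    time.sleep); warn_ms is an arbitrary threshold for flagging.
    Returns the overshoot in milliseconds.
    """
    t0 = time.perf_counter()
    wait_fn(secs)
    overshoot_ms = (time.perf_counter() - t0 - secs) * 1000.0
    if overshoot_ms > warn_ms:
        print(f'wait of {secs} s overshot by {overshoot_ms:0.1f} ms')
    return overshoot_ms
```

Logging the overshoot on every trial would at least tell you which trials hit the 10 s stall and whether it correlates with anything else in the log.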
The `wait_secs` issue sounds like the Python interpreter is getting caught in some other thread and cannot return. The stuttery sound is weird, since I think `python-rtmixer` does not need the GIL, so its thread should have no problem pumping the NumPy data into the sound card buffers on time.
I don't know where we would start to debug these, which is not great since they seem like pressing problems. Maybe we can make a minimal example using plain `python-rtmixer` and open an issue there -- but if it only happens sometimes, and only after a long time, that's not a great test case.

Maybe a first attempt at an MWE is to absolutely kill your CPU with some other processes (is SETI@home still a thing?) while running an experiment, to see if it replicates easily; then we can work on whittling it down...
Pegging the CPU with other stuff is a good idea. It just doesn't make much sense: we give our process high priority in Windows, and CPU and RAM are both not even close to fully utilized when this happens, though it could be a single core getting hit for some reason.
Using `ec.wait_secs`. We are going to try it with the `time.sleep` commented out. It sure doesn't seem like that would matter, but if the problems we're seeing are both from `wait_secs` -- which would make sense, since that is what we are using to wait while a stimulus plays -- then the addition of that `sleep` call is the only thing that changed.
@rkmaddox and I are pretty confident the issues we mentioned previously are not from changes in this branch. We've reverted the changes to `wait_secs` and the issues persisted. The stuttering issue typically happens at most once per experiment (experiments are 2.5 hours long).

Bombing the CPU didn't seem to do anything, although I'm not sure of the best way to test it. I ran the batch file below a bunch of times simultaneously (at least as many times as there are cores), but the CPU usage always hovered around 90%. Setting the Python script to below-normal priority made no difference.
```bat
@echo off
:loop
goto loop
```
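The same stress test can also be done cross-platform from Python. A rough sketch (worker count and duration are arbitrary choices, not anything from expyfun) that pins every logical core with a busy loop:

```python
import multiprocessing as mp
import os
import time


def burn(seconds):
    """Spin in a tight loop for roughly `seconds` to keep one core busy."""
    end = time.perf_counter() + seconds
    count = 0
    while time.perf_counter() < end:
        count += 1  # pointless work to occupy the core
    return count


def peg_all_cores(seconds=60.0):
    """Run one busy-loop process per logical core for `seconds`."""
    procs = [mp.Process(target=burn, args=(seconds,))
             for _ in range(os.cpu_count() or 1)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

Calling `peg_all_cores()` from a separate terminal while the experiment runs would stress all cores at once, which a single-threaded batch file started fewer times than there are cores cannot guarantee.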
One more thing I can think of: at some point where you're going to wait anyway (the ISI, maybe?), try adding:
```python
import gc, time

t0 = time.time()
gc.collect()
print(f'Garbage collected in {(time.time() - t0) * 1000:0.1f} ms')
```
This will force garbage collection. Python garbage collects at unpredictable times, and maybe at some point you're getting hit by a collection that takes a really long time for some reason. I have no idea why this would hang -- and `python-rtmixer` shouldn't break when this happens anyway -- but it's worth a shot.
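If GC pauses do turn out to be the culprit, one standard workaround (plain Python, nothing expyfun-specific) is to disable automatic collection around the time-critical part of the trial and collect manually during the ISI, when a pause is harmless:

```python
import gc

gc.disable()  # suppress automatic collections during time-critical code
try:
    pass  # ... present stimulus, wait for response, etc. ...
finally:
    gc.enable()

# Later, during the ISI, when a pause is harmless:
n = gc.collect()
print(f'collected {n} objects during ISI')
```

Note that `gc.disable()` only stops the cyclic collector; reference-counted objects are still freed immediately, so memory use should stay bounded as long as `gc.collect()` runs periodically.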
Also, have you checked the logs to see when the hangup occurs? In particular, I would use `read_tab` and look at the info there to figure out when there was a problem, and once you have that, manually look at the `.tab` and `.log` files to see where the slowdown is. If it's not clear how to do this, upload a `.tab`/`.log` pair from a problematic experiment and I can look and see if I can find anything.
Ultimately, though, I think this is probably some `python-rtmixer` bug. I wonder if we're hitting some variant of https://github.com/spatialaudio/python-rtmixer/issues/9#issuecomment-489175489. But the problem is that it's not deterministic and takes a long time to hit, so it's terrible to debug!

... in the meantime I'm inclined to merge this as-is. I'll close and reopen to make sure the CIs are still happy.
Thanks @larsoner
- Adds a `SOUND_CARD_TRIGGER_ID_AFTER_ONSET` config var to allow stamping the TTL IDs after the 1 trigger so that a single sound can be played. Should help with rtmixer/portaudio fragility when sounds are stopped and started in close proximity.
- Changes `stamp_triggers` for the sound controller to return as quickly as possible when `wait_for_last=False`. We had made it wait until all but the last trigger was stamped to be consistent with TDT and parallel triggers. It seems better not to do this if we don't need to -- it decreases the uniformity of our approach across platforms, but makes the sound card platform (probably the way of the future) work better.
- Fixes `wait_secs` when not used in an `ec` context. Probably doesn't matter, but I noticed this bug was introduced in #399.