Closed t-sasatani closed 1 week ago
Seems to have trouble handling timeout with multiprocess. I'll re-look into this later.
Ill help out with this next week, I think we'll want this to be the default mode, but it will take some planning :)
That sounds good. There's no rush because I think the main issue is testing timeout, and we can use this branch for experiments in the meantime.
@sneakers-the-rat I opened and assigned reviews because it works on HW now, but we're not in a hurry at all to merge this FYI. Would appreciate help with this whenever you have time!
Just a note for the force pushes: long story short, the second force push reverts everything, so it shouldn't have any effect.
I squashed some commits to make the history a bit clearer but reverted them because I guess it's less confusing to do that after the rebase in #49.
@sneakers-the-rat, would you mind looking into this?
mp.queue.put
in a try block for detecting full queues. I'm looking into this, but I also think limiting the Python version to <3.12 is also an option.Other rather minor stuff
Metadata length asserts fail in two environments (Win-py3.12, Mac-py3.12) after wrapping the mp.queue.put in a try block for detecting full queues. I'm looking into this, but I also think limiting the Python version to <3.12 is also an option.
This appears to happen because the timeout of the queue put was too short (1 sec). Extending this to 5 sec made it go through all OS-Python version pairs. Not really sure where it took so much time.
Sorry for the delay, ill take a look at these Monday.
@sneakers-the-rat No pressure at all, but do you plan to review this soon? It's (hopefully) the last major PR we need for upcoming experiments so main
merged to this branch works for the experiments but just want to know what's your plan.
my bad, was just sketching out pipelines plan and had this open in the next tab. lemme take a look now
Actually, I was wondering why all other processes terminate even if _fpga_recv
is alive, and misunderstood that mp.queue.get
returns None
when empty (which wouldn't make sense with the exact_iter
stuff). So, I thought I'd need to modify the termination conditions of all processes to keep them alive.
I think this termination is actually caused by the queue.get
method returning an exception when it's empty, which get's captured by the try
and force enters the finally
block for each method. So I guess I just need to handle the errors happening in exact_iter(mp,queue.get, None)
I'll see if I can quickly fix this and merge even if not.
Ok, I removed the continuous mode flag and fixed some termination conditions because this should be continuous anyway, and it was terminating in a kind of unintended way (at least for me). To do this, I added error handling in the exact_iter
method and made the StreamReadError
raise an exception log and continue instead of terminating. Now termination propagates from _fpga_recv
, and the DAQ runs continuously even if the data stream is interrupted.
@sneakers-the-rat, would you mind quickly seeing if the updated termination logic aligns with your intention? (It is not absolutely absolutely necessary, and I guess we can merge as long as it's working, as you mentioned, so it's ok just if you have time)
~I need to run now, but I'll hang critical changes in the thread later today.~ Done; the critical changes are listed below.
This PR adds a continuous recording mode that collects data even when the video streaming breaks. This should be good for both actual recordings and collecting broken data. It also adds UNIX time to metadata, which is necessary because the DAQ now handles intermittent streams. It works on hardware now, so I'm opening this.
Updates
--timeout
flag.cv2
to allow continuous recording.snake_case
function names (Lint)Issues There are still two issues around CI/CD. I've been trying to make this work, but something about multiprocessing is making it a bit complex.
ctrl-c
, which is simpler, but making this propagation work could save the last few buffers (which might not be essential but still).~Also, (as @sneakers-the-rat mentioned, too) the default should be changed to continuous, so I'll flip this once
pytest
on GitHub actions work fine with a timeout.~ done