Timeout exception - Githubissues

alexpiet commented 7 months ago

Describe the bug We use this repo to control the lickspout stages in the AIND training room. We occasionally get this error. It seems like we might be able to solve this by extending the acceptable time window.

To Reproduce I cannot reproduce this error, but its happening about twice a week in the AIND behavior room.

Screenshots

Traceback (most recent call last):
  File "C:\Users\svc_aind_behavior\Documents\GitHub\dynamic-foraging-task\src\foraging_gui\stage.py", line 37, in run
    fc.execute()
  File "C:\Users\svc_aind_behavior\Documents\GitHub\dynamic-foraging-task\src\foraging_gui\io_commands.py", line 106, in execute
    pos = self.device.get_position('x', 'y', 'z')
  File "C:\Users\svc_aind_behavior\AppData\Local\miniconda3\envs\Foraging\lib\site-packages\newscale\multistage.py", line 34, in inner
    return func(self, *args, **kwargs)
  File "C:\Users\svc_aind_behavior\AppData\Local\miniconda3\envs\Foraging\lib\site-packages\newscale\multistage.py", line 231, in get_position
    return {x: self.stages[x].get_position() for x in axes}
  File "C:\Users\svc_aind_behavior\AppData\Local\miniconda3\envs\Foraging\lib\site-packages\newscale\multistage.py", line 231, in <dictcomp>
    return {x: self.stages[x].get_position() for x in axes}
  File "C:\Users\svc_aind_behavior\AppData\Local\miniconda3\envs\Foraging\lib\site-packages\newscale\stage.py", line 336, in get_position
    _, _, pos, _ = self._send(self._get_cmd_str(Cmd.CLOSED_LOOP_STATE))
  File "C:\Users\svc_aind_behavior\AppData\Local\miniconda3\envs\Foraging\lib\site-packages\newscale\stage.py", line 93, in _send
    return self._parse_reply(self.interface.read(self.address))
  File "C:\Users\svc_aind_behavior\AppData\Local\miniconda3\envs\Foraging\lib\site-packages\newscale\interfaces.py", line 205, in read
    data = self.serial.readLine()
  File "C:\Users\svc_aind_behavior\Documents\GitHub\dynamic-foraging-task\src\foraging_gui\MyFunctions.py", line 1617, in readLine
    c = self.io.read(1).decode()
  File "C:\Users\svc_aind_behavior\AppData\Local\miniconda3\envs\Foraging\lib\site-packages\newscale\usbxpress.py", line 407, in read
    self._dll.SI_Read(self._handle,
  File "C:\Users\svc_aind_behavior\AppData\Local\miniconda3\envs\Foraging\lib\site-packages\newscale\usbxpress.py", line 59, in usbxpress_errcheck
    raise USBXpressTimeoutException("USBXpress operation timed out")
newscale.usbxpress.USBXpressTimeoutException

Desktop (please complete the following information):

OS: Windows
axes-on-target branch

alexpiet commented 7 months ago

This timeout happened yesterday afternoon, and this morning the newscale still won't connect. I have seen this pattern before where the newscale stages become unresponsive for long intervals of time (hours). Restarting sometimes helps. Its unclear where the bug is. Its possible this and #9 are really the same issue

last night: https://github.com/AllenNeuralDynamics/aind-behavior-blog/issues/199#issuecomment-1942862193

this morning https://github.com/AllenNeuralDynamics/aind-behavior-blog/issues/253

Poofjunior commented 7 months ago

Ok this looks like the stage is (occasionally?) failing to reply to a particular command issued from the PC side.

Do you know if the stage is being called in a multithreading environment? (if multiple commands are issued and not serialized/deserialized correctly, something could get lost.)

Can you send me a link to where this code is being used?

Finally, do you have any logs written to disk with timestamps? This library logs all transactions, but they aren't caught unless you have a log handler to catch them and send them somewhere like a file. It would be super useful to see a time series of all transactions. If you don't have this, it's super easy to setup.

In your main code somewhere, you just need to add this somewhere:


import logging

class NewScaleFilter(logging.Filter):
    # Note: add additional modules that we want to catch here.
    VALID_LOGGER_BASES = {'stage', 'multistage'}

    def filter(self, record):
        """Returns true for a record that matches a log we want to keep."""
        return record.name.split('.')[0].lower() in \
            self.__class__.VALID_LOGGER_BASES

logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.FileHandler("newscale_logs.txt", 'w'))
#logger.addHandler(logging.StreamHandler())
log_handler.addFilter(NewScaleFilter())
logger.handlers[-1].setFormatter(
   logging.Formatter(fmt='%(asctime)s:%(name)s:%(levelname)s: %(message)s'))

# main script code here...

alexpiet commented 7 months ago

We connect to the stage here: https://github.com/AllenNeuralDynamics/dynamic-foraging-task/blob/main/src/foraging_gui/Foraging.py#L474

Which then creates this object and moves the control to a new thread here: https://github.com/AllenNeuralDynamics/dynamic-foraging-task/blob/main/src/foraging_gui/stage.py#L61

We have a log, but it is set to "INFO" not "DEBUG". I can go run a session with it on DEBUG and send it to you

alexpiet commented 7 months ago

Log file with DEBUG 447-1-A_gui_log_2024-02-14_11-47-23.txt

relevant part

11:47:27 AM.118 :INFO:Foraging:Foraging.py:_InitializeMotorStage:line 423:Scanning for newscale stages
11:47:27 AM.121 :INFO:Foraging:Foraging.py:_InitializeMotorStage:line 437:found 2 newscale stages
11:47:27 AM.121 :INFO:Foraging:Foraging.py:_InitializeMotorStage:line 461:Found the newscale stage from the settings file
11:47:27 AM.129 :DEBUG:interfaces:interfaces.py:__init__:line 32:Handshaking with hardware interface.
11:47:27 AM.129 :DEBUG:interfaces:interfaces.py:send:line 193:Sending: 'TR<01>\r' encoded as bytearray(b'\xbc\xb0\xb1\xbe\x8d')
11:47:27 AM.133 :DEBUG:interfaces:interfaces.py:read:line 215:Read back '<01 1 VER 2.1.3-MLTPX>\r'
11:47:27 AM.133 :DEBUG:interfaces:interfaces.py:__init__:line 35:Transceiver firmware: VER 2.1.3-MLTPX
11:47:27 AM.133 :DEBUG:stage:stage.py:__init__:line 47:Establishing host control by requesting firmware version.
11:47:27 AM.133 :DEBUG:interfaces:interfaces.py:_select_stage:line 41:Selecting stage at address: 01
11:47:27 AM.133 :DEBUG:interfaces:interfaces.py:send:line 193:Sending: 'TR<A0 01>\r' encoded as bytearray(b'\xbc\xc1\xb0\xa0\xb0\xb1\xbe\x8d')
11:47:27 AM.156 :DEBUG:interfaces:interfaces.py:read:line 215:Read back '<A0 01 0>\r'
11:47:27 AM.156 :ERROR:Foraging:Foraging.py:_connect_stage:line 482:Device at address 01 is not present on this interface.
11:47:27 AM.157 :INFO:Foraging:Foraging.py:_GetPositions:line 330:GetPositions pressed, but no current stage

screenshot of manual attempt capture

alexpiet commented 6 months ago

@Poofjunior Have you had any time to work on this issue?

alexpiet commented 6 months ago

It looks like replacing the gray box, and not the stage itself resolves these issues: https://github.com/AllenNeuralDynamics/aind-behavior-blog/issues/253

AllenNeuralDynamics / python-newscale

Timeout exception #10