Open larsoner opened 9 years ago
Thank you. This is a very clear description. The part about the "e" and the "3" might be particularly relevant. That could be a hint that the timer response is sitting in the same queue where the responses sit. Timer replies contain "e3" in them.
We (Cedrus) will test this at some point over the next few days. I do expect that this will be reproducible on our end.
After testing, I expect that we can fix this, either by patching pyxid or by suggesting some restructuring/reordering of function calls in the script you are writing (i.e. the modified demo script).
I will get back to you within a few days.
Meanwhile, since you clearly have "the chops" for debugging python, you might benefit from knowing what the actual expected bytes emanating from the RB device are supposed to look like.
http://cedrus.com/xid/protocols.htm http://cedrus.com/xid/timing.htm http://cedrus.com/xid/a_summary.htm http://cedrus.com/xid/miscellaneous.htm
Great, I probably won't have time to dig into the issue in the next few days (maybe in a couple of weeks), so let me know if you make some progress.
I actually have a branch going to make this package usable in Py3k that I can open shortly. Good to know this is actively monitored :)
Do you have any information regarding the changes made to the x40 series related to the timer drift issue mentioned in the README? I'd be happy to update the README.txt as part of my PR. Our custom presentation software currently relies on each device having a reliable clock, so I'm hoping that the x40 series does indeed have the drift issue resolved.
Also note that the minimal example/change above is not an exact replica of the set of calls we make, so reordering of calls might not be an option in our current structure. Fixing the underlying code so that calls to reset_base_timer
, reset_rt_timer
, query_base_timer
, poll_for_response
, plus button presses can be done in arbitrary orders would be the best solution. (get_next_response
is also used here, but I noticed it's a simple list .pop
operation so shouldn't really matter here.)
Ahh, looks like it might happen when there is an event in the queue waiting to be read... sending the e3
command works, but the response message is behind a keypress event. Adding dev.con.read_nonblocking(65536)
before and after the dev.query_base_timer()
helps alleviate the problem, but doesn't eliminate it (e.g., if a button is pressed in the <1 ms interval between when the queue is cleared and when the query_base_timer executes). Changing query_base_timer
to consume all bytes and use only the last 6 would work, but it would drop button presses.
I'm not sure if there is an easy solution in the current framework. It is set up to allow some functions (BaseDevice.query_base_timer
-> XidConnection.send_xid_command
) to send commands and read the n_bytes
of the current response assuming the serial buffer only consists of the response from the command (i.e., not from a recent button press), while others (XidConnection.check_for_keypress
) assume that the only items in the buffer will be keypresses. It might be worth restructuring internal.py
to make it so that all read commands are appended to a single buffer in XidConnection
, and then consumed as necessary by functions that need responses. For example, send_xid_command
would read all relevant bytes in (not just the next 6), and know just to return the last 6 in the list. It could thus append all but the last 6 bytes to the global buffer. The button-press function would then be restructured to 1) read in all pending bytes, and 2) traverse the list global bytes list, converting everything it finds into presses (since that's all that should exist). My 2c, anyway.
I can reproduce this (which is not surprising by now to anyone reading the ongoing analysis). I also confirmed that this same structural problem (assumptions about which inquiry owns which bytes in the buffer) also exists in our c++ project: https://github.com/cedrus-opensource/xid_device_library . The c++ project mainly exists to implement the Response Device Extension used in the NBS Presentation software. In that context, these function calls are never made in this order.
I agree that restructuring the code so that the various function calls can be done in arbitrary orders is ideal.
This has been logged as an issue in our internal Cedrus bugtracker.
Thanks for the info @kkheller, looking forward to eventual progress.
This isn't a show-stopping bug for us, because there is a workaround. However, this one is: https://github.com/cedrus-opensource/pyxid/issues/2#issuecomment-91077270. I could measure the drift many times to establish if it's consistent and, if it is, compensate for it, but that is an ugly solution so I'd rather not do it if there is a better fix or if I'm being dense.
A simple modification of the demo script leads to an error:
Pressing some buttons results in:
In other words, the
query_base_timer()
is causing some sort of problem, which can then break. I think there is some hint with the 'port' being incorrect, since it shouldn't change ports at all. Without thequery_base_timer
call, there is no error, and the port is always zero.This is on Linux with a 740 box, if it matters. I haven't tested to see if it also happens on other platforms, but I suspect it will. If it does, it means that we cannot really use the box -- our software needs to periodically check the timer for drift, and doing so seems to prevent proper button press recording currently. FWIW it looks like the errant message has key "e" instead of "k" (from the
(k, params, time) = unpack
line inxid_input_found
frominternal.py
).