Closed qzhang234 closed 4 years ago
One thing I was also thinking: is it possible to send out an email from Bluesky when a scan crashes or hangs?
Yes. See EmailNotifications(). This example is now in the docs:
from apstools.utils import EmailNotifications
SENDER_EMAIL = "8idiuser@aps.anl.gov"
email_notices = EmailNotifications(SENDER_EMAIL)
email_notices.add_addresses(
    "joe.user@anl.gov",
    "instrument_team@aps.anl.gov",
    # others?
)
# then, when some condition occurs
if feedback_limits_approached:
    subject = "Feedback problem"
    message = "Feedback is very close to its limits."
    email_notices.send(subject, message)
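For the crash case specifically, one option is to wrap whatever launches the scan in a try/except and send the notice from the handler. The following is only a sketch: `run_with_crash_notice`, `run_scan`, and `send_notice` are hypothetical names, not part of apstools or Bluesky. `send_notice` is assumed to accept `(subject, message)`, as `email_notices.send` does in the example above. A hang raises no exception, so it would need a separate timeout to be caught this way.

```python
def run_with_crash_notice(run_scan, send_notice, scan_name="scan"):
    """Run a scan and send a notice if it raises.

    run_scan    : hypothetical zero-argument callable that starts the scan
    send_notice : callable taking (subject, message), e.g. email_notices.send
    """
    try:
        return run_scan()
    except Exception as exc:
        subject = f"{scan_name} crashed"
        message = f"{type(exc).__name__}: {exc}"
        send_notice(subject, message)
        # Re-raise so the original failure is still visible to the caller.
        raise
```

Re-raising after the notice keeps the usual traceback in the session, so the email supplements the normal error report rather than replacing it.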
Is there a way to modify Bluesky so that it automatically moves on to the next measurement if XSPA does not respond for more than 60 s (each measurement should take no more than 5 s)?
If we can catch the timeout, for sure we can do this.
We might want to catch a sequence of n consecutive jams to make sure we do not retry a hopeless situation.
Yes. I would say 3 retries would be enough.
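The retry logic discussed above could look roughly like this in plain Python. This is a sketch under assumptions, not the Bluesky implementation: `acquire_with_retries` and `run_measurements` are hypothetical names, and each acquisition is assumed to be a callable that raises the built-in `TimeoutError` when the detector does not respond in time (for example, after 60 s).

```python
def acquire_with_retries(acquire, max_retries=3):
    """Call acquire(); on TimeoutError, retry up to max_retries more times.

    Returns True if an attempt succeeded, False if every attempt timed out.
    """
    for _attempt in range(1 + max_retries):
        try:
            acquire()
            return True
        except TimeoutError:
            continue  # assumed recoverable: try the same measurement again
    return False


def run_measurements(acquisitions, max_retries=3, max_consecutive_failures=3):
    """Run each acquisition in turn, moving on when one keeps timing out.

    Stops with RuntimeError once max_consecutive_failures measurements
    in a row have failed, so a hopeless situation is not retried forever.
    """
    results = []
    consecutive_failures = 0
    for acquire in acquisitions:
        ok = acquire_with_retries(acquire, max_retries)
        results.append(ok)
        consecutive_failures = 0 if ok else consecutive_failures + 1
        if consecutive_failures >= max_consecutive_failures:
            raise RuntimeError(
                f"{consecutive_failures} consecutive measurements timed out"
            )
    return results
```

Counting failures per measurement, rather than per attempt, separates a single transient jam (handled by the retries) from a detector that has stopped responding altogether (handled by the abort).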
This Rigaku timeout bug has occurred twice with Bluesky in #223 and once with Spec this week. It appears to be a recurring and reproducible problem. Nakaye doesn't know the source of the bug so we'll have to fix it from our end. Hopefully implementing the Bluesky re-throw will permanently fix this bug.
Also, now that I think about this, most of the jams or crashes when operating Rigaku/Bluesky can be fixed by simply pressing Ctrl+C and restarting the plan. Maybe this implementation is the last step towards our milestone of one week of continuous Bluesky user operation.
The beam will be down next Monday (09/28) at 8 am and doesn't come back till Thursday 8 am (10/01). This is a great opportunity, so let's get this done before the beam is back up. @prjemian Please let me know if there's anything that I can help with.
Thanks!
The same bug occurred again while running with Spec on 09/27, 11:35 pm. I'm therefore changing the label to 'high priority'.
It looks like our best chance is to run Bluesky for the week of 10/01 - 10/12 with the rethrow capability implemented.
@prjemian There's no beam till 10/01. Please advise on how we should start working on this. Thanks!
So, we want to implement a timeout around a call to yield from AD_Acquire().
If timeout, then:
Just to leave a note that the LabView hang occurred again at 23:33 on 09/29
Commit 2d2d672 should also handle the ReadTimeout problems affecting #233
I noticed the same error on the LabView panel when I ran XSPA with Spec. Guess the jamming we saw in #223 two weeks ago is not a Bluesky problem.
This raises a question: Is there a way to modify Bluesky so that it automatically moves on to the next measurement if XSPA does not respond for more than 60 s (each measurement should take no more than 5 s)?