Closed by qzhang234 2 years ago
8idi1 IOC died?
I don't think that's the case, because I restarted the scan and everything is running now as normal. Unless EPICS suddenly dropped, which I don't think I have seen when I run with Spec.
QZ
On Tuesday, May 24, 2022 11:22 PM, Pete R Jemian wrote (Re: [aps-8id-dys/ipython-8idiuser] Bluesky crashed on 8idi:Reg169, Issue #288):
> 8idi1 IOC died?
The first indications in what you showed were the messages that the connection closed.
@prjemian You are right. It's also confirmed by Suresh on Teams.
Closing this issue now and will reopen if needed.
In our test procedure, we tested spec, pyepics, ophyd, and then jumped to the full AD_Acquire. Weren't we planning to try a much simpler bluesky acquisition, with simpler area detector support? I believe we must be doing something that causes this.
We did test Eiger with the lightweight Bluesky detector definition. That one ran for 40,000 measurements before we manually stopped it.
Note that at the time the IOC stopped (2022-05-24 19:18:17 according to the IOC monitoring server), we had 1591 iterations with no other problem:
(AD_Acquire): num_images=10000
(EigerDetector): num_images=10000
(EigerDetector): file_name=J048_01591_att00_Test
(EigerDetector): hdf.image_dir=/home/8ididata/2022-2/bluesky202205/J048_01591_att00_Test/
(EigerDetector): hdf1 stage_sigs=OrderedDict([('enable', 1), ('create_directory', -3), ('auto_increment', 'Yes'), ('array_counter', 0), ('auto_save', 'Yes'), ('num_capture', 10000), ('file_template', '%s%s_%4.4d.h5'), ('file_write_mode', 'Stream'), ('blocking_callbacks', 'Yes'), ('parent.cam.array_callbacks', 1), ('file_name', 'J048_01591_att00_Test'), ('file_path', '/home/8ididata/2022-2/bluesky202205/J048_01591_att00_Test/'), ('capture', 1)])
CA.Client.Exception...............................................
Warning: "Virtual circuit unresponsive"
Context: "ioc8idi1.xray.aps.anl.gov:5064"
Source File: ../tcpiiu.cpp line 925
Current Time: Tue May 24 2022 19:17:54.254819745
..................................................................
Unexpected problem with CA circuit to server "ioc8idi1.xray.aps.anl.gov:5064" was "Connection reset by peer" - disconnecting
CA.Client.Exception...............................................
Warning: "Virtual circuit disconnect"
Context: "ioc8idi1.xray.aps.anl.gov:5064"
Source File: ../cac.cpp line 1237
Current Time: Tue May 24 2022 19:18:09.880473649
..................................................................
dm_pars_source_end_datetime: set_and_wait(value=Tue 24 May 2022 07:17:30 PM, timeout=60, atol=None, rtol=None)
I am (mentally) continuing the success count from there.
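To put numbers on the CA client exceptions quoted above, a short script can pull the warning text and timestamps out of the terminal output. This is only a sketch: the function name `parse_ca_exceptions` and the regular expressions are assumptions for illustration, not part of the instrument package, and the sample text is copied from the log above. Timestamps are truncated to microseconds because `datetime` does not hold nanoseconds, and the `%a`/`%b` parsing assumes an English locale.

```python
import re
from datetime import datetime

# Hypothetical patterns for the "Warning" and "Current Time" lines of a
# CA.Client.Exception block, based only on the excerpt in this issue.
WARNING_RE = re.compile(r'Warning: "(?P<msg>[^"]+)"')
TIME_RE = re.compile(
    r"Current Time: (?P<stamp>\w{3} \w{3} \d{1,2} \d{4} \d{2}:\d{2}:\d{2})"
    r"\.(?P<frac>\d+)"
)


def parse_ca_exceptions(text):
    """Return a list of (warning message, datetime) pairs found in text."""
    events = []
    pending = None  # warning text waiting for its "Current Time" line
    for line in text.splitlines():
        match = WARNING_RE.search(line)
        if match:
            pending = match.group("msg")
            continue
        match = TIME_RE.search(line)
        if match and pending is not None:
            stamp = datetime.strptime(match.group("stamp"), "%a %b %d %Y %H:%M:%S")
            # Keep the first six fractional digits (microseconds).
            micro = int(match.group("frac")[:6].ljust(6, "0"))
            events.append((pending, stamp.replace(microsecond=micro)))
            pending = None
    return events


# Sample copied from the terminal output in this issue.
LOG = """\
CA.Client.Exception...............................................
Warning: "Virtual circuit unresponsive"
Context: "ioc8idi1.xray.aps.anl.gov:5064"
Source File: ../tcpiiu.cpp line 925
Current Time: Tue May 24 2022 19:17:54.254819745
..................................................................
CA.Client.Exception...............................................
Warning: "Virtual circuit disconnect"
Context: "ioc8idi1.xray.aps.anl.gov:5064"
Source File: ../cac.cpp line 1237
Current Time: Tue May 24 2022 19:18:09.880473649
..................................................................
"""

events = parse_ca_exceptions(LOG)
gap = (events[1][1] - events[0][1]).total_seconds()
for message, when in events:
    print(when.isoformat(), message)
print(f"seconds between warnings: {gap:.3f}")
```

For the excerpt above this gives roughly 15.6 s between the "unresponsive" warning and the disconnect, both shortly before the 19:18:17 stop time reported by the IOC monitoring server.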
Terminal_Output.txt
This error looks familiar, as we had similar crashes with other detectors (Rigaku, Lambda) a year ago.
Will restart the long scan to see if the crash is repeatable.