NSLS-II / lsdc

LSDC - Life Sciences Data Collection
https://nsls-ii.github.io/lsdc
BSD 3-Clause "New" or "Revised" License
3 stars 14 forks source link

failing to arm detector raises description that crashes server #67

Open dalekreitler-bnl opened 3 years ago

dalekreitler-bnl commented 3 years ago

observed at AMX on 2Mar2021

Traceback

SystemExit                                Traceback (most recent call last)
/GPFS/CENTRAL/xf17id1/skinnerProjectsBackup/lsdc_amx/daq_main2.py in <module>
     43 if (len(sys.argv)>1):
     44   if (sys.argv[1] == "gui"):
---> 45     main(mode="gui")
     46 else:
     47   main(mode="shell")

/GPFS/CENTRAL/xf17id1/skinnerProjectsBackup/lsdc_amx/daq_main2.py in main(mode)
     36   pybass_init()
     37   if (mode=="gui"):
---> 38     run_server()
     39   else:
     40     lsdcHome = os.environ["LSDCHOME"]

/GPFS/CENTRAL/xf17id1/skinnerProjectsBackup/lsdc_amx/daq_main_common.py in run_server()
    181   comm_pv.add_callback(comm_cb)
    182   immediate_comm_pv.add_callback(comm2_cb)
--> 183   process_commands(0.5)

/GPFS/CENTRAL/xf17id1/skinnerProjectsBackup/lsdc_amx/daq_main_common.py in process_commands(frequency)
    103       command = command_list.pop(0)
    104       logger.info('command: %s' % command)
--> 105       process_input(command)
    106 
    107 

/GPFS/CENTRAL/xf17id1/skinnerProjectsBackup/lsdc_amx/daq_main_common.py in process_input(command_string)
    151   try:
    152     daq_lib.set_field("program_state","Program Busy")
--> 153     execute_command(command_string)
    154   except NameError as e:
    155     error_string = "Unknown command in queue: %s Error: %s" % (command_string, e)

/GPFS/CENTRAL/xf17id1/skinnerProjectsBackup/lsdc_amx/daq_main_common.py in execute_command(command_s)
     31 def execute_command(command_s):
     32   logger.info('executing command: %s' % command_s)
---> 33   exec(command_s)
     34 
     35 

/GPFS/CENTRAL/xf17id1/skinnerProjectsBackup/lsdc_amx/daq_main_common.py in <module>

/GPFS/CENTRAL/xf17id1/skinnerProjectsBackup/lsdc_amx/daq_lib.py in runDCQueue()
    495     refreshGuiTree() #just tells the GUI to repopulate the tree from the DB
    496     logger.info("calling collect data " + str(time.time()))
--> 497     colStatus = collectData(currentRequest)
    498     logger.info("done collecting data")
    499     if (autoMounted and db_lib.queueDone(daq_utils.beamline)):

/GPFS/CENTRAL/xf17id1/skinnerProjectsBackup/lsdc_amx/daq_lib.py in collectData(currentRequest)
    598   if (prot == "raster"):
    599     logger.info('entering raster')
--> 600     status = daq_macros.snakeRaster(currentRequest["uid"])
    601     logger.info('exiting raster')
    602   elif (prot == "stepRaster"):

/GPFS/CENTRAL/xf17id1/skinnerProjectsBackup/lsdc_amx/daq_macros.py in snakeRaster(rasterReqID, grain)
   1063     snakeRasterFine(rasterReqID,grain)
   1064   else:
-> 1065     snakeRasterNormal(rasterReqID,grain)
   1066 
   1067 def snakeRasterNoTile(rasterReqID,grain=""):

/GPFS/CENTRAL/xf17id1/skinnerProjectsBackup/lsdc_amx/daq_macros.py in snakeRasterNormal(rasterReqID, grain)
   1464   det_lib.detector_set_trigger_mode(3)
   1465   det_lib.detector_setImagesPerFile(numsteps)
-> 1466   daq_lib.detectorArm(omega,img_width_per_cell,totalImages,exptimePerCell,rasterFilePrefix,data_directory_name,file_number_start) #this waits
   1467   if not (daq_lib.setGovRobot('DA')):
   1468     if (daq_utils.beamline == "fmx"):

/GPFS/CENTRAL/xf17id1/skinnerProjectsBackup/lsdc_amx/daq_lib.py in detectorArm(angle_start, image_width, number_of_images, exposure_period, fileprefix, data_directory_name, file_number)
    837   detector_start() #but you need wired or manual trigger
    838   startArm = time.time()
--> 839   detector_waitArmed() #don't worry about this while we're not doing hardware triggers., not quite sure what it means
    840   endArm = time.time()
    841   armTime = endArm-startArm

/GPFS/CENTRAL/xf17id1/skinnerProjectsBackup/lsdc_amx/det_lib.py in detector_waitArmed()
     58 
     59 def detector_waitArmed():
---> 60   epics_det.det_waitArmed()
     61 #pixel array specific end
     62 

/GPFS/CENTRAL/xf17id1/skinnerProjectsBackup/lsdc_amx/epics_det.py in det_waitArmed()
    198       if time.time() - start_time > timeout:
    199         logger.info('aborting arm')
--> 200         sys.exit(1)
    201       time.sleep(.01)
    202 

SystemExit: 1

In [1]: Do you really want to exit ([y]/n)? 
/GPFS/CENTRAL/xf17id1/skinnerProjectsBackup/lsdc_amx/lsdcServer: line 2: 23669 Killed                  $LSDCHOME/daq_main2.py
Script done, file is lsdcServerLog.txt
dkreitler@xf17id2-ca1:/GPFS/CENTRAL/xf17id2/BAG-305042_01March2021$ 

How can we recover from this event?

dalekreitler-bnl commented 3 years ago

Server log at that time:

2021-03-02 06:09:10,934 daq_main_common INFO     executing command: center_on_click(315.0,219.0,1300.0,1040.0,"screen",0)
2021-03-02 06:09:11,020 daq_lib  INFO     checkC2C_X target: 12543.74975
2021-03-02 06:09:11,083 daq_lib  INFO     robot gov status = False
2021-03-02 06:09:22,439 daq_main_common INFO     immediate command: center_on_click(291.0,277.0,650.0,520.0,"screen",0)
2021-03-02 06:09:22,475 daq_lib  INFO     Tue Mar  2 06:09:22 2021
center_on_click(291.0,277.0,650.0,520.0,"screen",0)
2021-03-02 06:09:22,565 daq_main_common INFO     executing command: center_on_click(291.0,277.0,650.0,520.0,"screen",0)
2021-03-02 06:09:22,636 daq_lib  INFO     checkC2C_X target: 12449.296875
2021-03-02 06:09:22,730 daq_lib  INFO     robot gov status = False
2021-03-02 06:09:45,257 daq_main_common INFO     command: runDCQueue()
2021-03-02 06:09:45,268 daq_lib  INFO     Tue Mar  2 06:09:45 2021
runDCQueue()
2021-03-02 06:09:45,268 daq_main_common INFO     executing command: runDCQueue()
2021-03-02 06:09:45,268 daq_lib  INFO     running queue in daq server
2021-03-02 06:09:46,051 daq_lib  INFO     processing request 1614683386.0511315
2021-03-02 06:09:46,679 daq_lib  INFO     calling collect data 1614683386.6793838
2021-03-02 06:09:46,680 daq_lib  INFO     raster
2021-03-02 06:09:49,849 daq_lib  INFO     entering raster
2021-03-02 06:09:49,938 daq_lib  INFO     data directory = /GPFS/CENTRAL/xf17id2/BAG-305042_01March2021/mx305042-9/KRS-021/1/KRS002_5/
2021-03-02 06:09:49,938 det_lib  INFO     set detector exposure time 0.02
2021-03-02 06:09:49,939 det_lib  INFO     set detector period 0.02
2021-03-02 06:09:49,940 det_lib  INFO     set detector number of images 27
2021-03-02 06:09:49,940 det_lib  INFO     set detector file path /GPFS/CENTRAL/xf17id2/BAG-305042_01March2021/mx305042-9/KRS-021/1/KRS002_5/
2021-03-02 06:09:49,941 det_lib  INFO     set detector file prefix KRS-021_Raster
2021-03-02 06:09:49,942 det_lib  INFO     set detector file number 1
2021-03-02 06:09:49,943 det_lib  INFO     detector filehead
2021-03-02 06:09:49,944 det_lib  INFO     start detector
2021-03-02 06:09:49,945 epics_det INFO     armed = 0
2021-03-02 06:10:19,953 epics_det INFO     aborting arm
JunAishima commented 3 years ago

On the detector side, it is not responding to what should be normal input into the API. This suggests the Daq has failed (it does occasionally), so needs to be restarted.

RequestHandler: Processing detector request on parameter beam_center_y

RequestHandler: Parameter beam_center_y not found in NeXus tree. Should exist according to self._paramTable.py

- - - [2021-03-02 05:48:50] "PUT /detector/api/1.6.0/config/beam_center_y HTTP/1.0" 404 181 0.000696

RequestHandler: Processing detector request on parameter omega_increment

RequestHandler: Parameter omega_increment not found in NeXus tree. Should exist according to self._paramTable.py

- - - [2021-03-02 05:48:50] "PUT /detector/api/1.6.0/config/omega_increment HTTP/1.0" 404 183 0.001044

RequestHandler: Processing detector request on parameter omega_start

RequestHandler: Parameter omega_start not found in NeXus tree. Should exist according to self._paramTable.py

- - - [2021-03-02 05:48:50] "PUT /detector/api/1.6.0/config/omega_start HTTP/1.0" 404 179 0.000657

RequestHandler: Processing detector request on parameter detector_distance

RequestHandler: Parameter detector_distance not found in NeXus tree. Should exist according to self._paramTable.py

- - - [2021-03-02 05:48:50] "PUT /detector/api/1.6.0/config/detector_distance HTTP/1.0" 404 185 0.000994

RequestHandler: Processing detector request on parameter arm

RequestHandler: Processing internal/detector request on parameter arm

Traceback (most recent call last):

  File "../../../../../python/dectris/restapi/AbstractApi/DetectorCommandHandler.py", line 86, in process

  File "../../../../../python/dectris/restapi/DetectorApi/DetectorCommandHandler.py", line 68, in arm

  File "../../../../../python/dectris/jaun/rest/EigerDetector.py", line 337, in arm

RuntimeError: Error running arm

- - - [2021-03-02 05:48:50] "PUT /internal/detector/api/1.6.0/command/arm HTTP/1.1" 400 151 0.035733

- - - [2021-03-02 05:48:50] "PUT /detector/api/1.6.0/command/arm HTTP/1.0" 400 170 0.037617