APS-USAXS / livedata

live data from the APS USAXS instrument
https://usaxslive.xray.aps.anl.gov/
0 stars 0 forks source link

No flyscan plots from SPEC nowe #55

Closed prjemian closed 2 years ago

prjemian commented 2 years ago

Not getting any livedata plots for Fly scans from SPEC. 240% CPU time used by process that generates the plot.

Related to:

Something wrong with specplotsAllScans.py, so leaving this issue open as a BUG report now.

Originally posted by @prjemian in https://github.com/APS-USAXS/livedata/issues/50#issuecomment-748408721

That one is started by this cron task

# every five minutes (generates no output from outer script)
0-59/5 * * * *  /home/beams/USAXS/Documents/eclipse/USAXS/livedata/buildSpecPlots.sh 2>&1 > /dev/null
prjemian commented 2 years ago

[5:27 PM] Ilavsky, Jan I think the problem is that the Flyscan data conversion is failing. No Flyscan spec plots are being created. SAXS and WAXS are fine, but all Flyscans in 10_05_Gadikota are failing. We broke something and python is hanging with giant cpu use.
Note, they are running in spec now, that is what they know. So this is the standard spec flyscans with HDF5 files.

Live data does have flyscan plots, so this is specifically for spec plots (2021/10/10_05_Gadikota).

prjemian commented 2 years ago

Probably related to recent changes in SPEC code and fly scans. This used to work.

prjemian commented 2 years ago

More notes from top and ps commands:

top - 18:09:41 up 50 days, 14:57, 23 users,  load average: 4.07, 4.37, 4.29
Tasks: 514 total,   3 running, 511 sleeping,   0 stopped,   0 zombie
%Cpu(s): 42.0 us, 13.2 sy,  0.0 ni, 44.7 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem : 32488920 total,  1321784 free,  6085480 used, 25081656 buff/cache
KiB Swap: 33554428 total, 33137404 free,   417024 used. 25931836 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                 
22328 usaxs     20   0 1058292 375952  16888 R 225.8  1.2  10:01.22 python                                                                  
 9359 usaxs     20   0 1612144 305220  20556 R 122.5  0.9  68:53.85 python                                                                  
18804 usaxs     20   0 4298972 378592  59848 S  32.1  1.2   1748:26 gnome-shell                                                             
#...
bash-4.2$ ps 22328
  PID TTY      STAT   TIME COMMAND
22328 ?        Rl    10:14 /APSshare/anaconda/x86_64/bin/python /home/beams/USAXS/Documents/eclipse/USAXS/livedata/specplotsAllScans.py /sha
bash-4.2$ ps 9359
  PID TTY      STAT   TIME COMMAND
 9359 pts/13   Rl    70:08 /APSshare/anaconda/x86_64/bin/python /home/beams/USAXS/Documents/eclipse/USAXS/livedata/pvwatch.py
prjemian commented 2 years ago

Will examine Py2.7 files in directory: /home/beams/USAXS/Documents/eclipse/USAXS/livedata/, reduceFlyData.py and developer_test_reduceFlyData.py

prjemian commented 2 years ago

Nothing unusual running developer_test_reduceFlyData.py with test file

test_files = ["/share1/USAXS_data/2021-10/10_05_Gadikota/10_05_Gadikota_usaxs/AAM_50mM_CaCO3_206min_1039.h5", ]
prjemian commented 2 years ago

Need to learn why livedata/specplotsAllScans.py is using more than 200% CPU.

prjemian commented 2 years ago

here:

bash-4.2$ git grep specplotsAllScans
archive/docs/index.html:                specplotsAllScans.py
archive/docs/index.html:        <dt><a href="specplotsAllScans.html"><pre>specplotsAllScans.py</pre></a></dt>
archive/docs/index.md:            specplotsAllScans.py
archive/docs/index.md:specplotsAllScans.py | default plots of all scans in one SPEC data file
archive/docs/specplotsAllScans.html:<html><head><title>Python: module specplotsAllScans</title>
archive/docs/specplotsAllScans.html:<font color="#ffffff" face="helvetica, arial">&nbsp;<br><big><big><strong>specplotsAllScans</strong></bi
archive/docs/specplotsAllScans.html:><font color="#ffffff" face="helvetica, arial"><a href=".">index</a><br><a href="file:/home/beams21/S15U
archive/docs/specplotsAllScans.html:<td width="100%"><strong>SVN_ID</strong> = '$Id: specplotsAllScans.py 376 2010-07-23 23:20:40Z jemian $'
buildSpecPlots.sh:# kill -9 `psg python | awk '/specplotsAllScans.py/ {print $2}' -`
buildSpecPlots.sh:PROGRAM=$CODE_DIR/specplotsAllScans.py
developer_test_specplotsAllScans.py:'''developer routine to test specplotsAllScans.py'''
developer_test_specplotsAllScans.py:import specplotsAllScans
developer_test_specplotsAllScans.py:specplotsAllScans.main()
prjemian commented 2 years ago

Current data file has more than 4,000 SPEC scans in it!

bash-4.2$ grep "#S "  /share1/USAXS_data/2021-10/10_05_Gadikota/10_05_Gadikota.dat | wc -l
4098

Most recent is:

#S 4096  FlyScan  ar 8.74299 0 7.12597 2.3e-05 
#S 4097  SAXS  ./10_05_Gadikota_saxs/AAM_75mM_CaCO3_107min_1096.hdf    0    75    0.06    20     1 
#S 4098  WAXS  ./10_05_Gadikota_waxs/AAM_75mM_CaCO3_107min_1096.hdf     0    75    0.06    20     1 

The first place the code slows down is when the SPEC file is read (by spec2nexus): https://github.com/APS-USAXS/livedata/blob/834226bddb31b3594b90a5f34f008367affd5adc/specplotsAllScans.py#L107

Simply not optimized for thousands of scans in a single SPEC file. Yikes!

prjemian commented 2 years ago

Second place the code slows down is when looping over all the scans in the SPEC file: https://github.com/APS-USAXS/livedata/blob/834226bddb31b3594b90a5f34f008367affd5adc/specplotsAllScans.py#L138-L143

There is a TODO comment that the code needs to be optimized. Since this code is Py2.7 which is well beyond end-of-life, the code is a prime candidate for refactoring and use of python3.9 (most modern production release).

Is there something we can do now since a rewrite of livedata in py3.9 is a big job? Conversion of existing code to Py3.9 is quicker but the methods used here are now out of date and should be rethought.

prjemian commented 2 years ago

Findings: Why so much CPU is being used?

  1. The existing code is slow to read a SPEC file with thousands of scans such as the USAXS instrument is generating now.
  2. The code examines and replots every scan in the SPEC file each time it runs as long as there is new data in the file. (comparison: The time the data file was last modified is greater than the time the directory with the plots was last modified.)

Further:

  1. The current code base for livedata uses Python 2.7 which is end-of-life.

None of this explains why Fly scans from SPEC aren't being plotted now. Will keep looking for that.

prjemian commented 2 years ago

This code expects to find the HDF5 file name in the 3rd comment (python reference [2]): https://github.com/APS-USAXS/livedata/blob/834226bddb31b3594b90a5f34f008367affd5adc/specplot.py#L37-L40

But now, there is an additional comment in the SPEC scan:

#S 9  FlyScan  ar 8.74297 0 7.12596 2.3e-05 
#D Tue Oct 05 10:00:25 2021
#T 0.45  (seconds)
#G0 0
#G1 0
#G3 0
#G4 0
#Q 
#P0 -22.29064 20.2 0 0 -49 14.602783 185 3.2
#P1 5.894899e-07 5.00006 0 11.00057 0.18 -4 0.3005 29.428909
#P2 20 0 68.9325 26.355 -0.300098 8e-06 0.7999205 0.7999188
#P3 0 44.999661 0.5 -0.5 0.5 -0.5 20.000182 260
#P4 35.6487 0 0 0 0 0 0 0
#P5 0 0 0 0 8.7413585 -159.79288 88.550355 0
#P6 0 0 0 0 0 0 13.047491 21
#P7 3.7150163 4.2852875 4.7999125 4.1998625 8.858101 11.75625 23.082425 0
#C EmptyNMRTube1
#V0 101.93 0 1 0 1 1 1 1 0
#V1 -0.215743 0.0214993 72.4577 13.98 0 0 0 -0.0011394 -0.00239563
#V2 1.99486 -3.61273 -124.587 187.976
#V3 21 0.590401 21.254 0 0 0 8.741359
#V4 8.8581 8.74136 24.7627 22.5728 0 0
#V5 10000000 1000000000
#V6 2 0 100000 1.0e+06 1 5
#V7 1.0e+04 5 0
#V8 1.0e+06 5 0
#V9 1.0e+08 6 2.23607
#V10 1.0e+10 633 70.5868
#V11 1.0e+12 72733 2334.69
#V12 4 8.74131 212 912 0 29.45
#V13 68.9325 13.0475 0.65 
#V14 1 9.82365 3 389820 100000 2032510 100000000
#N 16
#L ar  pd_rate  pd_range  pd_curent  pd_counts  dy  ay  ar_enc  I00_gain  I0_gain  Epoch  seconds  I00  TR_diode  I000  I0  USAXS_PD
#C Tue Oct 05 10:00:25 2021.  Collected FLY2 jpeg/tiff data.
#C Tue Oct 05 10:00:25 2021.  FlyScan #9 started.
#C Tue Oct 05 10:00:25 2021.  FlyScan file name = ./10_05_Gadikota_usaxs/EmptyNMRTube1_0392.h5.
#C Tue Oct 05 10:00:25 2021.  Please wait approximately 105s to complete FlyScan..
#C Tue Oct 05 10:02:06 2021.  Flyscan done after 99s..
#C Tue Oct 05 10:02:06 2021.  FlyScan finished.
#C Tue Oct 05 10:02:06 2021.  Moving AR motor back to 8.74136 degrees (beam center).
#C Tue Oct 05 10:02:06 2021.  EmptyNMRTube1.
#C Tue Oct 05 10:02:07 2021.  Ready for USAXS mode.
#C Tue Oct 05 10:02:07 2021.  Moving USAXS slits and guard slits to correct place.
#C Tue Oct 05 10:02:07 2021.  Moving AR motor to 8.74136 degrees (beam center).
#C Tue Oct 05 10:02:08 2021.  for next scan: encoder ARenc_CENTER  =  8.7414.
#C Tue Oct 05 10:02:09 2021.  Ready for USAXS mode.

Easy enough to shift this to look at the next line. But specifying an explicit comment number is still fragile. (What if someone adds another comment early on?) Best to search all the comments for the first one with the key_string match.

Fix this part and Fly scans will again be plotted. Still the code will be slow but that is an optimization, not a bug.

prjemian commented 2 years ago

Now the fly scan plots appear on the livedata page.