Closed prjemian closed 2 years ago
[5:27 PM] Ilavsky, Jan I think the problem is that the Flyscan data conversion is failing. No Flyscan spec plots are being created. SAXS and WAXS are fine, but all Flyscans in 10_05_Gadikota are failing. We broke something and python is hanging with giant cpu use.
Note, they are running in spec now, that is what they know. So this is the standard spec flyscans with HDF5 files.Live data does have flyscan plots, so this is specifically for spec plots (2021/10/10_05_Gadikota).
Probably related to recent changes in SPEC code and fly scans. This used to work.
More notes from top
and ps
commands:
top - 18:09:41 up 50 days, 14:57, 23 users, load average: 4.07, 4.37, 4.29
Tasks: 514 total, 3 running, 511 sleeping, 0 stopped, 0 zombie
%Cpu(s): 42.0 us, 13.2 sy, 0.0 ni, 44.7 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem : 32488920 total, 1321784 free, 6085480 used, 25081656 buff/cache
KiB Swap: 33554428 total, 33137404 free, 417024 used. 25931836 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
22328 usaxs 20 0 1058292 375952 16888 R 225.8 1.2 10:01.22 python
9359 usaxs 20 0 1612144 305220 20556 R 122.5 0.9 68:53.85 python
18804 usaxs 20 0 4298972 378592 59848 S 32.1 1.2 1748:26 gnome-shell
#...
bash-4.2$ ps 22328
PID TTY STAT TIME COMMAND
22328 ? Rl 10:14 /APSshare/anaconda/x86_64/bin/python /home/beams/USAXS/Documents/eclipse/USAXS/livedata/specplotsAllScans.py /sha
bash-4.2$ ps 9359
PID TTY STAT TIME COMMAND
9359 pts/13 Rl 70:08 /APSshare/anaconda/x86_64/bin/python /home/beams/USAXS/Documents/eclipse/USAXS/livedata/pvwatch.py
Will examine Py2.7 files in directory: /home/beams/USAXS/Documents/eclipse/USAXS/livedata/
, reduceFlyData.py
and developer_test_reduceFlyData.py
Nothing unusual running developer_test_reduceFlyData.py
with test file
test_files = ["/share1/USAXS_data/2021-10/10_05_Gadikota/10_05_Gadikota_usaxs/AAM_50mM_CaCO3_206min_1039.h5", ]
Need to learn why livedata/specplotsAllScans.py
is using more than 200% CPU.
here:
bash-4.2$ git grep specplotsAllScans
archive/docs/index.html: specplotsAllScans.py
archive/docs/index.html: <dt><a href="specplotsAllScans.html"><pre>specplotsAllScans.py</pre></a></dt>
archive/docs/index.md: specplotsAllScans.py
archive/docs/index.md:specplotsAllScans.py | default plots of all scans in one SPEC data file
archive/docs/specplotsAllScans.html:<html><head><title>Python: module specplotsAllScans</title>
archive/docs/specplotsAllScans.html:<font color="#ffffff" face="helvetica, arial"> <br><big><big><strong>specplotsAllScans</strong></bi
archive/docs/specplotsAllScans.html:><font color="#ffffff" face="helvetica, arial"><a href=".">index</a><br><a href="file:/home/beams21/S15U
archive/docs/specplotsAllScans.html:<td width="100%"><strong>SVN_ID</strong> = '$Id: specplotsAllScans.py 376 2010-07-23 23:20:40Z jemian $'
buildSpecPlots.sh:# kill -9 `psg python | awk '/specplotsAllScans.py/ {print $2}' -`
buildSpecPlots.sh:PROGRAM=$CODE_DIR/specplotsAllScans.py
developer_test_specplotsAllScans.py:'''developer routine to test specplotsAllScans.py'''
developer_test_specplotsAllScans.py:import specplotsAllScans
developer_test_specplotsAllScans.py:specplotsAllScans.main()
Current data file has more than 4,000 SPEC scans in it!
bash-4.2$ grep "#S " /share1/USAXS_data/2021-10/10_05_Gadikota/10_05_Gadikota.dat | wc -l
4098
Most recent is:
#S 4096 FlyScan ar 8.74299 0 7.12597 2.3e-05
#S 4097 SAXS ./10_05_Gadikota_saxs/AAM_75mM_CaCO3_107min_1096.hdf 0 75 0.06 20 1
#S 4098 WAXS ./10_05_Gadikota_waxs/AAM_75mM_CaCO3_107min_1096.hdf 0 75 0.06 20 1
The first place the code slows down is when the SPEC file is read (by spec2nexus): https://github.com/APS-USAXS/livedata/blob/834226bddb31b3594b90a5f34f008367affd5adc/specplotsAllScans.py#L107
Simply not optimized for thousands of scans in a single SPEC file. Yikes!
Second place the code slows down is when looping over all the scans in the SPEC file: https://github.com/APS-USAXS/livedata/blob/834226bddb31b3594b90a5f34f008367affd5adc/specplotsAllScans.py#L138-L143
There is a TODO comment that the code needs to be optimized. Since this code is Py2.7 which is well beyond end-of-life, the code is a prime candidate for refactoring and use of python3.9 (most modern production release).
Is there something we can do now since a rewrite of livedata in py3.9 is a big job? Conversion of existing code to Py3.9 is quicker but the methods used here are now out of date and should be rethought.
Further:
None of this explains why Fly scans from SPEC aren't being plotted now. Will keep looking for that.
This code expects to find the HDF5 file name in the 3rd comment (python reference [2]
): https://github.com/APS-USAXS/livedata/blob/834226bddb31b3594b90a5f34f008367affd5adc/specplot.py#L37-L40
But now, there is an additional comment in the SPEC scan:
#S 9 FlyScan ar 8.74297 0 7.12596 2.3e-05
#D Tue Oct 05 10:00:25 2021
#T 0.45 (seconds)
#G0 0
#G1 0
#G3 0
#G4 0
#Q
#P0 -22.29064 20.2 0 0 -49 14.602783 185 3.2
#P1 5.894899e-07 5.00006 0 11.00057 0.18 -4 0.3005 29.428909
#P2 20 0 68.9325 26.355 -0.300098 8e-06 0.7999205 0.7999188
#P3 0 44.999661 0.5 -0.5 0.5 -0.5 20.000182 260
#P4 35.6487 0 0 0 0 0 0 0
#P5 0 0 0 0 8.7413585 -159.79288 88.550355 0
#P6 0 0 0 0 0 0 13.047491 21
#P7 3.7150163 4.2852875 4.7999125 4.1998625 8.858101 11.75625 23.082425 0
#C EmptyNMRTube1
#V0 101.93 0 1 0 1 1 1 1 0
#V1 -0.215743 0.0214993 72.4577 13.98 0 0 0 -0.0011394 -0.00239563
#V2 1.99486 -3.61273 -124.587 187.976
#V3 21 0.590401 21.254 0 0 0 8.741359
#V4 8.8581 8.74136 24.7627 22.5728 0 0
#V5 10000000 1000000000
#V6 2 0 100000 1.0e+06 1 5
#V7 1.0e+04 5 0
#V8 1.0e+06 5 0
#V9 1.0e+08 6 2.23607
#V10 1.0e+10 633 70.5868
#V11 1.0e+12 72733 2334.69
#V12 4 8.74131 212 912 0 29.45
#V13 68.9325 13.0475 0.65
#V14 1 9.82365 3 389820 100000 2032510 100000000
#N 16
#L ar pd_rate pd_range pd_curent pd_counts dy ay ar_enc I00_gain I0_gain Epoch seconds I00 TR_diode I000 I0 USAXS_PD
#C Tue Oct 05 10:00:25 2021. Collected FLY2 jpeg/tiff data.
#C Tue Oct 05 10:00:25 2021. FlyScan #9 started.
#C Tue Oct 05 10:00:25 2021. FlyScan file name = ./10_05_Gadikota_usaxs/EmptyNMRTube1_0392.h5.
#C Tue Oct 05 10:00:25 2021. Please wait approximately 105s to complete FlyScan..
#C Tue Oct 05 10:02:06 2021. Flyscan done after 99s..
#C Tue Oct 05 10:02:06 2021. FlyScan finished.
#C Tue Oct 05 10:02:06 2021. Moving AR motor back to 8.74136 degrees (beam center).
#C Tue Oct 05 10:02:06 2021. EmptyNMRTube1.
#C Tue Oct 05 10:02:07 2021. Ready for USAXS mode.
#C Tue Oct 05 10:02:07 2021. Moving USAXS slits and guard slits to correct place.
#C Tue Oct 05 10:02:07 2021. Moving AR motor to 8.74136 degrees (beam center).
#C Tue Oct 05 10:02:08 2021. for next scan: encoder ARenc_CENTER = 8.7414.
#C Tue Oct 05 10:02:09 2021. Ready for USAXS mode.
Easy enough to shift this to look at the next line. But specifying an explicit comment number is still fragile. (What if someone adds another comment early on?) Best to search all the comments for the first one with the key_string
match.
Fix this part and Fly scans will again be plotted. Still the code will be slow but that is an optimization, not a bug.
Now the fly scan plots appear on the livedata page.
Not getting any livedata plots for Fly scans from SPEC. 240% CPU time used by process that generates the plot.
Related to:
That one is started by this cron task