Open vlas-sokolov opened 6 years ago
Definitely not a file lock issue - I get up to 15 times slower load speeds for other FITS files on the same machine.
The issue seems to be in pyspeckit; SpectralCube doesn't slow down nearly as much:
```python
In [41]: fitsfile
Out[41]: 'NGC1333_NH3_11_base_DR1.fits'

In [42]: %timeit SpectralCube.read(fitsfile)
100 loops, best of 3: 8.6 ms per loop

In [43]: %timeit pyspeckit.Cube(fitsfile)
1 loop, best of 3: 22.1 s per loop

In [44]: # now run without 30/40 CPUs running jobs

In [45]: %timeit SpectralCube.read(fitsfile)
100 loops, best of 3: 6.66 ms per loop

In [46]: %timeit pyspeckit.Cube(fitsfile)
1 loop, best of 3: 484 ms per loop

In [47]: pyspeckit.__version__
Out[47]: '0.1.21.dev2691'
```
Tried running an artificial CPU/IO load with both `stress` and `dd` (SO link here); cannot reproduce. It looks like the pyspeckit jobs have to be running in parallel.
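For the record, a pure-Python stand-in for that kind of artificial load (hypothetical helper names; it just pins N cores with busy loops for a few seconds) would look like:

```python
import time
from multiprocessing import Pool


def burn(seconds=5.0):
    # busy-loop to pin one core for the given duration
    deadline = time.time() + seconds
    n = 0
    while time.time() < deadline:
        n += 1
    return n


def stress(n_procs=30, seconds=5.0):
    # spin up n_procs workers, each burning CPU for `seconds`
    with Pool(n_procs) as pool:
        return pool.map(burn, [seconds] * n_procs)
```

Running `stress(30)` while timing `pyspeckit.Cube(...)` in another session would mimic the loaded-machine scenario without involving pyspeckit itself.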
A relevant example (though this isn't exactly how I start the processes):
```python
import pyspeckit
from multiprocessing import Pool

def read(fname='NGC1333_NH3_11_base_DR1.fits'):
    # load the cube and discard it; we only care about the load time
    spc = pyspeckit.Cube(fname)

def multiread(n=6):
    fname = 'NGC1333_NH3_11_base_DR1.fits'
    with Pool(n) as p:
        p.map(read, [fname] * n)
```
```python
In [14]: %timeit read()
1 loop, best of 3: 683 ms per loop

In [15]: %timeit multiread(n=30)
1 loop, best of 3: 16.9 s per loop
```
Adam tells me that spectral_cube does lazy I/O on cubes, so the time overhead comes from loading the cubes into memory.
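The lazy-vs-eager distinction can be illustrated with a plain NumPy stand-in (hypothetical file name; spectral_cube's actual implementation differs, this just shows the memory-mapping idea):

```python
import time

import numpy as np

# stand-in for a FITS cube: ~25 MB of float32 on disk
cube = np.zeros((400, 128, 128), dtype=np.float32)
np.save('fake_cube.npy', cube)

t0 = time.perf_counter()
eager = np.load('fake_cube.npy')                 # reads the whole array into RAM
t_eager = time.perf_counter() - t0

t0 = time.perf_counter()
lazy = np.load('fake_cube.npy', mmap_mode='r')   # only maps the file, no bulk read
t_lazy = time.perf_counter() - t0

# data is pulled from disk only when a slice is actually touched
spectrum = np.array(lazy[:, 64, 64])
print(f'eager: {t_eager:.4f} s, lazy: {t_lazy:.4f} s, spectrum: {spectrum.shape}')
```

With 30 processes each eagerly reading the full cube, the disk has to serve 30 full copies at once, which would explain the slowdown scaling with the number of jobs.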
Solution? Switch to passing single spectra. Opening cubes in every child process was silly anyway: the sampling only needs individual spectra, not Cube/CubeStack instances, which have to be stored either one-by-one or together. I'm rewriting the scripts to operate on spectra only. Writing individual spectra is possible as `sp.write('foo.fits', tolerance=np.inf)`.
EDIT: this distorts the xaxis, which is forced to be uniform. The spectral data and errors are saved well enough, though; the only issue is preserving the correct `SpectroscopicAxis` instance. Since it is uniform for an entire cube, I should consider just saving an array along with the essential `xarrkwargs`.
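A minimal sketch of that idea (file names and kwargs contents are hypothetical; the axis would then be rebuilt on the pyspeckit side as `SpectroscopicAxis(values, **xarrkwargs)`):

```python
import json

import numpy as np


def save_xarr(xarr_values, xarrkwargs, stem='target_xarr'):
    # one .npy file for the axis values, one .json file for the kwargs
    np.save(stem + '.npy', np.asarray(xarr_values))
    with open(stem + '.json', 'w') as f:
        json.dump(xarrkwargs, f)


def load_xarr(stem='target_xarr'):
    values = np.load(stem + '.npy')
    with open(stem + '.json') as f:
        kwargs = json.load(f)
    return values, kwargs


# hypothetical NH3 (1,1)-like axis: 500 uniform channels near 23.694 GHz
save_xarr(np.linspace(23.69, 23.70, 500),
          {'unit': 'GHz', 'refX': 23.6944955, 'refX_unit': 'GHz'})
values, kwargs = load_xarr()
```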
Now we store the `xarr`, `spc.cube`, `spc.errorcube`, and `spc.header` on disk with a one-time function:
```python
In [618]: %timeit save_datacube(pspc, target_dir, target_cubefile, target_errfile, target_header)
1 loop, best of 3: 2.33 s per loop
```
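For illustration, a `save_datacube`-like one-time dump could be as simple as the following (function and argument names mirror the benchmark but are hypothetical; a real FITS header would be written with astropy instead of a plain-text dump):

```python
import os

import numpy as np


def save_datacube(cube, errorcube, header, target_dir,
                  cubefile='target_cube.npy', errfile='target_err.npy',
                  headerfile='target_header.txt'):
    # dump the data and error cubes as raw .npy so that single
    # spectra can later be memory-mapped out of them
    os.makedirs(target_dir, exist_ok=True)
    np.save(os.path.join(target_dir, cubefile), cube)
    np.save(os.path.join(target_dir, errfile), errorcube)
    with open(os.path.join(target_dir, headerfile), 'w') as f:
        f.write(repr(dict(header)))  # crude header dump, for the sketch only


# demo with a tiny fake cube and a one-entry header
cube = np.random.rand(100, 8, 8).astype(np.float32)
save_datacube(cube, np.ones_like(cube), {'OBJECT': 'NGC1333'}, 'cache')
```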
Followed by a fast-ish read function that operates on the products above:
```python
In [619]: %timeit spsp = get_spectrum(140, 140, target_dir, target_xarr, target_xarrkwargs, target_cubefile, target_errfile)
10 loops, best of 3: 25.5 ms per loop
```
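A matching `get_spectrum` sketch (again with hypothetical names) can pull out a single line of sight via `mmap_mode='r'`, which is what keeps the per-spectrum read cheap:

```python
import os

import numpy as np


def get_spectrum(x, y, target_dir, cubefile='target_cube.npy',
                 errfile='target_err.npy'):
    # memory-map the cubes so only the (x, y) line of sight is read from disk
    data = np.load(os.path.join(target_dir, cubefile), mmap_mode='r')
    errs = np.load(os.path.join(target_dir, errfile), mmap_mode='r')
    return np.array(data[:, y, x]), np.array(errs[:, y, x])


# self-contained demo: write a tiny fake cube, then read one spectrum
os.makedirs('cache', exist_ok=True)
cube = np.random.rand(100, 8, 8).astype(np.float32)
np.save('cache/target_cube.npy', cube)
np.save('cache/target_err.npy', np.ones_like(cube))
spec, err = get_spectrum(3, 5, 'cache')
```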
This compares rather well to the previously used method, which benchmarked at:

```python
In [622]: %timeit spc = make_cube()
1 loop, best of 3: 9 s per loop
```
The speedup feels great. What now?
Scripts running on large FITS files get sluggish execution speeds.
Here is the time it takes to open the NGC1333 GAS DR1 data when no other process is running; only one CPU shows up in the process monitor. And this is what happens when 30 processes (all sampling spectra) are working in the background. In principle, with 10 available cores you wouldn't expect the spectral cube initialization to be much slower, but the `%timeit` magic shows the contrary. Could it be that there's a traffic jam when reading the file?