bccp / nbodykit

Analysis kit for large-scale structure datasets, the massively parallel way
http://nbodykit.rtfd.io
GNU General Public License v3.0

Cannot read multiple FITS files #606

Closed mehdirezaie closed 4 years ago

mehdirezaie commented 4 years ago

Hi!

I cannot figure out why I get the error `IndexError: index 0 is out of bounds for axis 0 with size 0`. I do not know why, but it seems that what FITSCatalog returns has zero length. I wrote a script that reproduces the error when I try to read multiple files. The script and the full traceback are given below.

# run with: mpirun -np 2 python error_fitscatalog.py
import nbodykit.lab as nb
from nbodykit import CurrentMPIComm
comm = CurrentMPIComm.get()
rank = comm.rank
size = comm.size

if rank == 0:
    from argparse import ArgumentParser
    ap = ArgumentParser(description='FITCATALOG')
    ap.add_argument('--randoms', nargs='*', type=str, 
                    default=['/B/Shared/Shadab/FA_LSS/FA_EZmock_desi_ELG_v0_rand_00.fits',
                            '/B/Shared/Shadab/FA_LSS/FA_EZmock_desi_ELG_v0_rand_01.fits'])
    ns = ap.parse_args()
else:
    ns = None

ns      = comm.bcast(ns, root=0)

# read
randoms = nb.FITSCatalog(ns.randoms)

if rank == 0:
    print(randoms.columns, randoms.size, randoms.csize)
    print(randoms)

valid  = randoms['RA'] > 10
randoms = randoms[valid]  # where it breaks down

(py3p6) mehdi@lakme:~/github> mpirun -np 2 python error_fitscatalog.py
Traceback (most recent call last):
  File "error_fitscatalog.py", line 35, in <module>
    randoms = randoms[valid]
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/nbodykit/base/catalog.py", line 369, in __getitem__
    return self._get_slice(sel)
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/nbodykit/base/catalog.py", line 304, in _get_slice
    size = index.sum().compute()
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/dask/base.py", line 165, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/dask/base.py", line 436, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/dask/local.py", line 527, in get_sync
    return get_async(apply_sync, 1, dsk, keys, **kwargs)
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/dask/local.py", line 471, in get_async
    fire_task()
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/dask/local.py", line 466, in fire_task
    callback=queue.put,
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/dask/local.py", line 516, in apply_sync
    res = func(*args, **kwds)
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/dask/local.py", line 227, in execute_task
    result = pack_exception(e, dumps)
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/dask/local.py", line 222, in execute_task
    result = _execute_task(task, data)
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/dask/core.py", line 119, in _execute_task
    return func(*args2)
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/dask/array/core.py", line 104, in getter
    c = a[b]
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/nbodykit/io/base.py", line 245, in __getitem__
    toret = memown.read(self.keys(), start, stop, step)
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/nbodykit/io/stack.py", line 109, in read
    toret.append(self.files[fnum].read(columns, sl[0], sl[1], step=1))
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/nbodykit/io/fits.py", line 88, in read
    return fitsio.read(self.path, **kws)
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/fitsio/fitslib.py", line 112, in read
    data = fits[item].read(**keys)
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/fitsio/hdu/table.py", line 614, in read
    data = self.read_columns(columns, **keys)
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/fitsio/hdu/table.py", line 816, in read_columns
    rows = self._extract_rows(rows)
  File "/home/mehdi/miniconda3/envs/py3p6/lib/python3.7/site-packages/fitsio/hdu/table.py", line 1216, in _extract_rows
    if rows[0] < 0 or rows[-1] > maxrow:
IndexError: index 0 is out of bounds for axis 0 with size 0
^C[mpiexec@lakme] Sending Ctrl-C to processes as requested
[mpiexec@lakme] Press Ctrl-C again to force abort
['DEC', 'DZ_RSD', 'RA', 'Selection', 'Value', 'Weight', 'Z_COSMO'] 34000000 68000000
FITSCatalog(size=34000000, FileStack(FITSFile(path=/B/Shared/Shadab/FA_LSS/FA_EZmock_desi_ELG_v0_rand_00.fits, dataset=None, ncolumns=4, shape=(34000000,)>, ... 2 files))

rainwoodman commented 4 years ago

Yes. This is a bit strange. I'd be curious to see whether valid.sum() on rank 1 is zero. The fact that your program hung suggests the error occurred on only one rank.

For example, could this happen if the file is sorted by RA? Still, it seems to be a bug that fitsio crashes when reading an empty list of rows.
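To illustrate the failure mode named in the last traceback frame (`if rows[0] < 0 or rows[-1] > maxrow:`), here is a minimal sketch using plain NumPy. Both helper functions are hypothetical and are not fitsio's actual code; they only show that indexing the first element of an empty row list raises the same IndexError, and that checking for an empty array first avoids it.

```python
import numpy as np


def extract_rows_unguarded(rows, maxrow):
    """Hypothetical helper mimicking the bounds check in the traceback."""
    rows = np.asarray(rows, dtype="i8")
    # With an empty array, rows[0] raises IndexError before any comparison.
    if rows[0] < 0 or rows[-1] > maxrow:
        raise ValueError("rows must be in [0, %d]" % maxrow)
    return rows


def extract_rows_guarded(rows, maxrow):
    """Hypothetical variant that returns early when no rows are requested."""
    rows = np.asarray(rows, dtype="i8")
    if rows.size == 0:
        return rows  # nothing to read, nothing to validate
    if rows[0] < 0 or rows[-1] > maxrow:
        raise ValueError("rows must be in [0, %d]" % maxrow)
    return rows


# An empty request, e.g. a slice that selects no rows from one file.
empty_request = np.arange(5, 5)

try:
    extract_rows_unguarded(empty_request, maxrow=99)
    failed = False
    message = ""
except IndexError as err:
    failed = True
    message = str(err)

print("unguarded raised IndexError:", failed)
print("guarded result size:", extract_rows_guarded(empty_request, maxrow=99).size)
```

Whether an empty row list should return an empty result or raise a clearer error is a design choice for the library; the sketch only shows the early-return option.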

mehdirezaie commented 4 years ago

Thank you, Yu! No, the catalogs were not sorted. But your pull request helped! :)

rainwoodman commented 4 years ago

I assume you have a workaround for now till they merge and push out a new version of fitsio?

mehdirezaie commented 4 years ago

Yes. Thanks!

mehdirezaie commented 4 years ago

I just helped one of the students at OU install nbodykit, and we realized that the conda installation of nbodykit comes with fitsio 1.0.5, which lacks the fix that was included in 1.0.6 (https://github.com/esheldon/fitsio/commit/198d9a6963cfe072164ef44556ba5629327e32b8). Could you please update the conda bindings so that they work with the newest version of fitsio?
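For anyone checking an existing environment, a small sketch of a version test follows. The threshold 1.0.6 is taken from the comment above; the helper names `version_tuple` and `has_empty_rows_fix` are hypothetical, and the parser only handles plain dotted version strings (any suffix such as `rc1` is ignored).

```python
import re

# Assumption from the thread: the fix first shipped in fitsio 1.0.6.
FIXED_IN = (1, 0, 6)


def version_tuple(version):
    """Turn a dotted version string such as '1.0.5' into a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("cannot parse version string: %r" % version)
    return tuple(int(part) for part in match.group(0).split("."))


def has_empty_rows_fix(version):
    """Return True if the given fitsio version is at least FIXED_IN."""
    return version_tuple(version) >= FIXED_IN


print(has_empty_rows_fix("1.0.5"))
print(has_empty_rows_fix("1.0.6"))
```

With fitsio installed, the same check could be run as `has_empty_rows_fix(fitsio.__version__)` after `import fitsio`.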

rainwoodman commented 4 years ago

I think fitsio has been updated beyond 1.0.6. Is this issue no longer bugging you?

mehdirezaie commented 4 years ago

No, it is not bugging me anymore. I am going to close it. Thank you very much, Yu! :)