askap-vast / vast-pipeline

This repository holds the code of the Radio Transient detection pipeline for the VAST project.
https://vast-survey.org/vast-pipeline/
MIT License

Empty catalogues cause run to fail during forced fitting #651

Closed by marxide 2 years ago

marxide commented 2 years ago

Encountered an exception for a run that contains some empty catalogues (see below). The function vast_pipeline.pipeline.forced_extraction.get_data_from_parquet attempts to determine the next valid island ID number as the existing maximum ID plus 1. To do so it first reads the ID prefix from the first row of the catalogue. For an empty catalogue there is no first row, so the df['island_id'].iloc[0] lookup raises an IndexError.

2022-04-01 19:47:44,279 forced_extraction INFO Starting force extraction step.
2022-04-01 20:47:37,122 runpipeline ERROR Processing error:
single positional indexer is out-of-bounds
Traceback (most recent call last):
  File "/usr/src/vast-pipeline/vast-pipeline-dev/vast_pipeline/management/commands/runpipeline.py", line 340, in run_pipe
    pipeline.process_pipeline(p_run)
  File "/usr/src/vast-pipeline/vast-pipeline-dev/vast_pipeline/pipeline/main.py", line 274, in process_pipeline
    ) = forced_extraction(
  File "/usr/src/vast-pipeline/vast-pipeline-dev/vast_pipeline/pipeline/forced_extraction.py", line 548, in forced_extraction
    extr_df = parallel_extraction(
  File "/usr/src/vast-pipeline/vast-pipeline-dev/vast_pipeline/pipeline/forced_extraction.py", line 315, in parallel_extraction
    db.from_sequence(
  File "/usr/src/vast-pipeline/.local/lib/python3.8/site-packages/dask/base.py", line 290, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/usr/src/vast-pipeline/.local/lib/python3.8/site-packages/dask/base.py", line 573, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/usr/src/vast-pipeline/.local/lib/python3.8/site-packages/dask/multiprocessing.py", line 220, in get
    result = get_async(
  File "/usr/src/vast-pipeline/.local/lib/python3.8/site-packages/dask/local.py", line 506, in get_async
    raise_exception(exc, tb)
  File "/usr/src/vast-pipeline/.local/lib/python3.8/site-packages/dask/local.py", line 314, in reraise
    raise exc
  File "/usr/src/vast-pipeline/.local/lib/python3.8/site-packages/dask/local.py", line 219, in execute_task
    result = _execute_task(task, data)
  File "/usr/src/vast-pipeline/.local/lib/python3.8/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/usr/src/vast-pipeline/.local/lib/python3.8/site-packages/dask/bag/core.py", line 1844, in reify
    seq = list(seq)
  File "/usr/src/vast-pipeline/.local/lib/python3.8/site-packages/dask/bag/core.py", line 2032, in __next__
    return self.f(*vals)
  File "/usr/src/vast-pipeline/vast-pipeline-dev/vast_pipeline/pipeline/forced_extraction.py", line 91, in get_data_from_parquet
    prefix = df['island_id'].iloc[0].rsplit('_', maxsplit=1)[0] + '_'
  File "/usr/src/vast-pipeline/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 967, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/usr/src/vast-pipeline/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1520, in _getitem_axis
    self._validate_integer(key, axis)
  File "/usr/src/vast-pipeline/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1452, in _validate_integer
    raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
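
One possible guard, as a minimal sketch rather than the fix that was actually merged: check for an empty catalogue before touching `.iloc[0]`. The helper names `island_prefix` and `next_island_number`, the fallback prefix `"island_"`, the starting number 1, and the assumption that island IDs end in `_<integer>` are all hypothetical and chosen only for illustration.

```python
import pandas as pd


def island_prefix(df: pd.DataFrame, fallback: str = "island_") -> str:
    """Return the island ID prefix, or a fallback for an empty catalogue."""
    if df.empty:
        # No rows: df['island_id'].iloc[0] would raise IndexError here,
        # so return the (hypothetical) fallback prefix instead.
        return fallback
    # Same expression as the failing line in the traceback.
    return df["island_id"].iloc[0].rsplit("_", maxsplit=1)[0] + "_"


def next_island_number(df: pd.DataFrame) -> int:
    """Return the largest numeric ID suffix plus 1 (1 for an empty catalogue)."""
    if df.empty:
        # Assumption: numbering starts at 1 when there are no existing islands.
        return 1
    # Assumption: every island ID ends in '_<integer>'.
    suffixes = df["island_id"].str.rsplit(pat="_", n=1).str[-1].astype(int)
    return int(suffixes.max()) + 1
```

With a guard like this, an empty catalogue yields a usable prefix and starting number, so the forced extraction step can continue and add its own measurements for that image.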
ajstewart commented 2 years ago

Interesting question of whether 'empty' images should be ingested in the first place.

I think the answer is yes, for the forced fitting and new source steps, but when I first saw this I did wonder for a minute.

marxide commented 2 years ago

Yes, I think they should, for the reasons you mentioned.