transientskp / tkp

A transients-discovery pipeline for astronomical image-based surveys
http://docs.transientskp.org/
BSD 2-Clause "Simplified" License

Bucket cache error, #531

Status: Closed (mkuiack closed this issue 8 years ago)

mkuiack commented 8 years ago

Running in batch mode with this section in pipeline.cfg:

[parallelise]
method = "multiproc"  ; or serial
cores = 0  ; the number of cores to use. Set to 0 for auto detect

yields

15:32:12 INFO tkp.main: dataset batch1/ contains 3526 images
15:32:12 INFO root: storing copies in image cache is disabled
Traceback (most recent call last):
  File "/scratch/mkuiack/trapvenv/bin/trap-manage.py", line 6, in <module>
    exec(compile(open(__file__).read(), __file__, 'exec'))
  File "/scratch/mkuiack/tkp/tkp/bin/trap-manage.py", line 10, in <module>
    tkp.management.main()
  File "/scratch/mkuiack/tkp/tkp/management.py", line 329, in main
    args.func(args)
  File "/scratch/mkuiack/tkp/tkp/management.py", line 227, in run_job
    run(args.name, monitor_coords)
  File "/scratch/mkuiack/tkp/tkp/main.py", line 393, in run
    run_batch(job_name, job_dir, pipe_config, job_config, runner, dataset_id)
  File "/scratch/mkuiack/tkp/tkp/main.py", line 366, in run_batch
    sorting_metadata = get_metadata_for_sorting(runner, image_paths)
  File "/scratch/mkuiack/tkp/tkp/main.py", line 269, in get_metadata_for_sorting
    nested_img)]
  File "/scratch/mkuiack/tkp/tkp/distribute/__init__.py", line 42, in map
    return self.module.map(func, iterable, args)
  File "/scratch/mkuiack/tkp/tkp/distribute/multiproc/__init__.py", line 42, in map
    return pool.map_async(func, zipped).get(9999999)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 558, in get
    raise self._value
RuntimeError: BucketCache::getBucket: bucket 29 exceeds nr of buckets

mkuiack commented 8 years ago

Setting cores to 20, 10, or 1 produced the same error.

mkuiack commented 8 years ago

Switching to serial mode gave the same error:

method = 'serial'

16:27:57 INFO tkp.main: dataset batch1/ contains 3526 images
16:27:57 INFO root: storing copies in image cache is disabled
Traceback (most recent call last):
  File "/scratch/mkuiack/trapvenv/bin/trap-manage.py", line 6, in <module>
    exec(compile(open(__file__).read(), __file__, 'exec'))
  File "/scratch/mkuiack/tkp/tkp/bin/trap-manage.py", line 10, in <module>
    tkp.management.main()
  File "/scratch/mkuiack/tkp/tkp/management.py", line 329, in main
    args.func(args)
  File "/scratch/mkuiack/tkp/tkp/management.py", line 227, in run_job
    run(args.name, monitor_coords)
  File "/scratch/mkuiack/tkp/tkp/main.py", line 393, in run
    run_batch(job_name, job_dir, pipe_config, job_config, runner, dataset_id)
  File "/scratch/mkuiack/tkp/tkp/main.py", line 366, in run_batch
    sorting_metadata = get_metadata_for_sorting(runner, image_paths)
  File "/scratch/mkuiack/tkp/tkp/main.py", line 269, in get_metadata_for_sorting
    nested_img)]
  File "/scratch/mkuiack/tkp/tkp/distribute/__init__.py", line 42, in map
    return self.module.map(func, iterable, args)
  File "/scratch/mkuiack/tkp/tkp/distribute/serial/__init__.py", line 3, in map
    x = [func(i, *arguments) for i in iterable]
  File "/scratch/mkuiack/tkp/tkp/distribute/serial/tasks.py", line 45, in get_metadata_for_ordering
    for a in tkp.steps.persistence.get_accessors(images):
  File "/scratch/mkuiack/tkp/tkp/steps/persistence.py", line 160, in get_accessors
    accessor = tkp.accessors.open(image)
  File "/scratch/mkuiack/tkp/tkp/accessors/__init__.py", line 75, in open
    return Accessor(path, *args, **kwargs)
  File "/scratch/mkuiack/tkp/tkp/accessors/requiredatts.py", line 61, in __call__
    obj = type.__call__(cls, *args, **kwargs)
  File "/scratch/mkuiack/tkp/tkp/accessors/aartfaaccasaimage.py", line 11, in __init__
    super(AartfaacCasaImage, self).__init__(url, plane=0, beam=None)
  File "/scratch/mkuiack/tkp/tkp/accessors/casaimage.py", line 30, in __init__
    self.data = self.parse_data(table, plane)
  File "/scratch/mkuiack/tkp/tkp/accessors/casaimage.py", line 45, in parse_data
    data = table[0]['map'].squeeze()
  File "/scratch/mkuiack/trapvenv/local/lib/python2.7/site-packages/casacore/tables/table.py", line 393, in __getitem__
    return self._row._getitem (key, self.nrows());
  File "/scratch/mkuiack/trapvenv/local/lib/python2.7/site-packages/casacore/tables/tablerow.py", line 67, in _getitem
    return self.get (rownr);
  File "/scratch/mkuiack/trapvenv/local/lib/python2.7/site-packages/casacore/tables/tablerow.py", line 48, in get
    return self._get (rownr)
RuntimeError: BucketCache::getBucket: bucket 29 exceeds nr of buckets

gijzelaerr commented 8 years ago

So this is actually a casacore error as well.

gijzelaerr commented 8 years ago

OK, this is also caused by the faulty image. I replaced the bad image in the dataset with the correct one, and it now works for me. Closing the issue; please reopen if you think otherwise.

>>> from casacore.images import image
>>> x = image('S301_R0-62_T02-06-2016_21-27-41.image')
>>> x.getdata()
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-5-8afe7a814c4c> in <module>()
----> 1 x.getdata()

/home/gijs/Work/tkp/.virtualenv/local/lib/python2.7/site-packages/casacore/images/image.pyc in getdata(self, blc, trc, inc)
    296         return self._getdata (self._adjustBlc(blc),
    297                               self._adjustTrc(trc),
--> 298                               self._adjustInc(inc));
    299 
    300     # Negate the mask; in numpy True means invalid.

RuntimeError: BucketCache::getBucket: bucket 29 exceeds nr of buckets
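
Since a single faulty image was enough to abort a 3526-image batch, it may help to check every image before starting a run. The following is a minimal sketch of such a check, not part of TraP or casacore. It relies only on the `image(...)` and `getdata()` calls shown above. The names `open_casa_image`, `find_unreadable_images` and `read` are hypothetical, and it assumes a damaged image always fails with a `RuntimeError`, as it did here.

```python
def open_casa_image(path):
    """Open a CASA image and read its full pixel array (needs python-casacore)."""
    # Imported lazily so the scanning helper can be used without casacore installed.
    from casacore.images import image
    return image(path).getdata()


def find_unreadable_images(paths, read=open_casa_image):
    """Return (path, error message) pairs for images whose data cannot be read.

    `read` is a hypothetical hook for this sketch: any callable that takes a
    path and raises RuntimeError when the image data cannot be read.
    """
    bad = []
    for path in paths:
        try:
            read(path)
        except RuntimeError as exc:
            # The BucketCache failure in this thread surfaced as a RuntimeError.
            bad.append((path, str(exc)))
    return bad
```

For example, `find_unreadable_images(sorted(glob.glob('batch1/*.image')))` would list the images to replace before starting the job. The glob pattern is an assumption based on the dataset and file names in this thread.
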
o-smirnov commented 4 years ago

Aha, googling this mysterious error message (https://github.com/caracal-pipeline/caracal/issues/1232) turns up some familiar names!

Any idea what all of this means?

gijzelaerr commented 4 years ago

Hehe, a walk down memory lane. I have no idea, though. I think it was a malformed image, but I can't remember what the mutilation was. It's probably best to take it to the casacore tracker.