sot / kadi

Chandra commands and events
https://sot.github.io/kadi
BSD 3-Clause "New" or "Revised" License

Handle cmds.h5 resource unavailable error #196

Closed. taldcroft closed this 3 years ago

taldcroft commented 3 years ago

Description

Fix the problem seen in shiny testing from Replan Central by trying up to 3 times to read the cmds.h5 file.

    File "tables/hdf5extension.pyx", line 492, in tables.hdf5extension.File._g_new
    tables.exceptions.HDF5ExtError: HDF5 error back trace

     ...

    File "H5FDsec2.c", line 941, in H5FD_sec2_lock
    unable to lock file, errno = 11, error message = 'Resource temporarily unavailable'
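As a rough illustration of the retry idea, here is a minimal, self-contained sketch of a retry decorator with exponential backoff. It is not the code in this PR or in `ska_helpers`; the name `retry`, its parameters, and the `read_bytes` example are assumptions made for illustration. The defaults are picked only so that the waits come out as 0.5, 2.0 and 8.0 seconds, matching the warnings in the functional test output.

```python
import functools
import logging
import time

logger = logging.getLogger(__name__)


def retry(exceptions=Exception, tries=4, delay=0.5, backoff=4, sleep=time.sleep):
    """Hypothetical retry decorator: call the function up to ``tries`` times.

    After each failure except the last, log a warning, wait, and multiply
    the wait by ``backoff``. The last failure is re-raised unchanged.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, tries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions as exc:
                    if attempt == tries:
                        raise  # out of attempts, let the caller see the error
                    logger.warning(
                        "%s() exception: %s, retrying in %s seconds...",
                        func.__name__, exc, wait,
                    )
                    sleep(wait)
                    wait *= backoff
        return wrapper
    return decorator


# Illustrative use: any function that may hit a transient OSError.
@retry(exceptions=OSError)
def read_bytes(path):
    with open(path, "rb") as fh:
        return fh.read()
```

Passing `sleep` as a parameter is only there so the waits can be replaced with a fake in a test instead of actually sleeping.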

@jeanconn also remembered a plan B: copy the cmds.* files to a temp dir for updating, as is done for events in #124.
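For reference, the plan B idea could look roughly like the sketch below. This is not what #124 does in detail, and `update_via_temp_copy` and `update_func` are hypothetical names. It assumes the update step can work on copies in a directory it is given, and that swapping the finished files into place with `os.replace` is acceptable.

```python
import os
import shutil
import tempfile
from pathlib import Path


def update_via_temp_copy(data_dir, pattern, update_func):
    """Hypothetical helper: update copies of files, then swap them into place.

    ``update_func(tmp_dir)`` is assumed to modify the copied files in ``tmp_dir``.
    """
    data_dir = Path(data_dir)
    # Put the temp dir next to the data so os.replace stays on one filesystem.
    with tempfile.TemporaryDirectory(dir=data_dir) as tmp:
        tmp_dir = Path(tmp)
        originals = sorted(p for p in data_dir.glob(pattern) if p.is_file())
        for path in originals:
            shutil.copy2(path, tmp_dir / path.name)

        # Readers keep using the untouched originals while this runs.
        update_func(tmp_dir)

        # Swap each updated file into place in a single rename per file.
        for path in sorted(tmp_dir.iterdir()):
            os.replace(path, data_dir / path.name)
```

With this layout the files being read are never open for writing, so a reader should only ever see a complete old or a complete new file.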

Testing

Functional testing

I changed the code to force a bad filename (`cmds.h5a`), since I don't know how to reproduce the 'Resource temporarily unavailable' error reliably.
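One possible way to see the underlying OS-level condition, without involving HDF5 at all, is sketched below. It assumes the error comes from a non-blocking advisory `flock` on the file, as the `H5FD_sec2_lock` frame suggests; that is an inference, not something verified here. The function name `try_lock_conflict` is hypothetical and the sketch is Unix-only because it uses `fcntl`.

```python
import errno
import fcntl
import tempfile


def try_lock_conflict():
    """Return the errno from a non-blocking lock attempt on an already-locked file."""
    with tempfile.NamedTemporaryFile() as tmp:
        # Two independent opens of the same file, standing in for two processes.
        with open(tmp.name, "rb") as holder, open(tmp.name, "rb") as contender:
            fcntl.flock(holder, fcntl.LOCK_EX)  # first opener holds an exclusive lock
            try:
                # A non-blocking shared lock cannot be granted while that lock is held.
                fcntl.flock(contender, fcntl.LOCK_SH | fcntl.LOCK_NB)
            except OSError as exc:
                return exc.errno
            finally:
                fcntl.flock(holder, fcntl.LOCK_UN)
    return None


if __name__ == "__main__":
    code = try_lock_conflict()
    print(code, errno.errorcode.get(code))
```

This only shows the lock conflict itself; whether it matches what happens with cmds.h5 during an update has not been checked.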

In [1]: from kadi.commands import get_cmds
In [2]: cmds = get_cmds('2020:100')

WARNING: load_idx_cmds() exception: ``/Users/aldcroft/ska/data/kadi/cmds.h5a`` does not exist, retrying in 0.5 seconds...
WARNING: load_idx_cmds() exception: ``/Users/aldcroft/ska/data/kadi/cmds.h5a`` does not exist, retrying in 2.0 seconds...
WARNING: load_idx_cmds() exception: ``/Users/aldcroft/ska/data/kadi/cmds.h5a`` does not exist, retrying in 8.0 seconds...
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~/git/kadi/kadi/commands/commands.py in __getattribute__(self, name)
     22         try:
---> 23             val = object.__getattribute__(self, '_val')
     24         except AttributeError:

AttributeError: 'LazyVal' object has no attribute '_val'

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
~/git/kadi/go.py in <module>
      5 
      6 from kadi.commands import get_cmds
----> 7 cmds = get_cmds('2020:300')

~/git/kadi/kadi/commands/commands.py in get_cmds(start, stop, inclusive_stop, **kwargs)
    111     :returns: :class:`~kadi.commands.commands.CommandTable` of commands
    112     """
--> 113     cmds = _find(start, stop, inclusive_stop, **kwargs)
    114     out = CommandTable(cmds)
    115     out['params'] = None if len(out) > 0 else Column([], dtype=object)

~/git/kadi/kadi/commands/commands.py in _find(start, stop, inclusive_stop, **kwargs)
    218     :returns: astropy Table of commands
    219     """
--> 220     ok = np.ones(len(idx_cmds), dtype=bool)
    221     par_ok = np.zeros(len(idx_cmds), dtype=bool)
    222 

~/git/kadi/kadi/commands/commands.py in __len__(self)
     41 
     42     def __len__(self):
---> 43         return self._val.__len__()
     44 
     45 

~/git/kadi/kadi/commands/commands.py in __getattribute__(self, name)
     23             val = object.__getattribute__(self, '_val')
     24         except AttributeError:
---> 25             val = object.__getattribute__(self, '_load_func')()
     26             self._val = val
     27 

~/git/ska_helpers/ska_helpers/retry/api.py in wrapper(*args, **kwargs)
     77         @functools.wraps(f)
     78         def wrapper(*args, **kwargs):
---> 79             return __retry_internal(f, exceptions, tries, delay, max_delay,
     80                                     backoff, jitter, logger, args=args, kwargs=kwargs)
     81         return wrapper

~/git/ska_helpers/ska_helpers/retry/api.py in __retry_internal(f, exceptions, tries, delay, max_delay, backoff, jitter, logger, args, kwargs)
     30     while _tries:
     31         try:
---> 32             return f(*args, **kwargs)
     33         except exceptions as e:
     34             _tries -= 1

~/git/kadi/kadi/commands/commands.py in load_idx_cmds()
     56     unable to lock file, errno = 11, error message = 'Resource temporarily unavailable'
     57     """
---> 58     with tables.open_file(IDX_CMDS_PATH() + 'a', mode='r') as h5:
     59         idx_cmds = Table(h5.root.data[:])
     60 

~/miniconda3/envs/ska3-shiny/lib/python3.8/site-packages/tables/file.py in open_file(filename, mode, title, root_uep, filters, **kwargs)
    313 
    314     # Finally, create the File instance, and return it
--> 315     return File(filename, mode, title, root_uep, filters, **kwargs)
    316 
    317 

~/miniconda3/envs/ska3-shiny/lib/python3.8/site-packages/tables/file.py in __init__(self, filename, mode, title, root_uep, filters, **kwargs)
    776 
    777         # Now, it is time to initialize the File extension
--> 778         self._g_new(filename, mode, **params)
    779 
    780         # Check filters and set PyTables format version for new files.

tables/hdf5extension.pyx in tables.hdf5extension.File._g_new()

~/miniconda3/envs/ska3-shiny/lib/python3.8/site-packages/tables/utils.py in check_file_access(filename, mode)
    152         # The file should be readable.
    153         if not os.access(filename, os.F_OK):
--> 154             raise IOError("``%s`` does not exist" % (filename,))
    155         if not os.path.isfile(filename):
    156             raise IOError("``%s`` is not a regular file" % (filename,))

OSError: ``/Users/aldcroft/ska/data/kadi/cmds.h5a`` does not exist

javierggt commented 3 years ago

A side note on this:

I always find these exception chains confusing, although this case is simple because the first exception is short. The first exception does not add any information, so I would write it like:

        try:
            val = object.__getattribute__(self, '_val')
        except AttributeError:
            try:
                val = object.__getattribute__(self, '_load_func')()
                self._val = val
            except Exception as e:
                raise e from None

That suppresses the first exception in the traceback and shows only the second.
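To make the effect of `from None` concrete, here is a small standalone sketch. The class name `LazyVal` and the `_val` / `_load_func` attributes follow the traceback above, but the rest of the class is a simplified assumption, not the actual kadi code.

```python
class LazyVal:
    """Simplified stand-in: load a value on first attribute access."""

    def __init__(self, load_func):
        self._load_func = load_func

    def __getattribute__(self, name):
        try:
            val = object.__getattribute__(self, '_val')
        except AttributeError:
            try:
                val = object.__getattribute__(self, '_load_func')()
                self._val = val
            except Exception as e:
                # Hide the internal AttributeError when the traceback is printed.
                raise e from None
        return getattr(val, name)


def failing_loader():
    raise OSError("file does not exist")


try:
    LazyVal(failing_loader).anything
except OSError as exc:
    # With "from None" there is no explicit cause and the implicit context is
    # marked as suppressed, so the "During handling of the above exception"
    # section is not shown.
    print(exc.__cause__, exc.__suppress_context__)
```

Without `from None`, `__suppress_context__` stays `False` and the printed traceback includes the `AttributeError` first.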

javierggt commented 3 years ago

but I guess it's a matter of taste

taldcroft commented 3 years ago

@javierggt - your comment is fair enough but OBE since I moved to using a new module in ska_helpers for retrying. I'll admit that the stack trace is probably even more confusing now since there is also a decorator in the mix, but I've already spent too much time on this and it does seem to work. I'm going to do the same retry for the update code (in a separate PR).