noraeisner / LATTE

Lightcurve Analysis Tool for Transiting Exoplanet
GNU Lesser General Public License v3.0

Better error handling for downloading data file (TOI_list.txt, etc.) #37

Open orionlee opened 3 years ago

orionlee commented 3 years ago

I came across a case where the downloaded data/TOI_list.txt was corrupted. Instead of a CSV, it was an HTML file indicating a server error.

Because of the corrupted TOI_list.txt, generating the data validation report fails with a cryptic error:

  File "pandas\_libs\hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'TIC'

Reason: the current download logic only treats HTTP 404 as an error. In my case, the response was HTTP 504 Gateway Time-out.

https://github.com/noraeisner/LATTE/blob/766661509d8217e2fd025fbf91c23c92f6d410be/LATTE/LATTEutils.py#L2562-L2566

  1. The check if r_TOI.status_code == 404: could be more general: if r_TOI.status_code >= 400:
  2. To be extra safe, the code could also validate the downloaded file before overwriting the existing TOI_list.txt.
  3. Many other data download code paths have the same problem: https://github.com/noraeisner/LATTE/search?q=.status_code+%3D%3D+404
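
A rough sketch of points 1 and 2 together. It assumes the response comes from requests (so the status code and body would be r_TOI.status_code and r_TOI.text), and that a valid TOI_list.txt has a header row containing a TIC column, possibly preceded by # comment lines. The helper names looks_like_toi_table and save_if_valid are hypothetical, not existing LATTE functions.

```python
import csv
import os
import tempfile


def looks_like_toi_table(text, required_columns=("TIC",)):
    """Return True if the first non-comment line is a CSV header
    containing every required column (assumption: header comes first)."""
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comment lines
        header = next(csv.reader([stripped]))
        names = {name.strip() for name in header}
        return all(col in names for col in required_columns)
    return False  # empty body or comments only


def save_if_valid(status_code, text, dest_path, required_columns=("TIC",)):
    """Replace dest_path with text only if the download looks good.

    Returns True if the file was replaced, False if the old file was kept.
    """
    # Point 1: reject every client/server error, not just 404.
    if status_code >= 400:
        return False
    # Point 2: reject bodies that are not the expected table,
    # e.g. an HTML error page.
    if not looks_like_toi_table(text, required_columns):
        return False
    # Write to a temporary file in the same directory, then swap it in,
    # so a failed write never leaves a half-written TOI_list.txt behind.
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as handle:
            handle.write(text)
        os.replace(tmp_path, dest_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return True
```

With something like this, a 504 response (or a 200 response whose body is an HTML error page) would leave the previously downloaded TOI_list.txt untouched, and the caller could print a warning when save_if_valid returns False. The same helper could be reused for the other download sites by passing different required_columns.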

Full stack trace of the cryptic error:

Traceback (most recent call last):
  File "C:\pkg\_winNonPortables\Anaconda3\envs\LATTE-dev\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\pkg\_winNonPortables\Anaconda3\envs\LATTE-dev\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\dev\_juypter\LATTE\LATTE\__main__.py", line 542, in <module>
    utils.interact_LATTE(tic, indir, syspath, sectors_all, sectors, ra, dec, args)  # the argument of whether to shos the images or not
  File "C:\dev\_juypter\LATTE\LATTE\LATTEutils.py", line 690, in interact_LATTE
    brew.brew_LATTE(tic, indir, syspath, transit_list, simple, BLS, model, save, DV, sectors, sectors_all, alltime, allflux, allflux_err, all_md, alltimebinned, allfluxbinned, allx1, allx2, ally1, ally2, alltime12, allfbkg, start_sec, end_sec, in_sec, tessmag, teff, srad, ra, dec, args)
  File "C:\dev\_juypter\LATTE\LATTE\LATTEbrew.py", line 387, in brew_LATTE
    ldv.LATTE_DV(tic, indir, syspath, transit_list, sectors_all, target_ra, target_dec, tessmag, teff, srad, mstar, vmag, logg, plx, c_id, [0], [0], tpf_corrupt, astroquery_corrupt, FFI = False,  bls = False, model = model, mpi = args.mpi)
  File "C:\dev\_juypter\LATTE\LATTE\LATTE_DV.py", line 84, in LATTE_DV
    TOIpl = TOI_planets.loc[TOI_planets['TIC'] == float(tic)]
  File "C:\pkg\_winNonPortables\Anaconda3\envs\LATTE-dev\lib\site-packages\pandas\core\frame.py", line 2899, in __getitem__
    indexer = self.columns.get_loc(key)
  File "C:\pkg\_winNonPortables\Anaconda3\envs\LATTE-dev\lib\site-packages\pandas\core\indexes\base.py", line 2891, in get_loc
    raise KeyError(key) from err
KeyError: 'TIC'