gully / k2-metadata

all metadata for the K2 mission of the Kepler Space Telescope
MIT License

Fate of this project #1

Open gully opened 6 years ago

gully commented 6 years ago

I had a good chat with @barentsen today about the fate of this project.

We want to combine all the tutorial content in one place, probably @christinahedges's "K2torials" notebooks.

We currently have three large main tables to combine:

  1. EPIC Catalog (the 7 GB parent catalog from which most targets are selected). Contains copious metadata, fluxes and errors, etc.
  2. GO proposals: all targets that were proposed, and their associated "Investigation IDs" (e.g. GO04033). May contain targets not in the EPIC Catalog, which appear instead in the "Custom Aperture File (CAF)".
  3. Scraped TPF Headers aka "K2-target-index" which contains metadata from all exported TPFs. May contain many sources per TPF (coincidentally observed), or many TPFs per source (CAF file).

We also identified a few more ancillary tables:

  1. The Guest Observer Metadata Table: contains the "Investigation IDs", PI name, Co-Is, proposal title, abstract, and eventually/ideally the ADS bibcode of the proposal once the proposals are posted to ADS.
  2. The CAF file: the table relating custom aperture IDs (which look like EPIC numbers but smaller; see the Huber et al. paper for details) to targets (e.g. solar system objects, asteroids, galaxies, etc.). Some of these have real EPIC IDs, but most do not.
  3. Scraped TPF Headers for the Reprocessed Data: some metadata in the reprocessed data might have changed.

We acknowledged value in the "other" category of cross-matching K2 with Gaia, SIMBAD, etc. Some of these cross-matches already exist.

gully commented 6 years ago

Importantly, Geert used language such as "many-to-many". The operations for joining these tables are really RDBMS join operations, which I'll likely just do in pandas and save as CSV files. However, the k2-target-index already saves .db files from SQLite, so it's possible we could do something fancier. We could also support dask, parquet, feather, and other formats, though we did not discuss these topics today.
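
To make the many-to-many point concrete, here is a minimal sketch of such a join using Python's built-in sqlite3 module. The table names, column names, EPIC numbers, and investigation IDs below are all made up for illustration; they are not the real schema of k2-target-index or of the GO proposal tables.

```python
import csv
import io
import sqlite3

# In-memory database with two toy tables (hypothetical schema and values).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE proposals (epic_id INTEGER, investigation_id TEXT);
CREATE TABLE tpf_headers (epic_id INTEGER, campaign INTEGER);
""")

# One target can be proposed by several investigations...
conn.executemany("INSERT INTO proposals VALUES (?, ?)", [
    (1001, "GO_A"), (1001, "GO_B"), (1002, "GO_A"),
])
# ...and one target can appear in several TPFs.
conn.executemany("INSERT INTO tpf_headers VALUES (?, ?)", [
    (1001, 4), (1001, 13), (1003, 4),
])

# LEFT JOIN keeps every proposed target, even if no TPF was exported for it.
rows = conn.execute("""
SELECT p.epic_id, p.investigation_id, t.campaign
FROM proposals AS p
LEFT JOIN tpf_headers AS t ON t.epic_id = p.epic_id
ORDER BY p.epic_id, p.investigation_id, t.campaign
""").fetchall()

# Write the joined table as CSV (to a string buffer here, a file in practice).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["epic_id", "investigation_id", "campaign"])
writer.writerows(rows)
csv_text = buffer.getvalue()
conn.close()
```

Target 1001 has two proposals and two TPFs, so the join yields four rows for it, which is the row multiplication to expect from a many-to-many relationship. Target 1002 was proposed but has no TPF, so it survives with an empty campaign field. In pandas, `DataFrame.merge` with `on="epic_id"` and `how="left"` should give the equivalent result.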