LSSTDESC / DC2-production

Configuration, production, validation specifications and tools for the DC2 Data Set.
BSD 3-Clause "New" or "Revised" License
11 stars 7 forks source link

Create PostgreSQL tables and parquet files for Run3.1i truth #423

Closed JoanneBogart closed 3 years ago

JoanneBogart commented 3 years ago

Truth tables in the form of sqlite files have been created for agn, lensed agn, lensed sne, and lensed host galaxies. They should be ingested into PostgreSQL and also translated to parquet.

JoanneBogart commented 3 years ago

For PostgreSQL the steps involved for each table are

  1. Extract schema information from sqlite; create comparable table in the PostgreSQL database. Datatype names differ. Also tables with ra,dec columns (truth summary, one for each object type) should have an additional column of type earth, typically called coord, which may be indexed and used for spatial searches.
  2. Extract data to a csv file
  3. Ingest data from csv into the PostgreSQL table
  4. If there is a coord column, compute values from ra,dec and update the database table
  5. Create indexes as needed

For parquet, use a python script previously used for other truth tables. It may need some tweaking.

Finally, validate: check that information has been preserved. For example, one might

JoanneBogart commented 3 years ago

All Run3.1i truth tables are in PostgreSQL:

Counts look ok.

JoanneBogart commented 3 years ago

Parquet files have been created.