Closed acviana closed 2 years ago
Let's start by looking at the schema. What need to be removed? What needs to be added? As we discussed last week:
Here is the current schema:
$ describe master_images;
+------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| project_id | int(11) | YES | | NULL | |
| name | varchar(50) | YES | UNI | NULL | |
| fits_file | varchar(50) | YES | UNI | NULL | |
| object_name | varchar(50) | YES | | NULL | |
| set_id | int(11) | YES | | NULL | |
| set_index | int(11) | YES | | NULL | |
| width | int(11) | YES | | NULL | |
| height | int(11) | YES | | NULL | |
| minimum_ra | double | YES | | NULL | |
| minimum_dec | double | YES | | NULL | |
| maximum_ra | double | YES | | NULL | |
| maximum_dec | double | YES | | NULL | |
| pixel_resolution | float | YES | | NULL | |
| priority | int(11) | YES | | NULL | |
| description | varchar(50) | YES | | NULL | |
| file_location | varchar(100) | YES | | NULL | |
| visit | varchar(11) | YES | | NULL | |
| orbit | int(11) | YES | | NULL | |
| drz_mode | varchar(6) | YES | | NULL | |
| cr_mode | varchar(6) | YES | | NULL | |
+------------------+--------------+------+-----+---------+----------------+
21 rows in set (0.00 sec)
I think we should drop all the CQ specific fields. They don't anything to the MAST product. There are only 3 of these:
set_id
set_index
priority
There are a couple of others that are just kinda useless and we should drop while we're cleaning house anyway:
description
file_location
should be deleted because the location changes once we deliver the files so this is immediately irrelevant.Then there are some things that should be renamed and changed slightly:
-object_name
should be changed to jpl_object
and refer to the name of the object in the HORIZON system.
fits_file
should be changes to rootname
and refer to the 9-digit rootname for each record.Then there is bunch of stuff to add:
version
should be the value of the __version__
variable we're going to add to the mtpipeline
package.flt.fits
or c0m.fits
file.Changes to master_images
:
Dropped:
set_id
set_index
priority
description
I decided to leave file_location
because this column is used in build_master_finders_table.py
. I need to find a solution to find the file location before deleting this column.
Renamed object_name
as jpl_object
and fits_file
as root name
.
Added version
. What type should this column be?
I didn't understand this part:
Should 3 (e.g. fits_type
, png_type
and ds9_region
) columns be created?
Here are the suggested names for the FITS and PNG files. There are two sets let me know which one you prefer:
The base flt.fits or c0m.fits file.
From these last two we get 4 PNG files that need field names:
These are the ones I prefer:
But what columns show I add to master_images
to store these fields?
Here is the current schema for master_images
:
+------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| project_id | int(11) | YES | | NULL | |
| name | varchar(50) | YES | UNI | NULL | |
| rootname | varchar(50) | YES | UNI | NULL | |
| jpl_object | varchar(50) | YES | | NULL | |
| width | int(11) | YES | | NULL | |
| height | int(11) | YES | | NULL | |
| minimum_ra | double | YES | | NULL | |
| minimum_dec | double | YES | | NULL | |
| maximum_ra | double | YES | | NULL | |
| maximum_dec | double | YES | | NULL | |
| pixel_resolution | float | YES | | NULL | |
| file_location | varchar(100) | YES | | NULL | |
| visit | varchar(11) | YES | | NULL | |
| orbit | int(11) | YES | | NULL | |
| drz_mode | varchar(6) | YES | | NULL | |
| cr_mode | varchar(6) | YES | | NULL | |
| version | varchar(11) | YES | | NULL | |
+------------------+--------------+------+-----+---------+----------------+
+------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| project_id | int(11) | YES | | NULL | |
| name | varchar(50) | YES | UNI | NULL | |
| rootname | varchar(50) | YES | UNI | NULL | |
| jpl_object | varchar(50) | YES | | NULL | |
| width | int(11) | YES | | NULL | |
| height | int(11) | YES | | NULL | |
| minimum_ra | double | YES | | NULL | |
| minimum_dec | double | YES | | NULL | |
| maximum_ra | double | YES | | NULL | |
| maximum_dec | double | YES | | NULL | |
| pixel_resolution | float | YES | | NULL | |
| file_location | varchar(100) | YES | | NULL | |
| visit | varchar(11) | YES | | NULL | |
| orbit | int(11) | YES | | NULL | |
| fits_drz_mode | varchar(6) | YES | | NULL | |
| fits_cr_mode | varchar(6) | YES | | NULL | |
| png_drz_mode | varchar(6) | YES | | NULL | |
| png_cr_mode | varchar(6) | YES | | NULL | |
| ds | varchar(6) | YES | | NULL | |
| version | varchar(11) | YES | | NULL | |
+------------------+--------------+------+-----+---------+----------------+
Should the new schema be like this?
Here are the new fields that need to be added. They should all be sufficiently long varchar
fields to allow the full file name without the path. They should all also have uniqueness constraints.
file name | field name |
---|---|
rootname | rootname |
rootname_flt.fits | flt |
rootname_cr_flt.fits | cr_flt |
rootname_single_sci.fits | single_sci |
rootname_cr_single_sci.fits | cr_single_sci |
rootname_single_sci_linear.png | single_sci_linear |
rootname_cr_single_sci_linear.png | cr_single_sci_linear |
rootname_single_sci_log.png | single_sci_log |
rootname_cr_single_sci_log.png | cr_single_sci_log |
Ok. I added these fields in master_images
, I only need to add this uniqueness constraint in their properties.
I used these commands to add them:
alter table master_images add flt varchar(50);
alter table master_images add cr_flt varchar(50);
alter table master_images add single_sci varchar(50);
alter table master_images add cr_single_sci varchar(50);
alter table master_images add single_sci_linear varchar(50);
alter table master_images add cr_single_sci_linear varchar(50);
alter table master_images add single_sci_log varchar(50);
alter table master_images add cr_single_sci_log varchar(50);
I added a name
column in master_finders
and populated the rows with the following command:
mysql> update master_finders, master_images set master_finders.name = master_images.name where master_finders.master_images_id = master_images.id;
Query OK, 727863 rows affected (11.16 sec)
Rows matched: 727863 Changed: 727863 Warnings: 0
Now I will start to rebuild the master_images
table.
Before I made those changes I created a dump mtdump-2014-08-13.sql
and copied to /astro/mtpipeline/database_dumps/
.
I also dropped visit
, orbit
, cr_mode
and drz_mode
columns from master_images
. And changed project_id
to proposalid
. This is the current schema for the table:
+----------------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| proposalid | int(11) | YES | | NULL | |
| name | varchar(50) | YES | UNI | NULL | |
| rootname | varchar(50) | YES | UNI | NULL | |
| jpl_object | varchar(50) | YES | | NULL | |
| width | int(11) | YES | | NULL | |
| height | int(11) | YES | | NULL | |
| minimum_ra | double | YES | | NULL | |
| minimum_dec | double | YES | | NULL | |
| maximum_ra | double | YES | | NULL | |
| maximum_dec | double | YES | | NULL | |
| pixel_resolution | float | YES | | NULL | |
| version | varchar(11) | YES | | NULL | |
| flt | varchar(50) | YES | UNI | NULL | |
| cr_flt | varchar(50) | YES | UNI | NULL | |
| single_sci | varchar(50) | YES | UNI | NULL | |
| cr_single_sci | varchar(50) | YES | UNI | NULL | |
| single_sci_linear | varchar(50) | YES | UNI | NULL | |
| cr_single_sci_linear | varchar(50) | YES | UNI | NULL | |
| single_sci_log | varchar(50) | YES | UNI | NULL | |
| cr_single_sci_log | varchar(50) | YES | UNI | NULL | |
+----------------------+-------------+------+-----+---------+----------------+
I set master_images_id
in master_finders
as nullable and updated all rows as null
so that I could rebuild master_images
.
These are the commands I used:
alter table master_finders change master_images_id master_images_id int(11) null;
update master_finders set master_images_id = NULL;
Then I cleared the master_images
table:
mysql> delete from master_images;
Query OK, 50888 rows affected (21.07 sec)
I changed the code and committed it. I tested it on /astro/mtpipeline/mtpipeline_outputs/wfpc2/11156_neptune/u9zy5503m_c0m.fits
and it inserted a row in master_images
.
I created a new db dump mtdump-2014-08-13_v2.sql
and copied it to /astro/mtpipeline/database_dumps/
.
After looking at the changes to database_interface.py, we've decided that we will build the output filenames a bit differently.
Example:
file = 'hlsp_mt_hst_wfpc2_{}_*_c0m.fits'.format(self.rootname)
if os.exist(file):
self.single_sci = file.replace('_c0m.fits', '_img.fits')
self.cr_single_sci = file.replace('_c0m.fits', '_sci.fits')
I've pulled the update-master-finders
branch and I'm currently working on these changes.
This ticket is for the work that needs to be done to the
master_images
table for the v1.0.0 release. The 3 areas we need to address are:master_images
schemamaster_images
valuesmake_master_images_table.py