Open amyfromandi opened 3 months ago
These are the final results of the MariaDB migration within development!
Comparing macrostrat_temp (MariaDB) to macrostrat_temp.
Checking table counts...
SUCCESS: 0 All tables in macrostrat_temp (MariaDB) exist in macrostrat_temp.
SUCCESS: 4 macrostrat (PostgreSQL) tables succesfully copied into macrostrat_temp to retain data!
['strat_name_footprints', 'grainsize', 'pbdb_collections', 'pbdb_collections_strat_names']
Checking row counts...
SUCCESS: All row counts in all tables are the same in macrostrat_temp (MariaDB) and macrostrat_temp!
Checking column counts...
SUCCESS: Columns for 5 tables successfully copied over from macrostrat (PostgreSQL) into macrostrat_temp, to retain data!
lookup_unit_intervals 19 20 +1
col_areas 4 5 +1
cols 13 15 +2
intervals 8 9 +1
measuremeta 14 15 +1
Comparing macrostrat_temp to macrostrat (PostgreSQL).
Checking table counts...
SUCCESS: 40 macrostrat_temp tables copied over from MariaDB that do not exist in macrostrat (PostgreSQL). This confirms data retention!
[
'canada_lexicon_dump',
'interval_boundaries_scratch',
'lookup_measurements',
'minerals',
'offshore_fossils',
'pbdb_liths',
'uniquedatafiles2',
'unit_equiv',
'unit_dates',
'unit_liths_atts',
'colors',
'col_areas_6april2016',
'col_equiv',
'col_notes',
'interval_boundaries',
'measuremeta_cols',
'offshore_baggage',
'offshore_baggage_units',
'offshore_hole_ages',
'offshore_sections',
'offshore_sites',
'pbdb_intervals',
'pbdb_matches',
'rockd_features',
'ronov_sediment',
'stats',
'strat_names_lookup',
'structures',
'structure_atts',
'tectonics',
'temp_areas',
'units_datafiles',
'unit_boundaries_backup',
'unit_boundaries_scratch',
'unit_boundaries_scratch_old',
'unit_contacts',
'unit_measures_pbdb',
'unit_notes',
'unit_seq_strat',
'unit_tectonics'
]
SUCCESS: 3 macrostrat (PostgreSQL) tables did not copy into macrostrat_temp. These tables are irrelevant and do not need to be retained.
['temp_rocks', 'temp_names', 'unit_lith_atts']
Checking row counts...
SUCCESS: Row counts are greater in macrostrat_temp rather than macrostrat (PostgreSQL) for 26 tables, indicating data retention from Mariadb!
autocomplete 58599 58488 +111
col_areas 5357 5351 +6
lookup_strat_names 51229 51214 +15
lookup_units 127229 119462 +7767
lookup_unit_attrs_api 127228 119461 +7767
strat_names 51229 51210 +19
unit_strat_names 28322 28316 +6
units 127229 119508 +7721
unit_econs 3157 3148 +9
unit_environs 134617 125138 +9479
unit_liths 189907 173513 +16394
intervals 1715 1626 +89
refs 213 210 +3
cols 5654 5651 +3
col_groups 352 350 +2
col_refs 5716 5645 +71
liths 212 207 +5
lookup_unit_intervals 127229 119462 +7767
strat_names_places 50321 50288 +33
sections 12813 12713 +100
strat_tree 29467 17519 +11948
timescales 37 31 +6
timescales_intervals 2194 1970 +224
units_sections 127481 119762 +7719
unit_boundaries 135708 53799 +81909
unit_measures 105307 105048 +259
Checking column counts...
SUCCESS: Columns for 12 are greater in macrostrat_temp rather than macrostrat (PostgreSQL). This indicates data retention!
col_areas 5 4 +1
lookup_strat_names 28 22 +6
strat_names 9 5 +4
unit_strat_names 4 3 +1
units 15 13 +2
unit_environs 7 5 +2
intervals 9 8 +1
cols 15 16 -1
environs 6 5 +1
lith_atts 5 4 +1
measures 13 12 +1
sections 6 2 +4
strat_tree 6 4 +2
{'places': (0, 0)}
places 0 0 +0
SUCCESS: Notes column exists macrostrat but NOT in macrostrat_temp for units. macrostrat_temp.units.notes is its own table from Mariadb.
SUCCESS: Notes column exists macrostrat but NOT in macrostrat_temp for cols. macrostrat_temp.cols.notes is its own table from Mariadb.
SUCCESS: All columns in macrostrat exist in macrostrat_temp for these tables:
['col_refs', 'lookup_unit_attrs_api', 'lookup_unit_intervals', 'strat_names_meta', 'sections', 'unit_econs', 'lookup_strat_names', 'measures', 'projects', 'timescales', 'strat_tree', 'refs', 'unit_liths', 'lookup_units', 'measurements', 'units', 'autocomplete', 'col_areas', 'unit_strat_names', 'unit_environs', 'cols', 'intervals', 'lith_atts', 'timescales_intervals', 'unit_boundaries', 'econs', 'environs', 'units_sections', 'unit_measures', 'strat_names', 'lookup_unit_liths', 'liths', 'concepts_places', 'strat_names_places', 'col_groups', 'measuremeta', 'places']
check-data completed!
Following up from Issue #60, we migrated the MariaDB data into a new PostgreSQL
macrostrat_temp
database. Next, we counted and compared rows betweenmacrostrat_temp
and the already existentmacrostrat
database in PostgreSQL. Below is the summary of variances as well as their solutions. Our goal is to retain and not delete any data across bothmacrostrat
(PostgreSQL) andmacrostrat_temp
(MariaDB) db's.1. macrostrat tables not found in macrostrat_temp
['strat_name_footprints', 'grainsize', 'pbdb_collections', 'pbdb_collections_strat_names', 'temp_rocks', 'temp_names', 'unit_lith_atts']
2. macrostrat columns not found in macrostrat_temp
table: {'column_name'}: lookup_unit_intervals: {'best_interval_id'}, strat_tree: {'child', 'parent'}, units: {'notes'}, col_areas: {'wkt'}, cols: {'poly_geom', 'wkt', 'notes'}, intervals: {'rank'}, measuremeta: {'geometry'}
lookup_unit_intervals: {'best_interval_id'}
strat_tree: {'child', 'parent'}
this_name
column is renamed toparent
andthat_name
column tochild
respectively.units: {'notes'}
col_areas: {'wkt'}
cols: {'poly_geom', 'wkt', 'notes'}
intervals: {'rank'}
measuremeta: {'geometry'}