gregelin / FCC-CDBS

Helpful scripts for working with FCC's CDBS Download
8 stars 7 forks source link

how many warnings from mysql? #2

Closed rkiddy closed 12 years ago

rkiddy commented 12 years ago

If you add this to the beginning of your fcc_cdbs_load_mysql.sql file:

set max_error_count=65535;

and add this after every LOAD DATA:

show warnings;

How many warnings do you see? I see lots and lots of warnings.

It is a really good start, though. I was thinking of doing this and am glad I found this code. May have pull requests for improvements.

gregelin commented 12 years ago

rkiddy,

Thanks for the observation and tip. I'm seeing many errors, too. I've limited max_error_count=10 to get a more overall view of what is happening. I will take a closer look at the issue.

gregelin commented 12 years ago

Looking at first set of Warnings.

Here is the SQL statement

--------------
LOAD DATA LOCAL INFILE '/codedata/FCC/data/public/cdbs/am_ant_sys.dat'
INTO TABLE am_ant_sys
FIELDS TERMINATED BY '|'
(ant_mode, ant_sys_id, application_id, aug_count, bad_data_switch, domestic_pattern, dummy_data_switch, efficiency_restricted, efficiency_theoretical, feed_circ_other, feed_circ_type, hours_operation, lat_deg, lat_dir, lat_min, lat_sec, lon_deg, lon_dir, lon_min, lon_sec, q_factor, q_factor_custom_ind, power, rms_augmented, rms_standard, rms_theoretical, tower_count, eng_record_type, biased_lat, biased_long, mainkey, am_dom_status, lat_whole_secs, lon_whole_secs, ant_dir_ind, grandfathered_ind, specified_hrs_range, augmented_ind, @last_update_date)
set
last_update_date = str_to_date(@last_update_date, '%m/%d/%Y')
--------------

Here are the warnings

These are the errors for the first row of data.

--------------
SHOW WARNINGS
--------------

Level   Code    Message
Warning 1265    Data truncated for column 'efficiency_restricted' at row 1
Warning 1265    Data truncated for column 'efficiency_theoretical' at row 1
Warning 1265    Data truncated for column 'q_factor' at row 1
Warning 1265    Data truncated for column 'rms_standard' at row 1
Warning 1366    Incorrect integer value: '' for column 'lat_whole_secs' at row 1
Warning 1366    Incorrect integer value: '' for column 'lon_whole_secs' at row 1
Warning 1262    Row 1 was truncated; it contained more data than there were input columns

Here is the source data (with header row added)

ant_mode|ant_sys_id|application_id|aug_count|bad_data_switch|domestic_pattern|dummy_data_switch|efficiency_restricted|efficiency_theoretical|feed_circ_other|feed_circ_type|hours_operation|lat_deg|lat_dir|lat_min|lat_sec|lon_deg|lon_dir|lon_min|lon_sec|q_factor|q_factor_custom_ind|power|rms_augmented|rms_standard|rms_theoretical|tower_count|eng_record_type|biased_lat|biased_long|mainkey|am_dom_status|lat_whole_secs|lon_whole_secs|ant_dir_ind|grandfathered_ind|specified_hrs_range|augmented_ind|@last_update_date
DAD|1|153|9||A||||||D|41|N|7|56.000000|80|W|45|40.000000||Y|0.500000|221.600000||204.390000|3|C|131.132000|260.761000|BDF52AC0009B3518|L|||Y||||02/12/1998|^|

Observations

Possible fixes

gregelin commented 12 years ago

After studying issue a bit more, most of the fields producing a "data truncated" warning are floats, not varchars.

Here is one possible solution

Replace fields with variable and check variable for having no value.


-- am_ant_sys
-- load am_ant_sys
LOAD DATA LOCAL INFILE '/codedata/FCC/data/public/cdbs/am_ant_sys.dat'
INTO TABLE am_ant_sys
FIELDS TERMINATED BY '|'
(ant_mode, ant_sys_id, application_id, aug_count, bad_data_switch, domestic_pattern, dummy_data_switch, @efficiency_restricted, @efficiency_theoretical, feed_circ_other, feed_circ_type, hours_operation, lat_deg, lat_dir, lat_min, lat_sec, lon_deg, lon_dir, lon_min, lon_sec, q_factor, q_factor_custom_ind, power, rms_augmented, rms_standard, rms_theoretical, tower_count, eng_record_type, biased_lat, biased_long, mainkey, am_dom_status, lat_whole_secs, lon_whole_secs, ant_dir_ind, grandfathered_ind, specified_hrs_range, augmented_ind, @last_update_date)
set efficiency_restricted = if(@efficiency_restricted <> '',@efficiency_restricted,null),
efficiency_theoretical = if(@efficiency_theoretical <> '',@efficiency_theoretical,null),
last_update_date = str_to_date(@last_update_date, '%m/%d/%Y');
SHOW WARNINGS;
gregelin commented 12 years ago

Issue resolved. Mysql reporting no Warnings now. If warnings are reported in future, they should be examined as they represent potential changes in data structure of downloaded files.

Add variables to catch null floats, null integer values, and fixed fields on table elevation_pattern_addl to match 1/20/12 updated field descriptions.

https://github.com/gregelin/FCC-CDBS/commit/24e0c247bfde1121a98a326cdd4babb2ebd8b0e1