weecology / retriever

Quickly download, clean up, and install public datasets into a database management system
http://data-retriever.org

FIA #17

Closed bendmorris closed 12 years ago

bendmorris commented 12 years ago

Traceback (most recent call last):
  File "", line 1, in
  File "scripts/fia.py", line 69, in download
    self.engine.insert_data_from_url(file)
  File "/usr/local/lib/python2.7/dist-packages/retriever-1.0.0-py2.7.egg/retriever/lib/engine.py", line 552, in insert_data_from_url
    self.insert_data_from_file(self.format_filename(filename))
  File "/usr/local/lib/python2.7/dist-packages/retriever-1.0.0-py2.7.egg/retriever/lib/engine.py", line 542, in insert_data_from_file
    self.add_to_table()
  File "/usr/local/lib/python2.7/dist-packages/retriever-1.0.0-py2.7.egg/retriever/lib/engine.py", line 65, in add_to_table
    for n in range(len(linevalues))]
  File "/usr/local/lib/python2.7/dist-packages/retriever-1.0.0-py2.7.egg/retriever/lib/engine.py", line 453, in format_insert_value
    return int(intvalue)
ValueError: invalid literal for int() with base 10: 'CPPSSS11'

Previous line:

INSERT INTO FIA_COND (cn, plt_cn, invyr, statecd, unitcd, countycd, plot, condid, cond_status_cd, cond_nonsample_reasn_cd, reservcd, owncd, owngrpcd, forindcd, adforcd, fortypcd, fldtypcd, mapden, stdage, stdszcd, fldszcd, siteclcd, sicond, sibase, sisp, stdorgcd, stdorgsp, prop_basis, condprop_unadj, micrprop_unadj, subpprop_unadj, macrprop_unadj, slope, aspect, physclcd, gsstkcd, alstkcd, dstrbcd1, dstrbyr1, dstrbcd2, dstrbyr2, dstrbcd3, dstrbyr3, trtcd1, trtyr1, trtcd2, trtyr2, trtcd3, trtyr3, presnfcd, balive, fldage, alstk, gsstk, fortypcdcalc, habtypcd1, habtypcd1_pub_cd, habtypcd1_descr_pub_cd, habtypcd2, habtypcd2_pub_cd, habtypcd2_descr_pub_cd, mixedconfcd, vol_loc_grp, siteclcdest, sitetree_tree, sitecl_method, carbon_down_dead, carbon_litter, carbon_soil_org, carbon_standing_dead, carbon_understory_ag, carbon_understory_bg, created_by, created_record_date, created_in_instance, modified_by, modified_record_date, modified_in_instance, cycle, subcycle, soil_rooting_depth_pnw, ground_land_class_pnw, plant_stockability_factor_pnw, stnd_cond_cd_pnwrs, stnd_struc_cd_pnwrs, stump_cd_pnwrs, fire_srs, grazing_srs, harvest_type1_srs, harvest_type2_srs, harvest_type3_srs, land_use_srs, operability_srs, stand_structure_srs, nf_cond_status_cd, nf_cond_nonsample_reasn_cd, canopy_cvr_sample_method_cd, live_canopy_cvr_pct, live_missing_canopy_cvr_pct, nbr_live_stems) VALUES (3337761010690, 3337759010690, 1989, 32, 1, 3, 351, 1, 1, 0, 1, 11, 10, 0, 417, 261, 261, 1, 164, 1, 0, 6, 21, 50, 108, 0, 0, '', 1, 1, 1, 0, 67, 289, 0, 1, 1, 52, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 260.6012, 0, 0, 0, 0, '', 0, 0, 0, 0, 0, '', '', 0, 0, 0, 6.885517, 18.602616, 14.039627, 7.507694, .393915, .043768, 0, '2004-05-28', 10690, 0, '2010-07-07', 10690, 1, 0, 0, 0, 0, 0, 0, '', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);

bendmorris commented 12 years ago

The problem is in NV_COND.CSV in columns "HABTYPCD1" and "HABTYPCD2"
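One way to confirm which values break the integer assumption is to scan the named columns for anything that int() rejects. The following is a minimal sketch, assuming a comma-delimited file with a header row; find_non_integer_values is a hypothetical helper and not part of retriever, and the sample rows are made up for illustration (only 'CPPSSS11' comes from the traceback).

```python
import csv
import io


def find_non_integer_values(lines, columns):
    """Return {column: sorted distinct non-empty values that int() rejects}.

    Hypothetical helper for illustration; not part of retriever.
    """
    offenders = {name: set() for name in columns}
    for row in csv.DictReader(lines):
        for name in columns:
            value = (row.get(name) or "").strip()
            if not value:
                continue  # empty cells are not the problem here
            try:
                int(value)
            except ValueError:
                offenders[name].add(value)
    return {name: sorted(values) for name, values in offenders.items()}


# Made-up sample data; only 'CPPSSS11' is taken from the traceback.
sample = io.StringIO(
    "CN,HABTYPCD1,HABTYPCD2\n"
    "1,,\n"
    "2,CPPSSS11,\n"
    "3,417,\n"
)
result = find_non_integer_values(sample, ["HABTYPCD1", "HABTYPCD2"])
print(result)
```

To run it against the real file, pass an open file object instead of the in-memory sample, for example open("NV_COND.CSV", newline="").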

bendmorris commented 12 years ago

The problem: to make sure each column had the correct data type, I was creating a "prep file" for each table by writing the first 10 lines of each state's data file to a single file, and using that to create the table and infer the column data types. Habitat type codes were never listed in the first 10 lines of any data file, so those columns were defaulting to INTEGER. I changed the number of lines copied from each file from 10 to 1500 so that more possible values are caught. Testing now.
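The failure mode described above can be sketched in a few lines. This is an illustration under stated assumptions, not retriever's actual code: infer_column_type and build_prep_sample are hypothetical names, the type labels other than INTEGER are placeholders, and the "state files" are made-up lists of values for a single column.

```python
def infer_column_type(values):
    """Infer a type from sampled values; an all-empty sample stays INTEGER.

    Hypothetical stand-in for the inference step, not retriever's code.
    """
    inferred = "INTEGER"
    for value in values:
        value = value.strip()
        if not value:
            continue  # empty cells carry no type information
        try:
            int(value)
            continue
        except ValueError:
            pass
        try:
            float(value)
            inferred = "DOUBLE"
        except ValueError:
            return "TEXT"  # any non-numeric value forces a text column
    return inferred


def build_prep_sample(files, lines_per_file):
    """Concatenate the first lines_per_file values from each file."""
    sample = []
    for rows in files:
        sample.extend(rows[:lines_per_file])
    return sample


# Made-up column data: the only non-numeric code sits on the 12th row.
state_a = [""] * 11 + ["CPPSSS11"] + [""] * 3
state_b = [""] * 20

small = infer_column_type(build_prep_sample([state_a, state_b], 10))
large = infer_column_type(build_prep_sample([state_a, state_b], 1500))
print(small, large)
```

With 10 lines per file the sample never sees the text code and the column is typed INTEGER; with 1500 it is typed TEXT. A larger sample lowers the chance of missing a value but cannot rule it out if a code first appears beyond the sampled lines.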

bendmorris commented 12 years ago

This issue is confirmed fixed.