UA-SRC-data / data_loaders

Fix parsing of field_data.csv files in CSM data #4

Open ramonawalls opened 3 years ago

ramonawalls commented 3 years ago

There is an error in all of the $Date_field_data.csv files in /iplant/home/rwalls/ua-src-data/csm/water_chem/data

The first row looks like this:

measurement     sowm    ref1_7_02   ant_6_28    rap_6_3     rbp_6_09    arg_6_45    usgs_7_25

The first row should instead look something like this:

measurement     sowm    ref1    ant     rap     rbp     arg     usgs

and the second row should contain the values: 7.02, 6.28 …
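
For illustration only, here is a minimal sketch of how one fused header cell could be split into a name and a value. It is not the repository's parsing script. The names `FUSED` and `split_header_cell` are hypothetical, and the sketch assumes the value is always encoded as a trailing `_<whole>_<fraction>` suffix, so a site name that itself ends in two numeric parts would be misread.

```python
import re

# Hypothetical pattern: a name followed by "_<whole>_<fraction>",
# where either numeric part may be empty (e.g. "arg__").
FUSED = re.compile(r"^(?P<name>.+?)_(?P<whole>\d*)_(?P<frac>\d*)$")


def split_header_cell(cell):
    """Return (name, value) for a header cell; value is "" when absent."""
    match = FUSED.match(cell)
    if match is None:
        # Cells such as "measurement" or "sowm" carry no fused value.
        return cell, ""
    whole, frac = match.group("whole"), match.group("frac")
    if whole and frac:
        value = f"{whole}.{frac}"  # keep as text so "7_02" stays "7.02"
    else:
        value = whole  # "" when the reading is missing
    return match.group("name"), value


def split_header(cells):
    """Split a whole header row into a row of names and a row of values."""
    pairs = [split_header_cell(cell) for cell in cells]
    return [name for name, _ in pairs], [value for _, value in pairs]
```

Values are kept as strings rather than floats so that trailing zeros are not lost when the row is written back out.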

@kyclark Can this be fixed in the parsing script, or should we fix it manually?

kyclark commented 3 years ago

I found the problem in the following files:

==> 20161028_field_data.csv <==
measurement,ref1_8_97,railless_6_57,arg_6_27

==> 20170309_field_data.csv <==
measurement,sowm,ref1_7_02,ant_6_28,rap_6_3,rbp_6_09,arg_6_45,usgs_7_25

==> 20170323_field_data.csv <==
measurement,sowm,ref_1_6_57,ref_2_6_58,ant_6_76,rap_6_99,tp_5_33,rbp_6_83,arg_6_75,usgs_7_16

==> 20170504_field_data.csv <==
measurement,sowm,ref2_7_15,ant_6_98,rap_6_82,rbp_7_11,arg__

==> 20170516_field_data.csv <==
measurement,sowm,ref_2__,rap__,tp__,rbp__,arg__,usgs__

==> 20170601_field_data.csv <==
measurement,sowm,ref2__,ant__,rap__,tp__,rbp__,arg__

==> 20170804_field_data.csv <==
measurement,sowm,ref2_7_66,ant_7_73,rap_7_65,tp_7_31,rbp_7_51,arg_7_64,usgs_7_66

==> 20170829_field_data.csv <==
measurement,sowm,ref2_7_13,ant_7_36,rap_7_37,tp_6_76,rbp_7_23,arg_7_42,usgs_7_54

==> 20170921_field_data.csv <==
measurement,sowm,ref2_7_17,ant_7_5,rap_7_3,tp_6_48,rbp_7_18,arg_6_74,usgs_7_78

==> 20171027_field_data.csv <==
measurement,sowm,ref1_5_85,ref2_6_32,ant_6_36,rap_6_5,tp_6_08,rbp_6_33,arg_6_01,usgs_6_37

==> 20180118_field_data.csv <==
measurement,sowm,ref_1_6_51,ref2_6_3,ant_6_62,rap_6_57,tp_6_53,rbp_6_73,arg_6_69,usgs_6_81

==> 20180405_field_data.csv <==
measurement,sowm,ref1_6_92,ref2_6_75,ant_6_7,rap_6_72,tp_6_5,rbp_6_57

==> 20180511_field_data.csv <==
measurement,sowm,ref_1_6_51,ref2_6_3,ant_6_62,rap_6_57,tp_6_53,rbp_6_73,arg_6_69,usgs_6_81

==> 20180827_field_data.csv <==
measurement,sowm,ref1_7_25,ref2_7_44,ant_7_73,rap_7_63,tp_7_5

==> 20181109_field_data.csv <==
measurement,sowm,ref1_7_29,ref2_7_66,ant_7_1,rap_7_17,tp_7_25
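
A listing like the one above could also be produced programmatically. The following is a rough sketch, not the script used here: the helper names `fused_header_cells` and `find_affected_files` are hypothetical, the files are assumed to be comma-separated, and the directory passed in is assumed to be a local copy of the data.

```python
import csv
import re
from pathlib import Path

# Hypothetical check: a header cell ending in "_<digits>_<digits>",
# where either group of digits may be empty (e.g. "arg__").
FUSED_SUFFIX = re.compile(r"_\d*_\d*$")


def fused_header_cells(csv_path):
    """Return the header cells of one file that look like name+value."""
    with open(csv_path, newline="") as handle:
        header = next(csv.reader(handle), [])
    return [cell for cell in header if FUSED_SUFFIX.search(cell)]


def find_affected_files(data_dir):
    """Map each *_field_data.csv file name to its suspicious header cells."""
    affected = {}
    for path in sorted(Path(data_dir).glob("*_field_data.csv")):
        cells = fused_header_cells(path)
        if cells:
            affected[path.name] = cells
    return affected
```

Calling `find_affected_files(".")` in the data directory would return only the files whose first row contains at least one fused cell.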

kyclark commented 3 years ago

While it's possible to fix the Python parsing, I think it will take less time to manually fix these. Should I do this?

ramonawalls commented 3 years ago

@kyclark Yes please!

Once that is fixed, can you please run the scrutinizer script on these datasets?