jarad / FluSight

An R package containing functions used in the CDC Flu Forecasting competition
GNU General Public License v3.0

Read_entry set one decimal for bin_start_incl, bin_end_notincl #18

Open craigjmcgowan opened 7 years ago

craigjmcgowan commented 7 years ago

I'm using read_entry to pull in all of the entries received so far, and some CSVs are read in with one decimal place on every number (e.g. 0.0, 1.0), while others, including the full_entry dataset in the package, have whole numbers written without a decimal (e.g. 0, 1). This is causing validation errors, since the columns are character as a result of the 'none' value for Season onset.

I propose having read_entry ensure that all values have one decimal place and recreating the full_entry and minimal_entry datasets accordingly. I don't think this should have any negative effects on later functions that coerce these values to numeric.
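
As a rough illustration of the proposal, here is a minimal sketch of the normalization step. The helper name format_one_decimal is hypothetical and not part of FluSight, and the sketch assumes bin values never need more than one decimal place (anything finer would be rounded by sprintf). Non-numeric values such as 'none' are left untouched.

```r
# Hypothetical helper (not in FluSight): rewrite numeric-looking strings
# with exactly one decimal place; leave non-numeric values such as "none" as-is.
format_one_decimal <- function(x) {
  x <- as.character(x)
  # Non-numeric strings become NA here; the coercion warning is expected
  num <- suppressWarnings(as.numeric(x))
  is_num <- !is.na(num)
  x[is_num] <- sprintf("%.1f", num[is_num])
  x
}

# Made-up bin values mixing both styles seen in submitted CSVs
bins  <- c("1", "1.0", "0", "13", "none")
fixed <- format_one_decimal(bins)
print(fixed)

# Later coercion to numeric should give the same values before and after
num_before <- suppressWarnings(as.numeric(bins))
num_after  <- suppressWarnings(as.numeric(fixed))
print(identical(num_before, num_after))
```

Under these assumptions the strings become comparable as characters while the numeric values are unchanged, which is the property the later coercing functions rely on.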

jarad commented 7 years ago

I don't quite understand. Is the difference for entries that have a "none" option vs those that don't?

craigjmcgowan commented 7 years ago

For example, using the 1.0% to 1.1% bin in Season peak percentage, some teams have a value for bin_start_incl of 1.0 in their CSV, and some have a value of 1.

Right now read.csv is reading those into the data.frame as typed, so they're not identical character strings even though they're numerically equal. I want to add a line to read_entry so that all numeric values have one decimal place, so they match both as characters and as numbers.
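
To make the mismatch concrete, here is a small reproduction using two made-up CSV snippets. The column names come from the issue title; the contents are illustrative and not taken from any team's entry. Because each column contains 'none', read.csv keeps it as character and preserves the digits exactly as written.

```r
# Two made-up entries for the same bin, differing only in how 1 is written
csv_a <- "bin_start_incl,bin_end_notincl\n1.0,1.1\nnone,none"
csv_b <- "bin_start_incl,bin_end_notincl\n1,1.1\nnone,none"

entry_a <- read.csv(text = csv_a, stringsAsFactors = FALSE)
entry_b <- read.csv(text = csv_b, stringsAsFactors = FALSE)

# The "none" row forces both columns to character
print(class(entry_a$bin_start_incl))

# Same bin, different strings
start_a <- entry_a$bin_start_incl[1]
start_b <- entry_b$bin_start_incl[1]
print(c(start_a, start_b))

same_as_text   <- start_a == start_b
same_as_number <- as.numeric(start_a) == as.numeric(start_b)
print(c(as_text = same_as_text, as_number = same_as_number))
```

Any validation that compares these columns as strings would treat the two entries as having different bins, even though they describe the same one.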