DOI-USGS / loadflex

Models and Tools for Watershed Flux Estimates
http://dx.doi.org/10.1890/ES14-00517.1

"Error in View" message when attempting to view data frames #218

Closed lsethna closed 7 years ago

lsethna commented 7 years ago

I recently started to get an error message

Error in View : 'names' attribute [3] must be the same length as the vector [1]

when I tried to view any data frame while the loadflex package was loaded. I first noticed this issue at the end of last week, and it only happens when loadflex is loaded.

Do you have any idea what this means, how this happens, and/or how I could fix it?

aappling-usgs commented 7 years ago

Wow, interesting! Not something I've experienced. Can you provide a reproducible example?

lindsayplatt commented 7 years ago

It would probably also be helpful if you run these lines and share the output:

packageVersion("loadflex")
R.version

lsethna commented 7 years ago

Output from lindsaycarr's above code:

> packageVersion("loadflex")
[1] ‘1.1.11’

> R.version
               _                           
platform       x86_64-w64-mingw32          
arch           x86_64                      
os             mingw32                     
system         x86_64, mingw32             
status                                     
major          3                           
minor          4.1                         
year           2017                        
month          06                          
day            30                          
svn rev        72865                       
language       R                           
version.string R version 3.4.1 (2017-06-30)
nickname       Single Candle

Also, I've gotten the same error when trying to view a simple, random data sheet that I created in Excel (10 lines of 4 different variables). I input

View(datafile)

And the output is

Error in View : 'names' attribute [3] must be the same length as the vector [1]

lindsayplatt commented 7 years ago

Hmmm well those are the same as mine. Are you using RStudio? Try RStudio.Version()$version. Just want to make sure it's not an issue with outdated software before diving into the code.

lsethna commented 7 years ago

I am using RStudio!

> RStudio.Version()$version
[1] ‘1.0.153’

lindsayplatt commented 7 years ago

Can you try using View with a built-in dataset? Does View(mtcars) throw the same error?

Another thought is that there might be some dependency that needs to be updated. If you're comfortable doing it, I'd suggest updating your packages.

lsethna commented 7 years ago

View(mtcars) didn't give me the error! Just seems to happen with datasets that I import, and I've tried importing as CSV files rather than Excel and I get the same error. I also just updated all my packages.

lindsayplatt commented 7 years ago

Can you use head() on your data frame and put the output here? This does seem more data-related than loadflex-related, but it's pretty strange that it only happens after you run library(loadflex).

lsethna commented 7 years ago

I can use head() and get an output. I thought for a while something was wrong with my data, but I've tried uploading it in multiple formats and also other data files and I still get the error. And only if I'm using loadflex.

This wouldn't really be a huge deal except that I'm also getting errors like

> preds_SS_li <- predictSolute(SS_li, "conc", estdat, se.pred=TRUE, date=TRUE)
Error in `$<-.data.frame`(`*tmp*`, se.pred, value = 65.4649449789483) : 
  replacement has 1 row, data has 0

but I was getting an output from this same line of code last week. I was just trying to reproduce my same result and started getting these error messages. I can post the entire code and output if that helps add some context.

lindsayplatt commented 7 years ago

What @aappling-usgs meant by "reproducible" example was something we could copy & paste into our environment and produce the same error. I am thinking this is data-related since you were able to perform View(mtcars) without an issue. So, it might be the data you are trying to use. If you haven't already, use some data.frame diagnostic functions to make sure your data was loaded into R as you expected. You can try posting the output here if you don't see anything obvious right away and we might be able to tease out the issue.

head(dataFile)
names(dataFile)
dim(dataFile)
nrow(dataFile)
ncol(dataFile)
summary(dataFile)

lsethna commented 7 years ago

Something reproducible? Input

library(readxl)
Colors <- read_excel("~/Colors.xlsx")
library(loadflex)
View(Colors)

Output

> library(readxl)

> Colors <- read_excel("~/Colors.xlsx")

> library(loadflex)
[standard output]

> View(Colors)
Error in View : 'names' attribute [2] must be the same length as the vector [1]

I just created the "Colors" file in Excel and am attaching it here, but I get the same output with any data file that I create in Excel. Colors.xlsx

Here is the code I edited to fit the dataset "Maumee Chem". I used @aappling-usgs's code from the Lamprey River example and plugged in my own variables.

# Interpolation data
intdat <- MaumeeChem[c("date","FlowCFS","SS_mgL")]

# Calibration data: Restrict to points separated by sufficient time
regdat <- subset(MaumeeChem)[c("date", "FlowCFS","SS_mgL")]

# Estimation data
estdat <- subset(MaumeeChem, date < as.POSIXct("1975-1-10 00:00:00"))
estdat <- estdat[seq(1, nrow(estdat))]

#Create metadata description of the dataset and desired output
meta <- metadata(constituent="SS_mgL", flow="FlowCFS", 
                 dates="date", conc.units="mg L^-1", flow.units="cfs", load.units="kg", 
                 load.rate.units="kg d^-1", site.name="Maumee River",
                 consti.name="Suspended Solids")

#Fit models: interpolation
SS_li <- loadInterp(interp.format="conc", interp.fun=rectangularInterpolation, 
                     data=intdat, metadata=meta)

#Inspect models
getMetadata(SS_li)

#Generate point predictions from each model
preds_li <- predictSolute(SS_li, "conc", estdat, se.pred=TRUE, date=TRUE)

#Different ways to inspect models
summary(getFittedModel(SS_li))

#Aggregate from point predictions to monthly predictions from each model. 
#You can also do this for mean concentration or total flux for the month, or for years or other time intervals.
aggs_li <- aggregateSolute(preds_li, meta, "flux rate", "water year")

And here is the output which makes me think that the issue with View() might be affecting the results

> # Interpolation data
> intdat <- MaumeeChem[c("date","FlowCFS","SS_mgL")]

> # Calibration data: Restrict to points separated by sufficient time
> regdat <- subset(MaumeeChem)[c("date", "FlowCFS","SS_mgL")]

> # Estimation data
> estdat <- subset(MaumeeChem, date < as.POSIXct("1975-1-10 00:00:00"))

> #Create metadata description of the dataset and desired output
> meta <- metadata(constituent="SS_mgL", flow="FlowCFS", 
+                  dates="d ..." ... [TRUNCATED] 

> #Fit models: interpolation
> SS_li <- loadInterp(interp.format="conc", interp.fun=rectangularInterpolation, 
+                      data=intdat, met .... [TRUNCATED] 

> #Inspect models
> getMetadata(SS_li)
Metadata for a load model
-NAME-       -VALUE-
constituent  SS_mgL
consti.name  Suspended Solids
flow         FlowCFS
load.rate    
dates        date
conc.units   mg L^-1
flow.units   ft^3 s^-1
load.units   kg
load.rate.units kg d^-1
station      
site.name    Maumee River
site.id      
lat          NA
lon          NA
basin.area   NA
flow.site.name 
flow.site.id 
flow.lat     NA
flow.lon     NA
flow.basin.area NA
basin.area.units km^2
> 
> 
> #Generate point predictions from each model
> preds_li <- predictSolute(SS_li, "conc", estdat, se.pred=TRUE, date=TRUE)
Error in `$<-.data.frame`(`*tmp*`, se.pred, value = 65.4649449789483) : 
  replacement has 1 row, data has 0
> 
> #Different ways to inspect models
> summary(getFittedModel(SS_li))
     Length       Class        Mode 
          1 interpModel   character 
> 
> #Aggregate from point predictions to monthly predictions from each model. 
> #You can also do this for mean concentration or total flux for the month, or for years or other time intervals.
> aggs_li <- aggregateSolute(preds_li, meta, "flux rate", "water year")
Error in is.data.frame(preds) : object 'preds_li' not found
In addition: Warning message:
In aggregateSolute(preds_li, meta, "flux rate", "water year") :
  Shoot, we've discovered a big problem in aggregateSolute. The Values are fine, but the uncertainty estimates (SE, CI_lower, CI_upper) are too low by a factor of 3 to 10 or more. We'll be working on this over the coming year (it's not a trivial challenge). In the meantime, please consider reporting instantaneous uncertainties only, or using predLoad(getFittedModel(load.model), by=[format]) if you need aggregated uncertainties from a loadReg2 model. Sorry about this!
> View(MaumeeChem)
Error in View : 'names' attribute [3] must be the same length as the vector [1]

I am attaching a truncated version of the Maumee Chem file since the original is over 18,000 lines. MaumeeChem truncated.xls.xlsx

I got an output for all of the data.frame diagnostic functions, just the same error for View(dataFile)

aappling-usgs commented 7 years ago

what happens if you replace View(MaumeeChem) with utils::View(MaumeeChem)?

lsethna commented 7 years ago

Ooh, I get a popup window with the data table!

aappling-usgs commented 7 years ago

and same with View(as.data.frame(MaumeeChem)), right?

looks like the View issue is a conflict between tibble objects and the implementation of View in smwrQW, which gets loaded when loadflex is loaded. i'll add an issue to the smwrQW package, but our team does not maintain that package and so it may take a while to resolve. in the meantime, i hope the above two options will work for you.
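
For anyone checking whether the same masking applies on their machine, here is a minimal sketch, not a definitive diagnosis. It assumes only base R and the utils package; `dat` is a hypothetical stand-in for an imported file, and the actual `find()` output depends on which packages are attached in your session (RStudio may also register its own viewer).

```r
# Hypothetical stand-in for an imported file. read_excel() would
# return a tibble; here a plain data.frame keeps the sketch self-contained.
dat <- data.frame(color = c("red", "blue"), count = c(3, 5))

# List every attached environment that defines a function named View.
# If an attached package is listed ahead of "package:utils",
# a bare View() call is dispatched to that package's version.
view_sources <- find("View")
print(view_sources)

# Workaround 1: drop any tibble classes before viewing.
dat_df <- as.data.frame(dat)
print(class(dat_df))

# Workaround 2: call the utils implementation explicitly.
# Guarded because View() needs an interactive session.
if (interactive()) {
  utils::View(dat_df)
}
```

Either workaround sidesteps the masked function without unloading loadflex.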

lsethna commented 7 years ago

Yes, great! Do you have any insight on the issue with predictSolute() error message? Or is that completely unrelated?

aappling-usgs commented 7 years ago

as a matter of fact, i do! thanks to you and @lindsaycarr for getting the necessary information into this issue so i could debug. for that error,

> preds_li <- predictSolute(SS_li, "conc", estdat, se.pred=TRUE, date=TRUE)
Error in `$<-.data.frame`(`*tmp*`, "se.pred", value = 203.82035433711) : 
  replacement has 1 row, data has 0

I'm betting the problem is that estdat is empty, 0 rows.

library(readxl)
MaumeeChem <- as.data.frame(read_excel("~/../Downloads/MaumeeChem.truncated.xls.xlsx"))

library(loadflex)

# Interpolation data
intdat <- MaumeeChem[c("date","FlowCFS","SS_mgL")]

# Calibration data: Restrict to points separated by sufficient time
regdat <- subset(MaumeeChem)[c("date", "FlowCFS","SS_mgL")]

# Estimation data
estdat <- subset(MaumeeChem, date < as.POSIXct("1975-1-10 00:00:00"))
estdat <- estdat[seq(1, nrow(estdat))]
estdat
## [1] date
## <0 rows> (or 0-length row.names)

lsethna commented 7 years ago

Ah, I see that now! I must have edited it somehow because the estdat that I have in my Global Environment has over 18000 rows. I will keep plugging away, thank you both for all your help!!
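
For reference, a minimal base R sketch of how the printed `estdat` above can end up with a single `date` column and zero rows. The `toy` data frame and its values are made up for illustration and stand in for MaumeeChem; the `seq_len()` line is one possible alternative, not necessarily what the original script intended.

```r
# Toy stand-in for MaumeeChem; the values are invented for illustration.
toy <- data.frame(
  date    = as.POSIXct(c("1975-01-15", "1975-02-01", "1975-03-01"), tz = "UTC"),
  FlowCFS = c(1200, 950, 1800),
  SS_mgL  = c(40, 25, 90)
)

# No toy dates fall before the cutoff, so the subset has 0 rows.
estdat <- subset(toy, date < as.POSIXct("1975-01-10 00:00:00", tz = "UTC"))
print(nrow(estdat))

# seq(1, 0) counts down and returns c(1, 0), and a single index inside
# [ ] selects columns, so only the first column survives.
one_col <- estdat[seq(1, nrow(estdat))]
print(names(one_col))
print(dim(one_col))

# seq_len(0) is empty, and the trailing comma makes this a row index,
# so every column is kept even when there are no rows.
all_cols <- estdat[seq_len(nrow(estdat)), ]
print(dim(all_cols))
```

Checking `nrow(estdat)` before calling predictSolute() would catch the empty input before the "replacement has 1 row, data has 0" error appears.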