Closed afredston closed 4 years ago
Thanks Alexa for posting here!
I don't have any intuition based on this output, but cold you make a minimal example using the format here (https://github.com/nwfsc-assess/geostatistical_delta-GLMM/wiki/How-to-create-a-minimal-example) for one of those five species and I could take a look and report back on this thread?
Sure! This minimal example still creates the same error for me:
#unlink(paste0(getwd(),"/Kmeans-100.Rdata")) # deletes Kmeans-100 from other region models, if switching regions
library(VAST)
library(TMB)
Version = get_latest_version( package="VAST" )
Method = c("Grid", "Mesh", "Spherical_mesh")[2]
grid_size_km = 50
n_x = 100
FieldConfig = c("Omega1"=0, "Epsilon1"=1, "Omega2"=0, "Epsilon2"=1)
RhoConfig = c("Beta1"=0, "Beta2"=0, "Epsilon1"=2, "Epsilon2"=2)
OverdispersionConfig = c("Eta1"=0, "Eta2"=0)
ObsModel = c(1,3)
Options = c("SD_site_density"=0, "SD_site_logdensity"=0, "Calculate_Range"=1, "Calculate_evenness"=0, "Calculate_effective_area"=1, "Calculate_Cov_SE"=0, 'Calculate_Synchrony'=0, 'Calculate_Coherence'=0)
strata.limits <- data.frame('STRATA'="All_areas")
Region = "Eastern_Bering_Sea"
Species_set = 100
DateFile = paste0(getwd(),'/VAST_output/')
if (!dir.exists(DateFile)){
dir.create(DateFile)
} else {
print(paste0(DateFile, " already exists"))
}
Catch_rates <- FishData::download_catch_rates(survey="Eastern_Bering_Sea", species_set=Species_set)
Extrapolation_List = make_extrapolation_info( Region=Region, strata.limits=strata.limits )
settings = make_settings( "n_x"=n_x, "Region"=Region, purpose="index", bias.correct=FALSE, use_anisotropy=FALSE, "FieldConfig"=FieldConfig, "RhoConfig"=RhoConfig, "OverdispersionConfig"=OverdispersionConfig,"Options"=Options,"ObsModel"=ObsModel )
DF <- Catch_rates[Catch_rates$Sci=="Bathyraja_parmifera",]
Data_Geostat = data.frame( "spp"=DF[,"Sci"], "Year"=DF[,"Year"], "Catch_KG"=DF[,"Wt"], "AreaSwept_km2"=0.01, "Vessel"=0, "Lat"=DF[,"Lat"], "Lon"=DF[,"Long"] )
# get rid of species observed everywhere to prevent VAST errors
Data_Geostat$nonzero = ifelse(Data_Geostat$Catch_KG > 0, "Y","N")
Data_Geostat$prop_nonzero = ave(Data_Geostat$nonzero, Data_Geostat$Year, FUN=function(x) mean(x=='Y'))
Data_Geostat <- Data_Geostat[Data_Geostat$prop_nonzero<1,]
Spatial_List = make_spatial_info( grid_size_km=grid_size_km, n_x=n_x, Method=Method, Lon_i=Data_Geostat[,'Lon'], Lat_i=Data_Geostat[,'Lat'], Extrapolation_List=Extrapolation_List, DirPath=DateFile, Save_Results=FALSE )
fit = fit_model("settings"=settings, "Lat_i"=Data_Geostat[,'Lat'],
"Lon_i"=Data_Geostat[,'Lon'], "t_i"=Data_Geostat[,'Year'],
"c_i"=rep(0,nrow(Data_Geostat)), "b_i"=Data_Geostat[,'Catch_KG'],
"a_i"=Data_Geostat[,'AreaSwept_km2'], "v_i"=Data_Geostat[,'Vessel'],
"getReportCovariance"=FALSE, "getJointPrecision"=TRUE,
lower=-Inf, upper=Inf,
test_fit = FALSE,
fine_scale=FALSE,
anisotropy=FALSE,
Use_REML=TRUE)
OK, "Bathyraja_parmifera" has 0% encounters for many years, which results in a non-identifable model when intercepts are treated as fixed effects. You're using REML=TRUE, which means the intercepts are optimized during the inner loop -- the beta1 for years with 0% encounters go to -Inf, which then results in a non-invertible inner-loop Hessian. I found this when switching to REML=FALSE and seeing the parameter estimates for beta1s were going towards -Inf (i.e., < -20).
A few suggestions:
I note that make_data(.)
should be doing an auto-check for simple issues like this, and I'm not sure why it's not throwing an informative error in this instance. I'll try to remember to check this later, and feel free to remind me in a few weeks.
I'm closing the issue for now, but feel free to keep posting here if other species aren't fixable using suggestions 1-2 above.
PS: I've just added this as an informative error in make_data(.)
in the VAST development branch, to clearly indicate this issue in the future
For future reference, here's how I addressed this issue:
I already had code in the single-species models to drop years when a species was encountered 100% of the time, because VAST can't use those to estimate an encounter rate. So I just added a line to also drop years when a species was encountered 0% of the time:
# below I'm getting rid of years where the species was observed 100% of the time
Data_Geostat$nonzero = ifelse(Data_Geostat$Catch_KG > 0, "Y","N")
Data_Geostat$prop_nonzero = ave(Data_Geostat$nonzero, Data_Geostat$Year, FUN=function(x) mean(x=='Y'))
Data_Geostat <- Data_Geostat[Data_Geostat$prop_nonzero<1,]
Data_Geostat <- Data_Geostat[Data_Geostat$prop_nonzero>0,] # also drop years where species was never observed
However, VAST still estimates ranges in years with no data. At the moment I'm doing a post-hoc comparison to drop years from the VAST output when I know there were no species observations (which is not necessary for all purposes but I think is appropriate for our questions), but if there's a more functional way to tell VAST to just skip those years entirely, let me know!
I'm running VAST single-species models on a list of species from the Eastern Bering Sea. I downloaded the 100 most frequently observed taxa using
FishData::download_catch_rates
, of which 82 were identified to species. Of those 82, VAST fails to fit 28, and at least the first 5 reach the stage of estimating range shifts and then report the same error:The list of species that doesn't converge is:
c("Bathyraja_parmifera", "Lepidopsetta_polyxystra", "Atheresthes_evermanni", "Leptasterias_polaris", "Pagurus_trigonocheirus", "Labidochirus_splendescens", "Chrysaora_melanaster", "Buccinum_scalariforme", "Pagurus_aleuticus", "Hippoglossoides_robustus", "Styela_rustica", "Pagurus_capillatus", "Pagurus_ochotensis", "Buccinum_angulosum", "Boltenia_ovifera", "Lethasterias_nanimensis", "Pagurus_rathbuni", "Leptasterias_arctica", "Buccinum_polare", "Eunoe_nodosa", "Pagurus_confragosus", "Ophiura_sarsii", "Bathyraja_interrupta", "Pteraster_obscurus", "Boreogadus_saida", "Myoxocephalus_scorpius", "Mactromeris_polynyma", "Eunoe_depressa" )