James-Thorson-NOAA / VAST

Spatio-temporal analysis of univariate or multivariate data, e.g., standardizing data for multiple species or stages
http://www.FishStats.org
GNU General Public License v3.0

parallelization for large spatial extents #177

Closed: mkapur closed this issue 4 years ago

mkapur commented 5 years ago

I'm hitting memory bottlenecks that gc() and switching to a 16 GB computer can't solve. I'm trying to make an extrapolation grid of the entire NE Pacific (all or several AK regions, BC, and the California Current), and I'm wondering whether you have implemented any kind of parallel processing for this type of extent.

The make_spatial_info call fails with an error ('Cannot allocate vector of X gb'), where X ranges from 19 to 32 depending on the combination of regions. The same happens whether I use strata.limits <- data.frame('STRATA'="All_areas") or specify bounds.

Thanks!

Example


Method = c("Grid", "Mesh", "Spherical_mesh")[2]      
grid_size_km = 25     
n_x = 50                   
Kmeans_Config = list( "randomseed"=1, "nstart"=50, "iter.max"=1e3 )  
FieldConfig = c("Omega1"=1, "Epsilon1"=1, "Omega2"=1, "Epsilon2"=1)    
RhoConfig = c("Beta1"=0, "Beta2"=0, "Epsilon1"=0, "Epsilon2"=0)    
OverdispersionConfig = c("Delta1"=0, "Delta2"=0)                       
bias.correct = FALSE
ObsModel = c(1,0) 
Options =  c("SD_site_density"=0, "SD_site_logdensity"=0, "Calculate_Range"=1, "Calculate_evenness"=0,"Calculate_effective_area"=1, "Calculate_Cov_SE"=1, 'Calculate_Synchrony'=0, 'Calculate_Coherence'=0)
EBS_extrap = make_extrapolation_info( Region = "Eastern_Bering_Sea", strata.limits = strata.limits, zone = 32, flip_around_dateline = FALSE )
NBS_extrap = make_extrapolation_info( Region = "Northern_Bering_Sea", strata.limits = strata.limits, zone = 32, flip_around_dateline = FALSE )
GOA_extrap = make_extrapolation_info( Region = "gulf_of_alaska", strata.limits = strata.limits, zone = 32, flip_around_dateline = TRUE )
BC_extrap = make_extrapolation_info( Region = "british_columbia", strata.limits = strata.limits, zone = 32, flip_around_dateline = FALSE )
BC_extrap$Data_Extrap$Area_km2 <- BC_extrap$Area_km2_x
names(BC_extrap$a_el) <- "All_areas"
CC_extrap = make_extrapolation_info( Region = "california_current", strata.limits = strata.limits, zone = 32, flip_around_dateline = TRUE )
CC_extrap$Data_Extrap$Area_km2 <- CC_extrap$Area_km2_x
Extrapolation_List = combine_extrapolation_info(
  "EBS" = EBS_extrap,
  "NBS" = NBS_extrap,
  "GOA" = GOA_extrap,
  "BC" = BC_extrap,
  "CC" = CC_extrap
)
# Note: as posted, this call passes CC_extrap, not the combined Extrapolation_List.
Spatial_List <- FishStatsUtils::make_spatial_info(
  grid_size_km = grid_size_km, n_x = n_x, Method = Method,
  Lon = Data_Geostat[,'Lon'], Lat = Data_Geostat[,'Lat'],
  Extrapolation_List = CC_extrap,
  randomseed = Kmeans_Config[["randomseed"]], nstart = Kmeans_Config[["nstart"]],
  iter.max = Kmeans_Config[["iter.max"]],
  DirPath = DateFile, Save_Results = FALSE )

jkbest2 commented 4 years ago

You wouldn't lose much by doing the extrapolations separately and carefully combining them post-hoc, but I don't think VAST is set up to do this.
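
As a rough illustration of what "combining post-hoc" could look like, here is a minimal base-R sketch. It assumes each region has been fitted separately and yields an index estimate and standard error per year. The data frame, its column names, and the numbers are all hypothetical, and `combine_post_hoc` is not a VAST function. Summing variances is only valid if the regional estimates are independent, which is an assumption rather than something VAST guarantees.

```r
# Hypothetical per-region index estimates (made-up numbers, illustration only).
regional_index <- data.frame(
  region   = rep(c("EBS", "GOA", "CC"), each = 2),
  year     = rep(c(2017, 2018), times = 3),
  estimate = c(100, 110, 50, 45, 30, 35),
  se       = c(10, 12, 5, 6, 3, 4)
)

# Sum estimates across regions within each year.
# ASSUMPTION: regional estimates are independent, so their variances add.
combine_post_hoc <- function(idx) {
  total     <- tapply(idx$estimate, idx$year, sum)
  var_total <- tapply(idx$se^2, idx$year, sum)
  data.frame(
    year     = as.numeric(names(total)),
    estimate = as.numeric(total),
    se       = sqrt(as.numeric(var_total))
  )
}

combined <- combine_post_hoc(regional_index)
print(combined)
```

If the regions share spatial or temporal structure, the independence assumption would understate or overstate the combined uncertainty, which is presumably part of what "carefully" means above.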

colemonnahan commented 4 years ago

@mkapur I've never tried to combine regions like this before. Have you tried adding one at a time to see whether it's truly a memory issue or a bug that manifests as a memory issue? That is, if it doesn't work when combining just two, then it's probably a bug elsewhere.
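
One way to run that one-at-a-time check is sketched below. `try_regions_incrementally` and its `build_fn` argument are hypothetical helpers, not part of VAST or FishStatsUtils: `build_fn` stands in for whatever wrapper builds the extrapolation and spatial objects for a given set of regions. The sketch also assumes the failure surfaces as an ordinary R error that `tryCatch` can intercept. The demo uses a fake builder that simulates a failure so the block runs without VAST installed.

```r
# Try cumulative subsets of regions and record where the first failure occurs.
# `build_fn` is a HYPOTHETICAL user-supplied function taking a character
# vector of region names; it is not a VAST/FishStatsUtils function.
try_regions_incrementally <- function(regions, build_fn) {
  rows <- list()
  for (i in seq_along(regions)) {
    subset_i <- regions[seq_len(i)]
    outcome <- tryCatch({
      build_fn(subset_i)
      list(ok = TRUE, message = "")
    }, error = function(e) list(ok = FALSE, message = conditionMessage(e)))
    gc()  # release memory from the previous attempt before the next one
    rows[[i]] <- data.frame(
      n_regions = i,
      regions   = paste(subset_i, collapse = "+"),
      ok        = outcome$ok,
      message   = outcome$message,
      stringsAsFactors = FALSE
    )
    if (!outcome$ok) break  # stop at the first failing combination
  }
  do.call(rbind, rows)
}

# Stand-in builder that fails once more than three regions are combined.
fake_build <- function(regions) {
  if (length(regions) > 3) stop("cannot allocate vector (simulated)")
  invisible(length(regions))
}

report <- try_regions_incrementally(c("EBS", "NBS", "GOA", "BC", "CC"), fake_build)
print(report)
```

If the failure already appears with the first one or two regions, that points toward a bug rather than a genuine memory limit, as suggested above.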

Do you have a sense of which step it crashes on? What's the output right before crashing? I have a 64GB VM I could try to run it on next week when nothing else is running if that helps as a test case.

mkapur commented 4 years ago

Hi all, I was able to find at least a temporary workaround by reducing the number of knots and only running it on my desktop. The error always occurred during the make_spatial_info step, after printing the 1:X knots output. My ultimate goal is to automate model fitting at various stratifications, so the post-hoc approach would probably be a last resort. If I revisit this and/or need to increase the number of knots, I'll let you know. Otherwise I'll close this for now. MK

James-Thorson commented 4 years ago

How many knots caused it to crash, and how many worked?

mkapur commented 4 years ago

250 did not work, 100 did work on my 16GB machine.
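
For sanity-checking allocation sizes like the ones in the error message, a small arithmetic helper can be useful. The sketch below only computes how much memory a dense matrix of doubles would need (8 bytes per element, reported in units of 1024^3 bytes). It does not claim to identify which object make_spatial_info is allocating, and the cell count of 50,000 is a hypothetical figure, not one taken from this thread.

```r
# Memory needed for a dense numeric (double) matrix, in units of 1024^3 bytes.
# Each double occupies 8 bytes.
dense_gb <- function(n_rows, n_cols) {
  n_rows * n_cols * 8 / 1024^3
}

n_cells <- 50000  # HYPOTHETICAL number of extrapolation-grid cells
n_knots <- 250    # knot count reported as failing in this thread

# A cells-by-knots matrix stays small...
cells_by_knots <- dense_gb(n_cells, n_knots)
# ...whereas a cells-by-cells matrix is several orders of magnitude larger.
cells_by_cells <- dense_gb(n_cells, n_cells)

print(c(cells_by_knots = cells_by_knots, cells_by_cells = cells_by_cells))
```

Comparing numbers like these against the size in the error message can help narrow down which dimension is driving the allocation before building a minimal example.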

James-Thorson commented 4 years ago

Well, if you make a minimal example, I'd be happy to take a look. It seems like it should work with considerably more knots (with a single region I sometimes use 2000 knots).
