jmsigner / amt


Vector allocation memory issue for hr_akde with "ou" ctmm model #51

Closed elmpero closed 3 years ago

elmpero commented 3 years ago

Hello! I am looking for assistance with hr_akde using an "ou" ctmm model. I'm looking to estimate aKDEs for a series of biweekly temporal windows for numerous individual elk sampled at a 5hr fix rate.

I get the memory error below when I attempt aKDE using an "ou" ctmm model for biweekly periods. (Note that the code provided does work for hr_od, and for hr_akde with the ctmm model set to "iid".)

Error: Problem with `mutate()` column `hr_akde_ou`.
i `hr_akde_ou = map(data, ~hr_akde(., model = fit_ctmm(., "ou")))`.
x cannot allocate vector of size 321108.4 Gb

I've provided reproducible code and attempted to attach data in the hopes that someone may be able to assist me with a fix for whatever I might be doing wrong?

(If there is a better location for posing this question, my apologies and please let me know where instead I should direct this inquiry!) Thank you! El Pero ellen.pero@umontana.edu

elktrks_5.zip

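#load packages (assumed: amt for hr_akde/fit_ctmm, tidyverse for nest/map/mutate)
library(amt)
library(tidyverse)

#read in track data
filtered_5time <- readRDS("elktrks_5.rds")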

#save track class for later
trk.class <- class(filtered_5time)

#Nest elk tracks
nest_5hr <- filtered_5time %>% nest(-id, -sex, -release_age, -release_cohort, -release_date) 

#make sure track classification remains
class(nest_5hr) <- trk.class

###subset temporal windows
#biweekly
nest_5hr_bi <- filtered_5time %>% nest(-id, -sex, -release_age, -release_cohort, -release_date, -bwfr) 

#make sure track classification remains
class(nest_5hr_bi) <- trk.class

#create akdes at 95% and 50% isopleths
hr_bw <- nest_5hr_bi %>%  
  mutate(n = map_int(data, nrow)) %>% 
  filter(n > 20) %>% 
  mutate(
    hr_akde_iid = map(data, ~ hr_akde(., model = fit_ctmm(., "iid"))),
    hr_akde_ou = map(data, ~ hr_akde(., model = fit_ctmm(., "ou"))),
    cor_akde_iid = map(data, ~ hr_akde(., model = fit_ctmm(., "iid"), levels= c(0.5))),
    cor_akde_ou = map(data, ~ hr_akde(., model = fit_ctmm(., "ou"), levels=c(0.5)))
  )

head(hr_bw)
saveRDS(hr_bw, "BWakdeUD.rds")

#to long format so we can use area function
hr_bw2 <- hr_bw %>% select(-data) %>%
  pivot_longer(hr_akde_iid:cor_akde_ou, names_to = "estimator",
               values_to = "hr")

str(hr_bw2, 2)

#area function 
hr_bw2.area <- hr_bw2 %>%
  mutate(hr_area = map(hr, hr_area)) %>%
  unnest(cols = hr_area)

head(hr_bw2.area, 10)
saveRDS(hr_bw2.area, "BWakdeAREA.rds")

jmsigner commented 3 years ago

Hi El,

My guess is that you just do not have enough memory to run all the models that you want to. The small sample data that you provided (thanks for a very nice reproducible example!) would already require about 13 GB of RAM.

hr_bw <- nest_5hr_bi %>%  
  mutate(n = map_int(data, nrow)) %>% 
  filter(n > 20) 

hr_akde_iid = map(hr_bw$data[1], ~ hr_akde(., model = fit_ctmm(., "iid")))
hr_akde_ou = map(hr_bw$data[1], ~ hr_akde(., model = fit_ctmm(., "ou")))

print(object.size(hr_akde_iid), unit = "Mb")
print(object.size(hr_akde_ou), unit = "Mb")
2.4 * nrow(hr_bw) / 1024 # For me these are approx 13 GB

You also do not have to call the IID and the OU model twice, you could just do:

hr_bw <- nest_5hr_bi %>%  
  mutate(n = map_int(data, nrow)) %>% 
  filter(n > 20) %>% 
  mutate(
    hr_akde_iid = map(data, ~ hr_akde(., model = fit_ctmm(., "iid"), levels = c(0.95, 0.5))),
    hr_akde_ou = map(data, ~ hr_akde(., model = fit_ctmm(., "ou"), levels = c(0.95, 0.5)))
  )

I would probably try to run the model with fewer instances and then see when the code starts breaking. Alternatively, you may not need to save everything, depending on your next analytical steps.
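The "you may not need to save everything" idea can be sketched as below: fit each window, keep only a small summary, and drop the large estimate before moving on. This is a minimal sketch, not amt's own approach; `summarise_each`, `toy_fit` and `toy_summarise` are hypothetical names, and the toy stand-ins are there only so the example runs without amt or the elk data.

```r
# Sketch: keep only a small summary per window instead of the full estimate.
# `fit` and `summarise` are placeholders (assumptions) for the real functions.
summarise_each <- function(items, fit, summarise) {
  out <- vector("list", length(items))
  for (i in seq_along(items)) {
    est <- fit(items[[i]])       # large intermediate object
    out[[i]] <- summarise(est)   # small result that is kept
    rm(est)                      # drop the large object before the next step
  }
  out
}

# Toy stand-ins so the sketch runs on its own
toy_items <- list(1:10, 1:100, 1:1000)
toy_fit <- function(x) list(big = rnorm(length(x) * 100), n = length(x))
toy_summarise <- function(est) est$n
res <- summarise_each(toy_items, toy_fit, toy_summarise)
```

With the objects from this thread, the call might look like `summarise_each(hr_bw$data, function(d) hr_akde(d, model = fit_ctmm(d, "ou"), levels = c(0.95, 0.5)), hr_area)` (untested), which would keep only the area tables.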

elmpero commented 3 years ago

Thanks for the response, @jmsigner! So appreciated. I'm so glad you were able to reproduce the code with the data I provided.

I don't think it's a RAM issue for my computer because, as shown below, if I just try the "ou" call with the data I shared with you [which is all the data I'm interested in at the moment] and track my memory usage, I see I have ~24-25 Gb of RAM available on my computer which should be enough? (of ~32 Gb total RAM on my machine) Also, the "iid" call works just fine for me -- which I believe you said you observed it to be about the same size as the "ou" object when you ran it at ~13 Gb? Finally, is it odd that the error says the vector size that can't be computed is 321108.4 Gb?

Is it possible it's something specific to the "ou" call? Are you aware of any limitations R might impose on object sizes and potential ways around them? In the meantime, I am trying to locate another machine. I will also try coarser temporal windows, but I really am interested in a finer temporal scale.

Thank you again! El

hr_bw <- nest_5hr_bi %>%  
  mutate(n = map_int(data, nrow)) %>% 
  filter(n > 20) %>% 
  mutate(
    hr_akde_ou = map(data, ~ hr_akde(., model = fit_ctmm(., "ou")))
  )
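For scale, the figure in the error message can be converted into a number of values. The sketch below assumes the requested vector holds doubles (8 bytes each) and, purely for illustration, that it represents a square grid; neither assumption is confirmed anywhere in this thread. R reports "Gb" in units of 1024^3 bytes.

```r
# Size reported in the error, in R's "Gb" (1024^3 bytes)
size_gb <- 321108.4
bytes <- size_gb * 1024^3

# Assumption: the vector stores doubles, 8 bytes each
n_doubles <- bytes / 8

# Assumption (illustration only): the values form a square grid
side <- sqrt(n_doubles)

format(n_doubles, scientific = TRUE, digits = 3)
format(side, scientific = TRUE, digits = 3)
```

Under these assumptions the request is on the order of 10^13 values, far more than either the 13 GB estimate or 32 GB of RAM. That would be consistent with a single call asking for one enormous object rather than many results adding up, though this is only a guess.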

elmpero commented 3 years ago

An update:

The code (below) will work for tri-weekly periods instead of bi-weekly periods.

#read in data
filtered_5time <- readRDS("elktrks_5.rds")
trk.class <- class(filtered_5time)
###NOW subset temporal windows
#triweekly
nest_5hr_tri <- filtered_5time %>% nest(-id, -sex, -release_age, -release_cohort, -release_date, -twfr) 
#telling R that this is track rather than just data frame
class(nest_5hr_tri) <- trk.class

##triweekly, 95% home range
hr_tw <- nest_5hr_tri %>%  
  mutate(n = map_int(data, nrow)) %>% 
  filter(n > 20) %>% 
  mutate(
    hr_akde_ou = map(data, ~ hr_akde(., model = fit_ctmm(., "ou")))
  )

head(hr_tw)

#to long format so we can use area function
hr_tw2 <- hr_tw %>% select(-data) %>%
  pivot_longer(hr_akde_ou, names_to = "estimator",
               values_to = "hr")

str(hr_tw2, 2)
#area function 
hr_tw2.area <- hr_tw2 %>%
  mutate(hr_area = map(hr, hr_area)) %>%
  unnest(cols = hr_area)

head(hr_tw2.area, 10)

However, I'm thrown another vector allocation error

Error: Problem with `mutate()` column `hr_akde_ou`.
i `hr_akde_ou = map(data, ~hr_akde(., model = fit_ctmm(., "ou")))`.
x cannot allocate vector of size 130321.4 Gb

for monthly periods, as below:

#read in data
filtered_5time <- readRDS("elktrks_5.rds")
trk.class <- class(filtered_5time)
###NOW subset temporal windows
#monthly
nest_5hr_mth <- filtered_5time %>% nest(-id, -sex, -release_age, -release_cohort, -release_date, -mfr) 
#telling R that this is track rather than just data frame
class(nest_5hr_mth) <- trk.class

##monthly, 95% home range
hr_m <- nest_5hr_mth %>%  
  mutate(n = map_int(data, nrow)) %>% 
  filter(n > 20) %>% 
  mutate(
    hr_akde_ou = map(data, ~ hr_akde(., model = fit_ctmm(., "ou")))
  )

head(hr_m)

#to long format so we can use area function
hr_m2 <- hr_m %>% select(-data) %>%
  pivot_longer(hr_akde_ou, names_to = "estimator",
               values_to = "hr")

str(hr_m2, 2)
#area function 
hr_m2.area <- hr_m2 %>%
  mutate(hr_area = map(hr, hr_area)) %>%
  unnest(cols = hr_area)

head(hr_m2.area, 10)

There are only 2937 monthly periods as compared to 4100 tri-weekly periods and 6045 biweekly periods (before filtering to those with more than 20 locations), so I'm confused how the error pertains to memory if the tri-weekly periods which run should take more memory than the monthly periods which also won't run? I've also tried splitting the data into thirds to run in three separate batches, but am thrown the same vector allocation error with the same Gb figure.

Any insight or suggestions about what to try next would be greatly appreciated! Thanks, El Pero elktrks_5.zip
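Since the same Gb figure appears even when the data are split into batches, one thing that might be worth checking is whether individual windows look unusual. The sketch below summarises one window's sample size, spatial extent and timestamp spacing. It assumes the amt track column names `x_`, `y_` and `t_`; `window_summary` and the toy data are hypothetical and only there so the example runs on its own.

```r
# Sketch: per-window diagnostics (assumes amt-style columns x_, y_, t_)
window_summary <- function(d) {
  dt <- diff(sort(as.numeric(d$t_)))   # seconds between consecutive fixes
  data.frame(
    n = nrow(d),
    x_range = diff(range(d$x_)),       # spatial extent in x
    y_range = diff(range(d$y_)),       # spatial extent in y
    min_dt_s = if (length(dt) > 0) min(dt) else NA_real_,
    n_dup_times = sum(duplicated(d$t_))
  )
}

# Toy window: 5 h fixes, one duplicated timestamp, one far-away location
toy <- data.frame(
  x_ = c(0, 10, 20, 5000),
  y_ = c(0, 5, 10, 15),
  t_ = as.POSIXct("2020-01-01 00:00:00", tz = "UTC") + c(0, 18000, 18000, 36000)
)
s <- window_summary(toy)
```

Applied to the nested data, something like `do.call(rbind, lapply(hr_m$data, window_summary))` (untested) would give one row per window, which could then be sorted by extent or by `min_dt_s` to look for outliers.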

jmsigner commented 3 years ago

You could try something like this (untested):

hr_tw <- nest_5hr_tri %>%  
  mutate(n = map_int(data, nrow)) %>% 
  filter(n > 20) 

for (i in seq_len(nrow(hr_tw))) {
  res <- hr_akde(hr_tw$data[[i]], model = fit_ctmm(hr_tw$data[[i]], "ou"))
  saveRDS(res, paste0("akde_", i, ".rds"))
}

This will save all aKDE home-ranges on your hard drive and R does not need to allocate a big vector.
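A variant of the loop above can catch errors, so that one failing window does not stop the run and the failing index is recorded. This is a minimal sketch under stated assumptions: `run_and_save` and `toy_fit` are hypothetical names, and the toy function stands in for the `hr_akde()` call so the example runs without amt.

```r
# Sketch: save each result to disk, and record the index and message of failures
run_and_save <- function(items, fit, dir = tempdir()) {
  failed <- data.frame(i = integer(0), message = character(0))
  for (i in seq_along(items)) {
    res <- tryCatch(fit(items[[i]]), error = function(e) e)
    if (inherits(res, "error")) {
      # keep going, but remember which element failed and why
      failed <- rbind(failed, data.frame(i = i, message = conditionMessage(res)))
    } else {
      saveRDS(res, file.path(dir, paste0("akde_", i, ".rds")))
    }
  }
  failed
}

# Toy stand-in that fails on the third element
toy_fit <- function(x) if (x > 2) stop("cannot allocate vector") else x^2
failed <- run_and_save(list(1, 2, 3), toy_fit)
```

With the thread's data the call might be `run_and_save(hr_tw$data, function(d) hr_akde(d, model = fit_ctmm(d, "ou")), dir = ".")` (untested); the returned data frame lists the values of `i` that failed.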

elmpero commented 3 years ago

Thanks for the suggestion, @jmsigner - unfortunately I am thrown the same vector allocation error. It contributes to my suspicion that there may be something else going on.

jmsigner commented 3 years ago

What is the value of i when the error occurs?