kaitlyngaynor / gorongosa-mesocarnivores

removing detection data corresponding to missing covariates before fitting the models #98

Open klg-2016 opened 3 years ago

klg-2016 commented 3 years ago

https://github.com/kaitlyngaynor/gorongosa-mesocarnivores/blob/91d55f0948657549a9f2f77570bf1fb5a8165d79/scripts/multi-season%20model/multi-season-sample.Rmd#L55

Am I right in my understanding of what this line is doing? Also, I kind of understand why it's doing this, but I'm still confused.

"For missing siteCovs, the entire row of data must be removed." --> so in this example, if a site's elevation or forest cover is missing, then the whole site is scrapped?

"However, for missing yearlySiteCovs or obsCovs, only the corresponding observation are removed." Does this just mean that the single data point is removed? In this example, obsCovs are the dates, so if they're missing (because a survey wasn't conducted), it's only the date column across the row that has NA?
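The two quoted rules can be mimicked in base R on toy data. This is only a sketch of the behavior the vignette describes, not unmarked's own code; the objects `y`, `elev`, and `date` are made up for illustration.

```r
# Hypothetical data: 3 sites (rows) x 2 surveys (columns)
y    <- matrix(c(1, 0,
                 0, 1,
                 1, 0), nrow = 3, byrow = TRUE)    # detection history
elev <- c(100, NA, 250)                            # site-level covariate (siteCov)
date <- matrix(c(10, 20,
                 12, 22,
                 NA, 24), nrow = 3, byrow = TRUE)  # survey-level covariate (obsCov)

# Rule 1: a missing siteCov drops the whole site (row)
keep_site <- !is.na(elev)
y_kept    <- y[keep_site, , drop = FALSE]
date_kept <- date[keep_site, , drop = FALSE]

# Rule 2: a missing obsCov drops only that single observation
y_kept[is.na(date_kept)] <- NA

y_kept
```

Here site 2 disappears entirely because its elevation is missing, while site 3 stays and loses only the one survey whose date is missing.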

kaitlyngaynor commented 3 years ago

I would guess that for missing yearlySiteCovs, you have to remove the records for that entire year, while for missing obsCovs, you remove only the observation. But that's not what this says....

This DOES suggest to me that we can use cameras that were only up for one year, though?

Again, not totally sure.

To address the code-specific question: != is a comparison operator meaning "not equal to". The function is.na() tests whether each value inside it is NA and returns TRUE or FALSE accordingly. I still don't totally understand this line out of context (base R can be a little inscrutable), but maybe that will help you.
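A small sketch of how those two pieces combine, assuming the line in question compares two is.na() results with != (the matrices `y` and `DATE` below are invented toy data, not the project's):

```r
# Hypothetical 2 x 3 matrices: rows are sites, columns are surveys
y    <- matrix(c(1, NA,  0,
                 0,  1, NA), nrow = 2, byrow = TRUE)
DATE <- matrix(c(10, 20, NA,
                 11, NA, NA), nrow = 2, byrow = TRUE)

is.na(y)      # TRUE wherever a detection is missing
is.na(DATE)   # TRUE wherever a date is missing

# TRUE only where exactly one of the two is missing;
# FALSE where both are present or both are missing
mismatch <- is.na(DATE) != is.na(y)
mismatch
```

Because both is.na() calls return logical matrices of the same shape, != compares them cell by cell, so `mismatch` can be used directly as an index into either matrix.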

klg-2016 commented 3 years ago

I'm not sure... Would our discussion of not using date as an obsCov affect your comment on using cameras up for only one year?

Okay--that supports what I thought the code was doing, but still not 100% sure on why. I've added this topic to my list of things to check out in the Google group (as a place to start)

kaitlyngaynor commented 3 years ago

> I'm not sure... Would our discussion of not using date as an obsCov affect your comment on using cameras up for only one year?

I don't think so, because I think we should use year as a yearlySiteCov. If I'm understanding correctly.

> Okay--that supports what I thought the code was doing, but still not 100% sure on why. I've added this topic to my list of things to check out in the Google group (as a place to start)

Ok. How essential is this to our analysis, since we don't have any missing dates? Can we just ignore this bit?

klg-2016 commented 3 years ago

Yes, I think that's right for year.

That's a good question. Would NAs (from periods when the camera was down) count as missing dates? Part of this line tests whether the value in the detection history is NA, which it would be when the camera is down. But if we're not using DATE the same way the sample code does (there it's the matrix of Julian dates from each survey), then this line of code would be moot. You commented (here: https://github.com/kaitlyngaynor/gorongosa-mesocarnivores/issues/99#issuecomment-747820352) that you don't think we're going to have obsCovs, right? So this would be double moot?

kaitlyngaynor commented 3 years ago

Ooh. Hmm, yeah, I think this may not be relevant for us. I just had a look at the PDF, and it seems like this is a workaround that lets you compare models using AIC when some of them include yearlySiteCovs or obsCovs with missing values, since unmarked then drops the corresponding observations from those models only. You can only use AIC for model comparison if the models are fit to the exact same underlying data. Since we aren't using date as a covariate, I don't think it applies:

> However, for missing yearlySiteCovs or obsCovs, only the corresponding observation are removed. Thus, if unmarked removes different observations from different models, the models cannot be compared using AIC. A way around this is to remove the detection data corresponding to missing covariates before fitting the models. The crossbill data have missing dates and so we remove the associated detection/non-detection data.

... if I'm understanding correctly
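A toy illustration of why the workaround matters, in base R only. The matrices are invented, and the masking line is written in the is.na() != is.na() style discussed above as an assumption about what the sample code does:

```r
# Hypothetical data: 2 sites (rows) x 3 surveys (columns)
y    <- matrix(c(1, 0, 1,
                 0, 1, 0), nrow = 2, byrow = TRUE)
DATE <- matrix(c(10, NA, 30,
                 11, 21, NA), nrow = 2, byrow = TRUE)

# Observations available to a model WITHOUT date vs. one WITH date
n_without_date <- sum(!is.na(y))                  # all detections usable
n_with_date    <- sum(!is.na(y) & !is.na(DATE))   # detections with a date

# Workaround: blank out detections whose date is missing, up front
y_masked <- y
y_masked[is.na(DATE) != is.na(y)] <- NA

# Now every model, with or without date, sees the same observations
n_masked <- sum(!is.na(y_masked))
c(n_without_date, n_with_date, n_masked)
```

Before masking, the two candidate models would be fit to different numbers of observations; after masking, both are fit to the same set, which is the condition AIC comparison needs.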

klg-2016 commented 3 years ago

I buy that--it makes sense to the extent that I understand what we're working on. Okay, so I will cut that code and keep going. I'm digging through the Google group now to see if there's anything about needing consistent sites across seasons, and I'll pay attention to see if there's anything that touches on this stuff as well to confirm.