eco4cast / neon4cast-beetles


counts, abundance, and trap-nights #7

Open cboettig opened 4 years ago

cboettig commented 4 years ago

Hi team, I'm still worried about our formulation of 'counts / trap-nights'. Sampling effort is not as clean as we've assumed (i.e. 40 traps per site up to 2017, then 30 traps after). That's the protocol, but the actual number of traps set varies a bit more than that, as we can tell from the field data table, e.g.

bet_fielddata %>% count(siteID, collectDate) 

shows that a given collectDate usually collects the expected 40 traps prior to 2018 and 30 traps after, but might also collect far fewer. Here's how many traps are returned:

 bet_fielddata %>% count(siteID, collectDate) %>% rename(n_traps = n) %>% count(n_traps, sort=TRUE)
# A tibble: 40 x 2
   n_traps     n
     <int> <int>
 1      30   999
 2      40   879
 3      15    65
 4      36    61
 5      29    55
 6       1    49

So to count up "trap-nights" at a given site for a given collection date, we need to use the fielddata table to add up the number of days each individual trap was set at each site:

bet_fielddata %>% group_by(siteID, collectDate) %>% summarize(trapnights = sum(trappingDays))

Also note this cannot be done directly from the bet_sorting table, even though we have collectDate and setDate, because that table does not include any traps that were set and collected but wound up being empty on that bout.

I think this number from the fielddata for trap nights is better, though there may still be improved ways to adjust for effort.
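To make the point about empty traps concrete, here is a minimal sketch on invented toy data. The column names `siteID`, `collectDate` and `trappingDays` come from the snippets above; `trapID`, `individualCount`, the toy tables and all values are assumptions for illustration only, not the real NEON tables.

```r
library(dplyr)

# Toy stand-in for bet_fielddata: one row per trap per collection bout.
# Values are invented; trapID is an assumed column name.
fielddata <- tibble(
  siteID       = c("A", "A", "A", "B", "B"),
  collectDate  = as.Date(rep("2019-06-01", 5)),
  trapID       = c("t1", "t2", "t3", "t1", "t2"),
  trappingDays = c(14, 14, 13, 14, 28)
)

# Toy stand-in for the sorting table: only traps that caught something appear.
# individualCount is an assumed column name.
sorting <- tibble(
  siteID          = c("A", "A", "B"),
  collectDate     = as.Date(rep("2019-06-01", 3)),
  trapID          = c("t1", "t1", "t2"),
  individualCount = c(3, 2, 7)
)

# Effort comes from the field data, so empty traps still contribute trap-nights.
effort <- fielddata %>%
  group_by(siteID, collectDate) %>%
  summarize(trapnights = sum(trappingDays), .groups = "drop")

counts <- sorting %>%
  group_by(siteID, collectDate) %>%
  summarize(count = sum(individualCount), .groups = "drop")

# Join from the effort side so a bout with zero catch keeps its row.
catch_rate <- effort %>%
  left_join(counts, by = c("siteID", "collectDate")) %>%
  mutate(count = coalesce(count, 0),
         count_per_trapnight = count / trapnights)

catch_rate
```

In the toy data, site A has 41 trap-nights even though only one of its three traps shows up in the sorting table; computing effort from the sorting table alone would see just that one trap.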

cboettig commented 4 years ago

Relatedly, is it odd not to also be adjusting richness for sampling effort (i.e. trap nights)? E.g. some sites will have far fewer trap-nights in a given bout, month, or year, and surely this contributes to our ability to detect rare species at that site?

taddallas commented 4 years ago

That makes sense as a way to calculate trap nights, from what I understand after a brief read. I'm really curious how much trap-nights would actually influence richness. In my mind, richness should be more robust to sampling error: if we assume that each species has the same probability of being sampled across traps, trap-nights become a bit less important in estimating overall richness. We could explore this by removing certain sampling periods from the raw data and re-estimating richness. If I were to bet on it, I would argue that we'd have to exclude around 20% or so of sampled traps to see an appreciable difference in species richness. But I might just be overestimating our ability to get at those rare species.
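A rough sketch of what that kind of removal experiment could look like, in base R on simulated data. Everything here (the 40 traps, the 25-species pool, the skewed abundances, the helper name `richness_after_dropping`) is invented for illustration, and it drops individual traps rather than whole sampling periods; the same idea would apply to bouts in the real tables.

```r
set.seed(42)

# Invented toy data: 40 traps, each catching 0-5 beetles drawn from a
# 25-species pool with skewed relative abundances (a few common, many rare).
n_traps  <- 40
pool     <- paste0("sp", 1:25)
rel_abun <- 1 / seq_along(pool)
catches <- do.call(rbind, lapply(seq_len(n_traps), function(i) {
  n_caught <- sample(0:5, 1)
  data.frame(trapID  = rep(i, n_caught),
             species = sample(pool, n_caught, replace = TRUE, prob = rel_abun))
}))

# Hypothetical helper: mean observed richness after randomly discarding
# a given fraction of traps, averaged over n_reps random draws.
richness_after_dropping <- function(catches, n_traps, frac_dropped, n_reps = 200) {
  n_keep <- round(n_traps * (1 - frac_dropped))
  mean(replicate(n_reps, {
    kept <- sample.int(n_traps, n_keep)
    length(unique(catches$species[catches$trapID %in% kept]))
  }))
}

fracs <- c(0, 0.1, 0.2, 0.4, 0.6)
sens <- data.frame(
  frac_dropped  = fracs,
  mean_richness = sapply(fracs, function(f)
    richness_after_dropping(catches, n_traps, f))
)
sens
```

How quickly `mean_richness` falls as `frac_dropped` grows depends heavily on how many species are rare, so the toy numbers say nothing about the real data; the point is only the shape of the experiment.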

cboettig commented 4 years ago

@taddallas great point, we should look at the data! Since there's already some natural variation in trapnights across the data, for a first pass we can see how richness correlates with effort. Using the quick tables in my Rmds,

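library(dplyr)
library(ggplot2)

# richness and effort are the quick tables from the Rmds;
# left_join() with no `by` joins on all columns the two tables share
df <- left_join(richness, effort) %>%
  mutate(month = lubridate::month(month, label = TRUE))

# richness (n) against trapnights, one panel per month, with a linear fit
df %>%
  ggplot(aes(trapnights, n)) + geom_point() + facet_wrap(~month) +
  geom_smooth(method = 'lm', formula = y ~ x)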

[Figure: trapnights-by-richness. Richness (n) against trapnights, one panel per month, with a linear fit.]

Err... I dunno what that says. Certainly bouts at sites with really low trapnights will produce low estimates of richness. No idea why there are some outliers with huge trapnights, but my wild guess is that those are driven by "whoops, I thought you picked up last month's traps! This trap has been out here for ages!" In any event, the outliers seem to get only average levels of richness.

Maybe this needs to be plotted by site to get a better sense of whether we are saturating species richness or not (there's probably some well-developed theory to answer that question). I really don't have any viable biological intuition to go on, but given how many traps seem to show up empty or at least get fewer than one beetle a night, catching beetles seems to be harder than I thought, and so I'd guess you'd need a hell of a lot of trap effort to catch all the species present.

I suppose one could spitball the expected carabid species richness from some occurrence maps for the "area" (iNaturalist? Map of Life?). According to Wikipedia, there are roughly 2000 species of North American carabids, while it looks like there are 744 carabids in the NEON data. And 129 of those unique species have been seen only once by all the NEON traps.
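One standard piece of that theory uses exactly the "seen only once" count: the Chao1 estimator, which gives a lower bound on total richness from the number of singletons and doubletons. This is not something proposed in this thread, just a commonly used option. The sketch below uses the bias-corrected form on an invented abundance vector; the number of doubletons in the NEON data isn't given here, so no real estimate is attempted.

```r
# Invented abundance vector: total individuals caught per species.
abund <- c(120, 45, 30, 12, 5, 3, 2, 2, 1, 1, 1, 1)

# Bias-corrected Chao1: S_obs + f1 * (f1 - 1) / (2 * (f2 + 1)),
# where f1 = species seen exactly once and f2 = species seen exactly twice.
chao1 <- function(abund) {
  abund <- abund[abund > 0]
  s_obs <- length(abund)
  f1 <- sum(abund == 1)  # singletons
  f2 <- sum(abund == 2)  # doubletons
  s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
}

chao1(abund)  # 12 observed species, f1 = 4, f2 = 2, so 12 + 12 / 6 = 14
```

A large gap between the observed richness and an estimate like this at a site would suggest the traps are far from saturating the species present.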