Closed by wright13 3 years ago
I figured out what I was doing wrong that kept ReadFTPC from returning database table connections (i.e. without calling dplyr::collect() first)! That means we should be able to do our filtering and summaries on the database side for the large data tables without making any big changes to the existing code. Some things will be unavoidably slow, like exporting the full dataset to CSV, but it looks to me like most of the summary statistics listed in the background doc can be done using dbplyr.
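For anyone unfamiliar with the pattern, here is a rough sketch of what filtering and summarizing on the database side looks like with dbplyr. It does not use ReadFTPC; an in-memory SQLite database stands in for the real one (this assumes the RSQLite package is installed), and the table name `species_coverage` and its columns are made up for illustration.

```r
library(DBI)
library(dplyr)  # dbplyr must also be installed; dplyr uses it for database tables

# Stand-in for the real database: an in-memory SQLite table.
# Table and column names here are hypothetical, not the actual FTPC schema.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "species_coverage", data.frame(
  plot_id = c(1, 1, 2, 2, 3),
  species = c("A", "B", "A", "C", "A"),
  cover   = c(10, 20, 30, 40, 50)
))

# tbl() returns a lazy table reference; no rows are read into memory yet
cover_tbl <- tbl(con, "species_coverage")

# These verbs are translated to SQL and only run when results are requested
summary_query <- cover_tbl %>%
  filter(species == "A") %>%
  group_by(species) %>%
  summarise(mean_cover = mean(cover, na.rm = TRUE), n_rows = n())

# Print the SQL that dbplyr generated
show_query(summary_query)

# collect() runs the query and pulls only the small summarized result into R
summary_df <- collect(summary_query)

dbDisconnect(con)
```

The point is that `collect()` is called once at the very end, on the summarized result, so the full table never has to be loaded into R.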
Species coverage is a large enough dataset that reading it into memory from the database is fairly time-consuming. Sarah will come up with some options to run by Jake.