Closed wright13 closed 2 years ago
uploaded old .R script for connecting to SQL database (connection path removed) incase script or table structure is helpful. old_SQL_FTPC_database_script.r
That script is helpful! So looking at the data export you sent me, my understanding is that...
Correct me if I'm wrong on any of that. Are there any other exports in the old script that we want to include in the package? Table structure wouldn't necessarily be identical, just thinking about compiling a list of data tables that we want to have.
Gathering my thoughts ahead of our call - no need to respond to this before we chat :)
To do:
FTPC database documentation with relationship diagram
Fig.1 in that doc shows the relationship. These are the Tables needed:
tbl_Lg_Woody_Individual tbl_Tree_Canopy_Height tbl_Presence tbl_Sm_Woody_Tally tbl_Understory_Cover tbl_Woody_Debris
Would it be easiest to join/relate all associated tables and then pare down columns?
I think I would pare down the columns first, just to reduce the amount of data that we have to retrieve from the database. You'll probably have to keep the ID columns until after joining though. Either way will work fine though!
ok I'm going through each table in relationship diagram and using select() to choose columns like this:
tbl_Lg_Woody_Individual <- tbl(DB, "tbl_Lg_Woody_Individual") %>% select(Large_Woody_ID, Event_ID, Species_ID, Life_Form, Quad, Status, Height, Height_Dead, Boles, DBH, DBH_Basal, Vigor, Fruit_Flower, Rooting, Foliar, Caudex_Length, Shrublike_Growth, Resprouts, Measurement_Type)
Or is it better to just show the columns being dropped maybe??
tbl_Lg_Woody_Individual <- tbl(DB, "tbl_Lg_Woody_Individual") %>% select(-Sort_Order, -SSMA_TimeStamp)
I guess that could screw things up though if a column we don't want suddenly got added to table.
added "table_structure.R" to repository. It contains select() columns for each data table (and sub-data table, if needed)
Still need to do tbl_Events, tbl_Plots, tbl_Locations, tbl_Site, tlu_Species, and xref_Park_Species_Nativity.
How do we want the data to be formatted for use in this package? E.g. columns present (should each data table have Unit_Code, other columns from Visit?), column name format (I think this is already pretty consistent). Another way to think of this is how should a csv export of the raw data look?
@wright13 will work on getting more familiar with the data @jakegross808 will let Sarah know which views/queries (view = saved SQL query, basically) to look at