USFS-PNW / Fia-Biosum-Manager

User interface and main code repository for Biosum
http://biosum.info/
Other
3 stars 3 forks source link

Migrate POP tables to SQLite #252

Closed lbross closed 2 years ago

lbross commented 3 years ago
  1. Create new master.db to hold pop tables
  2. Update pop table load process to incrementally update tables when new plots are added after original project creation rather than the current process: drop and recreate. The OK Tabling program requires pop tables for all plots, not just those most recently added. Database size should no longer be an issue with SQLite because there are no size limitations.
  3. The pop tables are used when calculating the cond prop adjustment factor during plot load. The queries for this calculation include the currently selected evalid and rscd so having records on these tables for multiple evaluations should not be an issue
  4. Recommend adding Growth_Acct and LAND_ONLY fields to the POP_EVAL table. These fields are required by OK Tabling and we currently add them to the Optimizer context database. They exist and are populated in FIADB. Why not just add them from the beginning?
  5. Update new project algorithm to add project data source entries for pop tables as .db format
  6. Write version_control to migrate pop tables from .accdb to sqlite for existing projects and update existing project data source entries
lbross commented 3 years ago

Items 1-4 on this are done. I plan to hold off on migrating BioSum to use the SQLite POP tables until after the next release, but they will be created for OK Tabling reporting. I still need to update Optimizer to use the SQLite tables when creating the context database rather than importing from the Access tables. The context db creation should be faster. Note that the POP tables in the master database are used to populate the screens in addition to archiving the evals. When the screens are populated, temporary records are added with a biosum_status_cd of 9. When the plots finish loading, the biosum_status_cd is changed to a 1. Note that if the screens are populated but plots are NOT successfully loaded, there will be records in the tables with a biosum_status_cd of 9 that may not be used by any master.plots. The 2 most likely cases of this happening are the analyst deciding not to load plots after looking at the screens or an error occurring during the plot load process. The design (pre-existing) cleans these excess records out every time the screens are loaded. I don't think potentially having the extra records around is a big deal but wanted to record this in case any questions come up later.

lbross commented 3 years ago

@jsfried What are your thoughts about copying the POP tables from legacy projects to the new SQLite Pop tables? I can write a script to transfer what is currently in the .accdb POP tables, but as we know, this will only be from the most recent plot load. I've written some version control code that creates the SQLite Pop tables in legacy projects to avoid errors. Records will be added if plots are loaded.

jsfried commented 3 years ago

@lbross Programming and testing this seems like overkill if we can come up with a protocol alternative for me (and the post-doc and any other analyst working with old projects) to follow to bring along the POP tables from a limited number of existing projects that we care about moving forward (3 FTCE projects, Blue Demo, Blue12, CEC). i.e., a blank SQLite DB to move stuff into and an Access compatible SQL query that will write the needed info from the existing POP tables into new tables in the .DB -- we are perfectly capable of following instructions to do this and pasting an SQL script into a blank Access query window.

lbross commented 3 years ago

Thanks for your prompt reply. In order to rebuild the complete POP tables, you would need to have access to the FIA Datamart for all of the evalids that were used in the older projects. Do you think it is reasonable that these would be accessible? And do you want the SQL queries to read from MS Access FIA Datamart tables, SQLite Datamart tables, or a set of scripts for each platform?

jsfried commented 3 years ago

@lbross Oh fiddlesticks... had forgotten about the FTCE project having overwritten POP tables-- the others may be OK (since they are single state). Whichever is easier for you (Access or SQLite). Jeremy will need to build such as POP set in one or the other of these DB formats and have reached out to Olaf on advice as to where to find source data to populate such master POP tables (that include all EVALIDs for Volume and Area sets of 10 panels starting with 2001-2010 up through 2011-2020. While we have used the GRM EvalIds for a couple of projects involving FVS validation, Jeremy does not ee a need to move those forward.

lbross commented 3 years ago

For now I will write the migration scripts in SQLite since that is Olaf's platform of choice. My hunch is that the data mart tables will be more readily available in that format. The output will definitely be in SQLite so that OK Tabling can consume it. There will be some Access queries so that you can easily determine which evalids you need.

Updated: Do we have a way to know which evalids were used to load plots into a project? I'm looking at the plot and cond tables and don't see any indicators.

lbross commented 2 years ago

Revisit how we manage the status codes of these tables. Reloading the tables from the same state data mart may cause the status codes to be set from 1 to 9 so data could be lost. Consider a separate set of temporary tables to feed the screens and plot load process.

lbross commented 2 years ago

Add an audit indicating which plot.plot_cn's don't have records in the PPSA table

lbross commented 2 years ago
  1. Move the POP tables that are currently in the master.mdb to the temporary .accdb that is used to calculate the adjustment factors. It is confusing having two sets of POP tables in a project and the .mdb versions are neither historical nor reliable as they are used to populate the screens.
  2. When the button is clicked on the eval selection page, check for existing evals in the same state on the pop_eval table. If they exist, pop a warning message. Verbiage needed from @jsfried.
sebbusby commented 2 years ago

New warning message is working correctly when user attempts to load new plots using the same state and evalid from previously loaded plots. Nice addition.

lbross commented 2 years ago

This query will identify plots that are in the plot table but missing from the pop_plot_stratum_assgn table and should be run against the optimizer_results.db before using the Tabling program. At some point, we may wish to incorporate it into an audit.

SELECT biosum_plot_id, cn FROM plot 
WHERE trim(cn) NOT IN (SELECT plt_cn FROM pop_plot_stratum_assgn WHERE trim(plot.cn) = pop_plot_stratum_assgn.plt_cn)
lbross commented 1 year ago

Update sql that creates temporary access tables to resize pop_estn_unit.p1source to varchar(50) and remove field from pop_eval schema because it no longer exists on that table.