Closed lbross closed 2 years ago
test comment-- Lesley, can you add Sebastian to the distribution of when comments get entered for issues.
Adding a 2nd test comment to see if Sebastian gets the emails now.
Sebastian hasn't accepted the invite yet, so he will probably not get an email.
One more comment test?
Several fields don't enter Access nicely when imported from SQLite 3 database via ODBC driver. This leads to their being unavailable for use in linking, indexing, etc. among tables, rendering analysis use in Access very problematic (can't even filter or sort on some of these the way they come into Access, I think). While it is true that one can gather up these fields and tables from various Access databases scattered across the project hierarchy, the point of the SQLite database is to provide one file that stores outputs and input assumptions and parameters as were in place at the time the analysis was run. Ideally, one would be able to link to this SQLite from Access to do analysis; however, linking has been problematic and it is necessary to actually import the tables into an Access database (which is workable). However the data types to which variables import depends on how they are specified in the SQLite 3 database. For import to work under current ODBC drivers, the types TEXT, MEMO, and BYTE need to be avoided whenever possible (there may be a field or two that can be longer than 255 characters where we may have to accept a MEMO type, that will transition as an ill behaved, from the standpoint of linking and searching and filtering etc. LONG TEXT in Access). The attached spreadsheet provides recommendations as to what all fields should show up in Access as (to be consistent with their Access representations elsewhere in the BioSum hierarchy and to ensure their usability by analysts) and a crosswalk page indicating what I think the types need to be in SQLite to accomplish this. In most cases, there are choices to be made that I am unaware of the implications of- for example, both CHAR (X) and VARCHAR (X) enter Access as SHORT TEXT (X) fields-- I don't know which of these is better or more compatible in terms of other purposes and needs, but TEXT is definitely not a happy conversion from SQLite to Access.
Altogether, there are 72 variables for which a type change is needed in the SQLite DB3. A few of these are changes to DATE-- not sure how practicable or critical those are-- would be nice if it came through as DATE in Access, but it would not surprise me if that turns out to be challenging and could live with DATES being represented as SHORT TEXT in Access (perhaps VARCHAR (50) in SQLite).
Please see attached EXCEL workbook for details. DataTypeChangeNeedsForDB3export_20220413.xlsx
I started work on the spreadsheet and didn't quite make it all the way through. But I will post what I have so you can review before tomorrow if desired. For background, the data types in v5.9.0 were copied from the MS Access types of the source tables, but it seems that some of these remain problematic. In column 5 I suggest the SQLite data type that we should change to for those columns that are problematic. Since I'm not an expert on database schemas, here is the logic I followed:
Logic seems sound to me. I have surprise presentation in an hour, then meetings, so can't do anything today. I looked at POP tables in project/db/master.db -- makes sense to pattern off of how these are configured in the FIADB.DB distribution, probably. Looks like some more experimentation with the ODBC driver is needed to nail down some of these questions-- the empirical approach may be best. I wonder if there is more than one kind of date format in SQLite. Also does the ODBC driver work in both directions? If so, perhaps we should export Access data to DB and see what it becomes.
I tried rerunning the optimizer scenario to get a fresh Access optimizer results database and then exporting that and I am still getting TEXT in the POP tables and NUM type for the date attribute. I am also getting a fail after exporting the SQLite file in that I can't then run optimizer to load the same or another scenario. The Optimizer Scenario selection window just sits there and won't respond to any button- a task manager closeout is required.
I found that the SQLite call we were using the create the POP tables in the context.db takes it upon itself to change the data types to SQLite native data types when it copies the table. This needs to be reworked to create the table with the desired data types and then to copy the data over. Fixing this may delay the release of the .msi. I'm going to hold off on looking at the post-export fail until I get the data types fixed.
Understood (about the delay). Better to take it slow and get a fully useable export. I will try to connect with Olaf about the questions you posed that you thought he might be able to address.
We may not need Olaf's help. When we start producing the pop tables with varchar rather than text data types, that problem may go away. Date may still be outstanding. You can try linking to the pop tables in master.db with the date fields and see if they come across okay in Access. If they do, then we should be okay once the pop tables in context.db have our desired data types.
Here is the completed spreadsheet if you want to review. I did not go through the POP tables as I think those will self-correct once we update how they are created. DataTypeChangeNeedsForDB3export_20220413.xlsx
Looks good to me. I understand that any extra columns in tables that are not BioSum native will come as they come and we'll live with that.
I fixed the pop tables and uploaded a sample optimizer_results.db3 to the Temp30Days folder. This is the context.accdb only, to keep it small and is for BlueDemo. If you're able to work with these tables through MS Access, then our questions about VARCHAR and DATE fields are resolved. Still working on all the other tables.
These all look great! DATE came through with flying colors (though there is no data in the POP tables so I can't completely confirm until there is).
You are too efficient! There is data in the POP tables now. I noticed that the data was missing and tried to repost before you pulled them down, but I was too slow.
Well now there is data in the tables, but the date columns are all null. I guess I'll assume that they'd be OK if they were not null. They are not null in the FIADB version of the POP tables but the copy I have of the Bluedemo project currently has no content in the pop tables in its .db
The data for the pop tables was derived from the master_pop_PNW.db because the BlueDemo POP tables were rebuilt using this source. I traced it through and the date columns are null in master_pop_PNW.db. Hence they are null in master.db and ultimately context.db. When we have more recent projects with recent pop tables, we should be able to take a closer look at the dates. I just posted a new version of context .db to Box containing both optimizer_results and context data. This version should have the requested/updated data types. Once @jsfried approves, I will generate a new .msi with this and the other fixes.
I was already good with the data types, so let's consider this complete.
OK. I will generate a new .msi this afternoon.
The table schemas in optimizer_results.db use the standard SQLite datatypes: TEXT, INTEGER, and REAL. These data types cause issues when the tables are linked in MS Access using an ODBC driver. The FIA data mart uses MS Access data types even though it is an SQLite database. Olaf confirmed that the Tabling program can work with MS Access data types. The schemas need to be updated for these tables so that they use the same data types as the FIA data mart or BioSum. The easiest way to do this is to re-use the scripts that create these tables originally in MS Access, when applicable.