Open EmilyMarkowitz-NOAA opened 11 months ago
I have also initiated a future database issue/concern with Autumn and OFIS so this is sitting in their parking lot when they have time to address it. It's our (GAP's) contention that database structure and infrastructure changes are solidly in OFIS' court to create and maintain. Remains to be seen if we win that argument in a timely fashion though!
A few things to update here:
1 - GAP's databases (e.g., RACEBase and RACE_DATA) will now be maintained by GAP in the form of Chris Anderson. We will be able to tap into DB expertise from OFIS to advise us, but developing databases and structures within our schemata will be our thing going forward.
2 - Alex D, as the Special Collections Coordinator, will need to weigh in on this process. She is developing/maintaining the RFP process and Google Forms, so that makes sense for the front end and recent data. She and I are in the process of starting to map out how to recover historic data.
3 - I agree with Em that this is important and needs to be addressed in the short term to support our data reports and to meet our obligations for storing these data and making some form of them searchable and accessible.
4 - I would place digitizing the data from Special Collections in the Long Term goal category and maybe even Long Long Term. Actually, I may need to be convinced that we need to store the special collection data at all. It would be simpler for us to store just the particulars of a project and a count of how many samples were collected rather than the data themselves.
Data product requested: A few new tables in Oracle to store our special collections data, possibly living in RACE_DATA or RACEBASE. Open to new names for these tables, but I'll call them special_projects and special_project_collections for the time being. I've put examples for these new tables in the FUTURE_ORACLE google spreadsheet for us to play with.

special_projects table: A table listing all approved and declined special projects (e.g., a cross between the special project google form output and this more cleaned up table I use in the Bering Sea data report). We should have a historical reference of all of the special projects we do or that were requested of us and, more immediately, I need a table like this for the data report and other automated reporting.

special_project_collections (and a special_project_collections_var reference table): A table listing digitized versions of everything we collect for special projects on the survey. This table will be a bit trickier to set up and I have some ideas, but it should be connected to the special_projects table through a, let's call it, special_projects_id key column and a collection_type key column. This will allow us to quantify how many, say, genetics/fish condition/otolith/stomach/etc. samples we collect. I use a by-hand summarized version of this table for the data report. Instead of creating a new table, it would be great if this table could be integrated with the SPECIMEN table, which I think is also due for some reorganizing.

I know this will require further discussion and coordination, but I wanted to formalize this idea with a GitHub issue for us. It is hugely problematic that we do not currently document what special projects we do/don't do or save the data collected for those special projects.
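To make the proposed relationship concrete, here is a rough sketch of the two tables and the kind of per-type sample count the data report needs. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for Oracle, so the types and syntax would need adapting. Only the table names, special_projects_id, and collection_type come from the proposal above; every other column (project_title, status, collection_id, sample_count) and all of the rows are made-up placeholders, and the special_project_collections_var reference table is left out.

```python
import sqlite3

# In-memory SQLite stands in for Oracle; this is an illustration, not the real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per requested special project, approved or declined.
-- project_title and status are hypothetical columns.
CREATE TABLE special_projects (
    special_projects_id INTEGER PRIMARY KEY,
    project_title       TEXT NOT NULL,
    status              TEXT NOT NULL CHECK (status IN ('approved', 'declined'))
);

-- One row per project and collection type.
-- collection_id and sample_count are hypothetical columns.
CREATE TABLE special_project_collections (
    collection_id       INTEGER PRIMARY KEY,
    special_projects_id INTEGER NOT NULL
        REFERENCES special_projects (special_projects_id),
    collection_type     TEXT NOT NULL,
    sample_count        INTEGER NOT NULL
);
""")

# Placeholder rows, invented only so the query below returns something.
conn.executemany(
    "INSERT INTO special_projects VALUES (?, ?, ?)",
    [
        (1, "Example project A", "approved"),
        (2, "Example project B", "declined"),
        (3, "Example project C", "approved"),
    ],
)
conn.executemany(
    "INSERT INTO special_project_collections "
    "(special_projects_id, collection_type, sample_count) VALUES (?, ?, ?)",
    [(1, "otolith", 10), (1, "stomach", 5), (3, "otolith", 7)],
)

# Total samples per collection type across approved projects.
rows = conn.execute("""
    SELECT c.collection_type, SUM(c.sample_count) AS n_samples
    FROM special_project_collections AS c
    JOIN special_projects AS p
      ON p.special_projects_id = c.special_projects_id
    WHERE p.status = 'approved'
    GROUP BY c.collection_type
    ORDER BY c.collection_type
""").fetchall()

for collection_type, n_samples in rows:
    print(collection_type, n_samples)
```

Keeping declined projects in special_projects, with no matching rows in special_project_collections, gives the historical record of what was requested while still allowing the by-hand summary table to be replaced with a query.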
Other bigger picture considerations

There are unintended consequences of bringing this data into Oracle that we will need to consider. While considering the points below, we should be careful to think in terms of the short-, medium-, and long-term changes we need to make and goals to achieve. Not everything has to be done at once, and we can make incremental progress on these ideas. Other issues/efforts that will need to be addressed include:
Other

Tagging @zoyafuso-NOAA and https://github.com/afsc-gap-products/gap_products/issues/12 for awareness.

TLDR

Short-term:

Mid-term:
samplesize table discussed in https://github.com/afsc-gap-products/gap_products/issues/12.

Long-term: