Add special projects and special projects collections table to oracle database

EmilyMarkowitz-NOAA commented 11 months ago

Data product requested: A few new tables in Oracle to store our special collections data, possibly living in RACE_DATA or RACEBASE. Open to new names for these tables, but I'll call them special_projects and special_project_collections for the time being. I've put examples for these new tables in the FUTURE_ORACLE google spreadsheet for us to play with.

special_projects table: A table listing all approved and declined special projects (e.g., a cross between the special project google form output and this more cleaned up table I use in the Bering Sea data report). We should have a historical reference of all of the special projects we do or that were requested of us and, more immediately, I need a table like this for the data report and other automated reporting.
special_project_collections (and a special_project_collections_var reference table): A table listing digitized versions of everything we collect for special projects on the survey. This table will be a bit trickier to set up and I have some ideas, but it should be connected to the special_projects table through a, let's call it, special_projects_id key column and a collection_type key column. This will let us quantify how many, say, genetics/fish condition/otolith/stomach/etc. samples we collect. I use a by-hand summarized version of this table for the data report. Instead of creating a new table, it would be great if this table could be integrated with the SPECIMEN table, which I think is also due for some reorganizing.
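To make the proposed relationship concrete, here is a minimal sketch of how the two tables could link up and how sample counts could be pulled from them. It uses Python's built-in sqlite3 as a stand-in for Oracle, so types and syntax would differ in RACE_DATA or RACEBASE. Only special_projects_id and collection_type come from the proposal above; every other column name, the demo rows, and the function names are hypothetical, and the special_project_collections_var reference table is left out.

```python
import sqlite3

# SQLite is a stand-in for Oracle. Only special_projects_id and
# collection_type come from the proposal; all other names are hypothetical.
SCHEMA = """
CREATE TABLE special_projects (
    special_projects_id INTEGER PRIMARY KEY,
    title               TEXT NOT NULL,
    survey_year         INTEGER NOT NULL,
    status              TEXT NOT NULL CHECK (status IN ('approved', 'declined'))
);
CREATE TABLE special_project_collections (
    collection_id       INTEGER PRIMARY KEY,
    special_projects_id INTEGER NOT NULL
        REFERENCES special_projects (special_projects_id),
    collection_type     TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the key relationship
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO special_projects VALUES (?, ?, ?, ?)",
        [
            (1, "Demo project A", 2023, "approved"),
            (2, "Demo project B", 2023, "declined"),
        ],
    )
    conn.executemany(
        "INSERT INTO special_project_collections "
        "(special_projects_id, collection_type) VALUES (?, ?)",
        [(1, "genetics"), (1, "genetics"), (1, "otolith")],
    )
    conn.commit()
    return conn


def samples_by_type(conn):
    """Count collected samples per collection_type for approved projects."""
    rows = conn.execute(
        """
        SELECT c.collection_type, COUNT(*)
        FROM special_project_collections c
        JOIN special_projects p USING (special_projects_id)
        WHERE p.status = 'approved'
        GROUP BY c.collection_type
        ORDER BY c.collection_type
        """
    )
    return dict(rows.fetchall())
```

Calling samples_by_type(build_demo_db()) gives one count per collection type, which is roughly the summary I currently build by hand for the data report.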

I know this will require further discussion and coordination, but I wanted to formalize this idea with a GitHub issue for us. It is hugely problematic that we do not currently document what special projects we do/don't do or save the data collected for those special projects.

Other bigger picture considerations: There are unintended consequences of bringing this data into Oracle that we will need to consider. I think while considering the below points, we should be careful to think of short, medium, and long term changes we need to make and goals to achieve. Not everything has to be done at once and we can make incremental progress on these ideas. Other issues/efforts that will need to be addressed include:

The need to improve the specimen tablet collection interface/backend to be able to collect these data digitally. Currently, the on-deck specimen tablets are not able to collect these data.
We do not currently digitize data sheets for special projects when we get back from sea. As deck lead, I always take photos of the data sheets and save them with the data in the G drive, but they are far from 'digitized'.
This project would require a dedicated team of folks to search through old files and data reports to find what special projects we've done, and to reach out to PIs for more information/data of what was collected. We'll have to decide to what extent we search for historical data.

Other: Tagging @zoyafuso-NOAA and https://github.com/afsc-gap-products/gap_products/issues/12 for awareness.

TLDR

Short-term:

Discuss data table structure. Set up data collection efforts for this year's surveys, so we can test the best way to organize these tables and start making them in Oracle.
Develop ideal data integration and management plan.
Prep bag and tag data collection sheets and hand enter data collected for special projects.

Mid-term:

Read through old data reports, search through files, etc. to add data from the past.
Formalize data integration and management plan.
Integrate special project data collection metrics in samplesize table discussed in https://github.com/afsc-gap-products/gap_products/issues/12.

Long-term:

Build in a way to collect these data with tablets.

Ned-Laman-NOAA commented 11 months ago

I have also initiated a future database issue/concern with Autumn and OFIS so this is sitting in their parking lot when they have time to address it. It's our (GAP's) contention that database structure and infrastructure changes are solidly in OFIS' court to create and maintain. Remains to be seen if we win that argument in a timely fashion though!

Ned-Laman-NOAA commented 5 months ago

A few things to update here:

1 - GAP's databases (e.g., RACEBase and RACE_DATA) will now be maintained by GAP in the form of Chris Anderson. We will be able to tap into DB expertise from OFIS to advise us, but developing databases and structures within our schemata will be our thing going forward.
2 - Alex D as the Special Collections Coordinator will need to weigh in on this process. She is developing/maintaining the RFP process and Google Forms, so that makes sense for the front end and recent data. She and I are in the process of starting to map out how to recover historic data.
3 - I agree with Em that this is important and needs to be addressed in the short term to support our data reports and to meet our obligations for storing and making some form of these data searchable and accessible.
4 - I would place digitizing the data from Special Collections in the Long Term goal category and maybe even Long Long Term. Actually, I may need to be convinced that we need to store the special collection data at all. It would be simpler for us to store just the particulars of a project and a count of how many samples were collected rather than the data themselves.
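For comparison, the simpler counts-only option in point 4 could look something like the sketch below: one row per project and collection type holding a tally instead of one row per sample. Again this is Python's sqlite3 standing in for Oracle, and the table name special_project_sample_counts, the n_samples column, and the helper functions are all hypothetical names for illustration.

```python
import sqlite3

# Counts-only alternative: store a tally per project and collection type
# rather than the sample-level data. All names here are hypothetical.
DDL = """
CREATE TABLE special_project_sample_counts (
    special_projects_id INTEGER NOT NULL,
    collection_type     TEXT NOT NULL,
    n_samples           INTEGER NOT NULL CHECK (n_samples >= 0),
    PRIMARY KEY (special_projects_id, collection_type)
);
"""


def make_counts_db():
    """Create an empty in-memory database holding only the counts table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn


def record_count(conn, project_id, collection_type, n_samples):
    """Insert or overwrite the tally for one project and collection type."""
    conn.execute(
        "INSERT OR REPLACE INTO special_project_sample_counts VALUES (?, ?, ?)",
        (project_id, collection_type, n_samples),
    )


def total_by_type(conn):
    """Sum the tallies across projects for each collection type."""
    rows = conn.execute(
        """
        SELECT collection_type, SUM(n_samples)
        FROM special_project_sample_counts
        GROUP BY collection_type
        ORDER BY collection_type
        """
    )
    return dict(rows.fetchall())
```

The trade-off is that this keeps the schema small and is enough for the data report totals, but it cannot be tied back to individual specimens the way a sample-level table (or an integration with SPECIMEN) could.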

afsc-gap-products / gap_products

Add special projects and special projects collections table to oracle database #41