Trawl: Specimen labels: Restructure how specimen label numbers are assigned. (Fix)

ghost commented 4 years ago

It is not clear how specimen label numbers (the last four characters on the tag label, e.g., 042B) are assigned. It seems that for a project with multiple species, numbering begins at 001A for each species and is somewhat tied to tow numbers. This caused problems with duplicate numbers in 2016, as well as with the new 'shortcut' label (the upper right-hand number on the tag) that began in 2018.

My solution: instead of starting at the species level, start at the project level (the last set of 3 digits in the long specimen label code), assign numbers starting at one, and increase by one with every sample taken. For example, for rockfish stomachs on the Excalibur: bocaccio stomach from tow #1: 2020-008-001-002-001A; then yellowtail stomach from tow #1: 2020-008-002-002-002A; then canary stomach from tow #2: 2020-008-002-002-003A.

Yes, doing it this way gives us only 999 available specimen label numbers per project, but that is per vessel. We never collect that many stomachs, tissues, ovaries, or any other structures for which we print a specimen label on a single vessel.
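A minimal sketch of the project-level numbering proposed above, to make the idea concrete. It is not the app's code: `ProjectLabelCounter`, `next_label`, and the `project_id` parameter are hypothetical names, and the three-digit zero padding and trailing `A` are assumptions read off the example labels.

```python
from collections import defaultdict


class ProjectLabelCounter:
    """Hands out specimen numbers per project instead of per species (hypothetical helper)."""

    MAX_NUMBER = 999  # only three digits are available in the label

    def __init__(self, survey_year, vessel_id):
        self.survey_year = survey_year
        self.vessel_id = vessel_id
        # project_id -> last specimen number issued; unseen projects start at 0
        self._last_number = defaultdict(int)

    def next_label(self, haul_number, project_id, alpha="A"):
        number = self._last_number[project_id] + 1
        if number > self.MAX_NUMBER:
            raise ValueError(f"project {project_id:03d} has used all {self.MAX_NUMBER} numbers")
        self._last_number[project_id] = number
        return (
            f"{self.survey_year}-{self.vessel_id:03d}-{haul_number:03d}-"
            f"{project_id:03d}-{number:03d}{alpha}"
        )


counter = ProjectLabelCounter(survey_year=2020, vessel_id=8)
# The species is deliberately not part of the counter key; only the project is.
labels = [counter.next_label(haul_number=h, project_id=2) for h in (1, 2, 2)]
print(labels)
```

Because the counter is keyed on the project alone, two species sampled for the same project can never receive the same specimen number on one vessel.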

jimfellows-NOAA commented 4 years ago

Poking around, I think these are the steps to at least get to generating specimen IDs:

1. Launch the Field Collector - Trawl Survey - Backdeck app (main_trawl_backdeck.py)
2. Click "Select Haul", select a haul (or create a test haul), and go back to the main page
3. Click "Process Catch" and create a species sample record
4. Click the "Fish Sampling" button after creating the species record
5. Create an entry for the sample, e.g. select sex and enter length for the sex & length protocol
6. Click the "Special Action" tab in the upper right corner
7. Click "Assign Tag ID"

Tag ID doesn't look exactly identical to what's listed in description, so maybe I'm in the wrong place:

The "Assign Tag ID" button looks like it's calling the method get_tag_id in py.trawl.SpecialActions, so that may be where this fix needs to take place.

jimfellows-NOAA commented 4 years ago

current tag id composition:

SURVEY_YEAR - VESSEL_ID - HAUL_NUMBER - SPECIMEN_TYPE_ID - SPECIMEN_NUMBER (ALPHA CHAR FOR PRINT LABEL)

SURVEY_YEAR: pulled from the TrawlBackdeckDB_model.Settings object where param = 'Survey Year'
VESSEL_ID: same as survey year but where param = 'Vessel ID'
HAUL_NUMBER: StateMachine.haul.haul_number (originates from SQL query)
SPECIMEN_TYPE_ID: zero-padded PI_ACTION_CODE_ID value from PI_ACTION_CODES_LU (e.g. Whole Specimen ID = 5 --> 005)
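To illustrate the composition above, here is a rough sketch that splits a tag ID into its parts and puts it back together. `TagId` and `TAG_PATTERN` are hypothetical names, and the fixed field widths (4-3-3-3-3 digits plus one uppercase letter) are an assumption based on the example labels in this issue, not something confirmed from the app.

```python
import re
from dataclasses import dataclass

# Assumed layout: YYYY-VVV-HHH-TTT-NNN followed by one uppercase letter.
TAG_PATTERN = re.compile(r"^(\d{4})-(\d{3})-(\d{3})-(\d{3})-(\d{3})([A-Z])$")


@dataclass(frozen=True)
class TagId:
    survey_year: int
    vessel_id: int
    haul_number: int
    specimen_type_id: int
    specimen_number: int
    alpha: str  # alpha char for the print label

    @classmethod
    def parse(cls, text):
        match = TAG_PATTERN.match(text)
        if match is None:
            raise ValueError(f"not a recognised tag ID: {text!r}")
        *numbers, alpha = match.groups()
        return cls(*map(int, numbers), alpha)

    def __str__(self):
        return (
            f"{self.survey_year}-{self.vessel_id:03d}-{self.haul_number:03d}-"
            f"{self.specimen_type_id:03d}-{self.specimen_number:03d}{self.alpha}"
        )


tag = TagId.parse("2020-008-001-002-001A")
print(tag.haul_number, tag.specimen_type_id, str(tag))
```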

The app should find any identical tag IDs already entered in the SQLite database and either throw a duplicate tag ID warning or increment the new tag ID. @peterfrey-NOAA, are duplicate tag IDs an issue when multiple SQLite database files are involved, and therefore not caught until shoreside, or are they happening within a single app?
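A sketch of what that duplicate check could look like, using only the standard-library `sqlite3` module. The `specimen` table, its `tag_id` column, and both function names are hypothetical stand-ins, not the app's actual schema or model layer.

```python
import sqlite3


def tag_exists(conn, tag_id):
    """Return True if this tag ID is already stored in the (hypothetical) specimen table."""
    row = conn.execute(
        "SELECT 1 FROM specimen WHERE tag_id = ? LIMIT 1", (tag_id,)
    ).fetchone()
    return row is not None


def next_unused_tag(conn, prefix, alpha="A", start=1):
    """Increment the specimen number until a tag ID not yet in the database is found."""
    for number in range(start, 1000):
        candidate = f"{prefix}-{number:03d}{alpha}"
        if not tag_exists(conn, candidate):
            return candidate
    raise ValueError(f"no unused specimen numbers left for {prefix}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE specimen (tag_id TEXT)")
conn.executemany(
    "INSERT INTO specimen VALUES (?)",
    [("2020-008-001-002-001A",), ("2020-008-001-002-002A",)],
)
candidate = next_unused_tag(conn, "2020-008-001-002")
print(candidate)
```

A check like this only sees tag IDs in the database file it is connected to, so it cannot catch duplicates created on a different laptop.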

Wondering if another solution would be to add month and day to the tagID (e.g. first section would be 20200811 instead of just 2020) with the assumption that all tagIDs for that day will have to exist within the same DB file and will therefore be caught if duplicated. Would be a very simple, non-invasive fix in terms of code changes.
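For illustration, the date-based variant might be as small as this. `dated_tag_id` is a hypothetical helper, and the zero-padded widths are assumed from the example labels; only the first segment differs from the current composition.

```python
from datetime import date


def dated_tag_id(sample_date, vessel_id, haul_number, specimen_type_id,
                 specimen_number, alpha="A"):
    """Build a tag ID whose first segment is YYYYMMDD instead of just the survey year."""
    return (
        f"{sample_date:%Y%m%d}-{vessel_id:03d}-{haul_number:03d}-"
        f"{specimen_type_id:03d}-{specimen_number:03d}{alpha}"
    )


example = dated_tag_id(date(2020, 8, 11), vessel_id=8, haul_number=1,
                       specimen_type_id=2, specimen_number=1)
print(example)
```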

peterfrey-NOAA commented 4 years ago

We usually catch these duplicates shoreside, but I think they occur more frequently than just when multiple SQLite databases are in play (like if we have to switch to a different backdeck laptop), so I believe there is an actual glitch in the app. Let me do some digging and get more information about this from Melissa and Aaron, who come across this problem when working up stomachs and ovaries. I will ask about your notion of incorporating the date in the tag ID to see if that would work.

jimfellows-NOAA commented 4 years ago

Sounds good. The laptop switching makes sense. Seems like the app itself is working OK (there's no way it can know if a tag ID exists on another computer, regardless of its structure), but maybe as another option we could keep the tag ID with the same composition and store the name of the laptop and/or the time of entry alongside the specimen record, since that's the data being lost when laptops are switched.
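Roughly what storing that extra context could look like, again with plain `sqlite3`. The `specimen` table and the `entered_by_host` and `entered_at` column names are assumptions for the sketch, not the app's real schema.

```python
import socket
import sqlite3
from datetime import datetime, timezone


def insert_specimen(conn, tag_id):
    """Store the tag ID together with the entering laptop's hostname and a UTC timestamp."""
    conn.execute(
        "INSERT INTO specimen (tag_id, entered_by_host, entered_at) VALUES (?, ?, ?)",
        (tag_id, socket.gethostname(), datetime.now(timezone.utc).isoformat()),
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE specimen (tag_id TEXT, entered_by_host TEXT, entered_at TEXT)"
)
insert_specimen(conn, "2020-008-001-002-001A")
row = conn.execute(
    "SELECT tag_id, entered_by_host, entered_at FROM specimen"
).fetchone()
print(row)
```

With the hostname stored, two records sharing a tag ID can still be told apart shoreside after databases from different laptops are combined.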

Hearing what issues it causes Melissa and Aaron shoreside would definitely be good, and maybe they'd have a good idea of what downstream changes, if any, need to be made if the tag ID changes structure.

jimfellows-NOAA commented 4 years ago

SPECIMEN_TAG_VW, a view to identify shortened tag dupes, has been created and stored on branch fix-trawl-specimen-labels. Changes that allow the database to store the computer that entered the specimen in the specimens table were also made to specialActions.py on the same branch for now. Calling this a wrap, leaving open until branch merge.

nwfsc-fram / pyFieldSoftware

Trawl: Specimen labels: Restructure how specimen label numbers are assigned. (Fix) #119