aodn / nrmn-application

A web application for collation, validation, and storage of all data obtained during surveys conducted by the NRMN
GNU General Public License v3.0
4 stars 3 forks source link

ingested sheet not reflecting what has been ingested correctly #1356

Open bpasquer opened 6 months ago

bpasquer commented 6 months ago

Further comments regarding JOB ID 147 that need further investigations: Lizzi: Some sized method 2 and 0 invertebrate data was also not ingested where total does equal columns T-AV (non-sized inverts and sized columns) – eg. lobster highlighted orange in attached. Note that the job was staged under the RLS program without extended invert sizing, so we cannot understand why lobsters were affected in particular. Bene: Attached(RLS-CANADA-JOB147_ingested) is the extract of the observations ingested in the database. The extract contains the M0 and M2 lobster records.

Lizzi: I’ve looked at this again and can confirm that the observations are present in the database (looking at the UI corrections tool for survey 923401379). However the ‘view ingested sheet’ feature does not stage or export these records, which is strange – is this supposed to reflect the data that got ingested or the last save of data in the staging platform (regardless it seems broken)? Perhaps this feature needs to be reviewed in reflection of this and the next points below. I noticed that the debris-zero records for this survey were actually not ingested for this survey (looking at the UI corrections tool for survey 923401379) even though they appear in the ‘ingested sheet’. I double checked in the debris extract and they are not present. The reason they were not ingested may be as for the point above (missing 0 in the inverts column) as you suggest in the last point, so perhaps it an issue of the ‘view ingested sheet’ feature being very inconsistent and not reflecting what was actually ingested – sometimes including things not ingested and sometimes excluding things that were!?

Originally posted by @bpasquer in https://github.com/aodn/nrmn-application/issues/1354#issuecomment-2087985736

utas-raymondng commented 6 months ago

export_2022 RLS MPA TEMPLATE WITH INVERT SIZE AND ALGAE_SA_Amended JB (1)(1).xlsx RLS CANADA_2021_missingdata_ingested(1).xlsx RLS CANADA_2021_missing_data_staged_Highlighted(1).xlsx

utas-raymondng commented 6 months ago

I download the production dump and I am sure that the ingestion should not ingest the lobsters into the survey because the staged_row do not contains those entries.

SELECT * FROM nrmn.staged_row where staged_job_id = 147 ORDER BY id ASC

``id block buddy code common_name created date depth direction diver inverts is_invert_sizing last_updated latitude longitude measure_value method pqs site_name site_no species time total vis position staged_job_id
139169 2 Andrew Perry Mvi Capelin 2023-01-17 17:11:16.201+11 19/07/2021 5 NE Jasmin Schuster 0   2023-01-17 17:11:16.201+11 47.65213 -52.69617 {"3":"500","4":"3000","5":"3000","6":"3000","7":"500"} 1 GB Middle Cove NFLD1 Mallotus villosus 10:30 10000 7 1000 147
139170 1 Andrew Perry Mvi Capelin 2023-01-17 17:11:16.201+11 19/07/2021 5 NE Jasmin Schuster 0   2023-01-17 17:11:16.201+11 47.65213 -52.69617 {"3":"500","4":"3000","5":"3000","6":"3000","7":"500"} 1 GB Middle Cove NFLD1 Mallotus villosus 10:30 10000 7 2000 147
139171 1 Andrew Perry Pam winter flounder 2023-01-17 17:11:16.201+11 19/07/2021 5 NE Jasmin Schuster 0   2023-01-17 17:11:16.201+11 47.65213 -52.69617 {"12":"1"} 1 GB Middle Cove NFLD1 Pseudopleuronectes americanus 10:30 1 7 3000 147
139176 2 Andrew Perry pgu Rock gunnel 2023-01-17 17:11:16.201+11 19/07/2021 5 NE Jasmin Schuster 0   2023-01-17 17:11:16.201+11 47.65213 -52.69617 {"6":"1"} 2 GB Middle Cove NFLD1 Pholis gunnellus 10:30 1 7 8000 147
139181 2 Andrew Perry dez No Debris found 2023-01-17 17:11:16.201+11 19/07/2021 5 NE Jasmin Schuster 0   2023-01-17 17:11:16.201+11 47.65213 -52.69617 {} 2 GB Middle Cove NFLD1 Debris - Zero 10:30 0 7 13000 147
139184 1 Andrew Perry Usu Radiated shanny 2023-01-17 17:11:16.201+11 19/07/2021 5 NE Jasmin Schuster 0   2023-01-17 17:11:16.201+11 47.65213 -52.69617 {"2":"1"} 2 GB Middle Cove NFLD1 Ulvaria subbifurcata 10:30 1 7 16000 147
139192 1 Andrew Perry dez No Debris found 2023-01-17 17:11:16.202+11 19/07/2021 5 NE Jasmin Schuster 0   2023-01-17 17:11:16.202+11 47.65213 -52.69617 {} 2 GB Middle Cove NFLD1 Debris - Zero 10:30 0 7 24000 147
139195 1 Andrew Perry Tad Cunner 2023-01-17 17:11:16.202+11 19/07/2021 8 SW Jasmin Schuster 0   2023-01-17 17:11:16.202+11 47.62728 -52.85769 {"3":"8","4":"15","5":"12","6":"12","7":"5","8":"3"} 1 KW Portugal Cove NFLD4 Tautogolabrus adspersus 12:20 55 6 27000 147
139196 1 Andrew Perry Pam Winter flounder 2023-01-17 17:11:16.202+11 19/07/2021 8 SW Jasmin Schuster 0   2023-01-17 17:11:16.202+11 47.62728 -52.85769 {"7":"1"} 1 KW Portugal Cove NFLD4 Pseudopleuronectes americanus 12:20 1 6 28000 147
139197 2 Andrew Perry Tad Cunner 2023-01-17 17:11:16.202+11 19/07/2021 8 SW Jasmin Schuster 0   2023-01-17 17:11:16.202+11 47.62728 -52.85769 {"3":"60","4":"182","5":"60","6":"37","7":"25","8":"2","9":"1"} 1 KW Portugal Cove NFLD4 Tautogolabrus adspersus 12:20 367 6 29000 147
139200 1 Andrew Perry pgu Rock gunnel 2023-01-17 17:11:16.202+11 19/07/2021 8 SW Jasmin Schuster 0   2023-01-17 17:11:16.202+11 47.62728 -52.85769 {"6":"1"} 2 KW Portugal Cove NFLD4 Pholis gunnellus 12:20 1 6 32000 147
139201 1 Andrew Perry Pam winter flounder 2023-01-17 17:11:16.202+11 19/07/2021 8 SW Jasmin Schuster 0   2023-01-17 17:11:16.202+11 47.62728 -52.85769 {"5":"1"} 2 KW Portugal Cove NFLD4 Pseudopleuronectes americanus 12:20 1 6 33000 147
139207 2 Andrew Perry pgu Rock gunnel 2023-01-17 17:11:16.202+11 19/07/2021 8 SW Jasmin Schuster 0   2023-01-17 17:11:16.202+11 47.62728 -52.85769 {"5":"1","6":"1","7":"1"} 2 KW Portugal Cove NFLD4 Pholis gunnellus 12:20 3 6 39000 147
139212 2 Andrew Perry dez No Debris found 2023-01-17 17:11:16.202+11 19/07/2021 8 SW Jasmin Schuster 0   2023-01-17 17:11:16.202+11 47.62728 -52.85769 {} 2 KW Portugal Cove NFLD4 Debris - Zero 12:20 0 6 44000 147
139213 1 Andrew Perry dez No Debris found 2023-01-17 17:11:16.202+11 19/07/2021 8 SW Jasmin Schuster 0   2023-01-17 17:11:16.202+11 47.62728 -52.85769 {} 2 KW Portugal Cove NFLD4 Debris - Zero 12:20 0 6 45000 147
139214 1 Andrew Perry Msc short horned sculpin 2023-01-17 17:11:16.202+11 20/07/2021 9 SW Jasmin Schuster 0   2023-01-17 17:11:16.202+11 47.66497 -52.72689 {"10":"1"} 1 GB Tappers Cove NFLD2 Myoxocephalus scorpius 10:30 1 5 46000 147
139215 1 Andrew Perry Pam winter flounder 2023-01-17 17:11:16.202+11 20/07/2021 9 SW Jasmin Schuster 0   2023-01-17 17:11:16.202+11 47.66497 -52.72689 {"9":"1","11":"1"} 1 GB Tappers Cove NFLD2 Pseudopleuronectes americanus 10:30 2 5 47000 147
139217 1 Andrew Perry Gmo Atlantic cod YOY 2023-01-17 17:11:16.202+11 20/07/2021 9 SW Jasmin Schuster 0   2023-01-17 17:11:16.202+11 47.66497 -52.72689 {"1":"1","3":"5","4":"5","5":"5"} 1 GB Tappers Cove NFLD2 Gadus morhua 10:30 16 5 49000 147
139218 2 Andrew Perry Pam winter flounder 2023-01-17 17:11:16.202+11 20/07/2021 9 SW Jasmin Schuster 0   2023-01-17 17:11:16.202+11 47.66497 -52.72689 {"10":"1"} 1 GB Tappers Cove NFLD2 Pseudopleuronectes americanus 10:30 1 5 50000 147
139219 2 Andrew Perry Gmo Atlantic cod YOY 2023-01-17 17:11:16.202+11 20/07/2021 9 SW Jasmin Schuster 0   2023-01-17 17:11:16.202+11 47.66497 -52.72689 {"3":"10","4":"20","5":"10"} 1 GB Tappers Cove NFLD2 Gadus morhua 10:30 40 5 51000 147
139223 1 Andrew Perry Usu Radiated shanny 2023-01-17 17:11:16.202+11 20/07/2021 9 SW Jasmin Schuster 0   2023-01-17 17:11:16.202+11 47.66497 -52.72689 {"1":"1","3":"1","4":"2"} 2 GB Tappers Cove NFLD2 Ulvaria subbifurcata 10:30 4 5 55000 147
139225 1 Andrew Perry Pam winter flounder 2023-01-17 17:11:16.203+11 20/07/2021 9 SW Jasmin Schuster 0   2023-01-17 17:11:16.203+11 47.66497 -52.72689 {"9":"1"} 2 GB Tappers Cove NFLD2 Pseudopleuronectes americanus 10:30 1 5 57000 147
139233 2 Andrew Perry Usu Radiated shanny 2023-01-17 17:11:16.203+11 20/07/2021 9 SW Jasmin Schuster 0   2023-01-17 17:11:16.203+11 47.66497 -52.72689 {"3":"1","4":"4","5":"2","6":"1"} 2 GB Tappers Cove NFLD2 Ulvaria subbifurcata 10:30 8 5 65000 147
139236 2 Andrew Perry mae grubby sculpin 2023-01-17 17:11:16.203+11 20/07/2021 9 SW Jasmin Schuster 0   2023-01-17 17:11:16.203+11 47.66497 -52.72689 {"4":"1"} 2 GB Tappers Cove NFLD2 Myoxocephalus aenaeus 10:30 1 5 68000 147
139239 0 Andrew Perry Tad Cunner 2023-01-17 17:11:16.203+11 20/07/2021 9 SW Jasmin Schuster 0   2023-01-17 17:11:16.203+11 47.66497 -52.72689 {"8":"1"} 0 GB Tappers Cove NFLD2 Tautogolabrus adspersus 10:30 1 5 71000 147
139241 1 Andrew Perry Gmo Atlantic cod YOY 2023-01-17 17:11:16.203+11 20/07/2021 6 SW Jasmin Schuster 0   2023-01-17 17:11:16.203+11 47.66497 -52.72689 {"3":"11","4":"12","5":"20"} 1 GB Tappers Cove NFLD2 Gadus morhua 12:00 43 5 73000 147
139242 2 Jasmin Schuster Usu Radiated shanny 2023-01-17 17:11:16.203+11 20/07/2021 6 SW Andrew Perry 0   2023-01-17 17:11:16.203+11 47.66497 -52.72689 {"2":"1"} 1 GB Tappers Cove NFLD2 Ulvaria subbifurcata 12:00 1 5 74000 147
139243 2 Jasmin Schuster pgu Rock gunnel 2023-01-17 17:11:16.203+11 20/07/2021 6 SW Andrew Perry 0   2023-01-17 17:11:16.203+11 47.66497 -52.72689 {"3":"1"} 1 GB Tappers Cove NFLD2 Pholis gunnellus 12:00 1 5 75000 147
139244 2 Jasmin Schuster Gmo Atlantic cod YOY 2023-01-17 17:11:16.203+11 20/07/2021 6 SW Andrew Perry 0   2023-01-17 17:11:16.203+11 47.66497 -52.72689 {"2":"1","4":"3","6":"4"} 1 GB Tappers Cove NFLD2 Gadus morhua 12:00 8 5 76000 147
139250 1 Andrew Perry Usu Radiated shanny 2023-01-17 17:11:16.203+11 20/07/2021 6 SW Jasmin Schuster 0   2023-01-17 17:11:16.203+11 47.66497 -52.72689 {"2":"2","3":"3","4":"1","5":"1"} 2 GB Tappers Cove NFLD2 Ulvaria subbifurcata 12:00 7 5 82000 147
139254 1 Andrew Perry pgu Rock gunnel 2023-01-17 17:11:16.203+11 20/07/2021 6 SW Jasmin Schuster 0   2023-01-17 17:11:16.203+11 47.66497 -52.72689 {"2":"1","4":"1","5":"1"} 2 GB Tappers Cove NFLD2 Pholis gunnellus 12:00 3 5 86000 147
139260 2 Jasmin Schuster Usu Radiated shanny 2023-01-17 17:11:16.203+11 20/07/2021 6 SW Andrew Perry 0   2023-01-17 17:11:16.203+11 47.66497 -52.72689 {"2":"1","3":"4","4":"3","6":"1"} 2 GB Tappers Cove NFLD2 Ulvaria subbifurcata 12:00 9 5 92000 147
139262 2 Jasmin Schuster mae grubby sculpin 2023-01-17 17:11:16.203+11 20/07/2021 6 SW Andrew Perry 0   2023-01-17 17:11:16.203+11 47.66497 -52.72689 {"6":"1"} 2 GB Tappers Cove NFLD2 Myoxocephalus aenaeus 12:00 1 5 94000 147
139263 2 Jasmin Schuster Gmo Atlantic cod YOY 2023-01-17 17:11:16.204+11 20/07/2021 6 SW Andrew Perry 0   2023-01-17 17:11:16.204+11 47.66497 -52.72689 {"6":"1"} 2 GB Tappers Cove NFLD2 Gadus morhua 12:00 1 5 95000 147
139266 1 Jasmin Schuster Gmo Atlantic cod YOY 2023-01-17 17:11:16.204+11 21/07/2021 8 NE Andrew Perry 0   2023-01-17 17:11:16.204+11 47.72255 -52.83521 {"4":"2","6":"1"} 1 GB Bauline NFLD13 Gadus morhua 11:00 3 3 98000 147
139267 1 Jasmin Schuster Tad Cunner 2023-01-17 17:11:16.204+11 21/07/2021 8 NE Andrew Perry 0   2023-01-17 17:11:16.204+11 47.72255 -52.83521 {"6":"1","7":"5","8":"2","9":"1"} 1 GB Bauline NFLD13 Tautogolabrus adspersus 11:00 9 3 99000 147
139268 2 Andrew Perry Tad Cunner 2023-01-17 17:11:16.204+11 21/07/2021 8 NE Jasmin Schuster 0   2023-01-17 17:11:16.204+11 47.72255 -52.83521 {"2":"1","3":"3","4":"4","5":"4","6":"4","7":"2","8":"3","9":"1","10":"1"} 1 GB Bauline NFLD13 Tautogolabrus adspersus 11:00 23 3 100000 147
139269 2 Andrew Perry Gmo Atlantic cod YOY 2023-01-17 17:11:16.204+11 21/07/2021 8 NE Jasmin Schuster 0   2023-01-17 17:11:16.204+11 47.72255 -52.83521 {"3":"1","4":"1"} 1 GB Bauline NFLD13 Gadus morhua 11:00 2 3 101000 147
139275 1 Andrew Perry pgu Rock gunnel 2023-01-17 17:11:16.204+11 21/07/2021 8 NE Jasmin Schuster 0   2023-01-17 17:11:16.204+11 47.72255 -52.83521 {"5":"2"} 2 GB Bauline NFLD13 Pholis gunnellus 11:00 2 3 107000 147
139277 1 Andrew Perry Usu Radiated shanny 2023-01-17 17:11:16.204+11 21/07/2021 8 NE Jasmin Schuster 0   2023-01-17 17:11:16.204+11 47.72255 -52.83521 {"6":"1"} 2 GB Bauline NFLD13 Ulvaria subbifurcata 11:00 1 3 109000 147
139280 1 Andrew Perry dez No Debris found 2023-01-17 17:11:16.204+11 21/07/2021 8 NE Jasmin Schuster 0   2023-01-17 17:11:16.204+11 47.72255 -52.83521 {} 2 GB Bauline NFLD13 Debris - Zero 11:00 0 3 112000 147
139286 2 Andrew Perry Usu Radiated shanny 2023-01-17 17:11:16.204+11 21/07/2021 8 NE Jasmin Schuster 0   2023-01-17 17:11:16.204+11 47.72255 -52.83521 {"3":"1","4":"1","5":"1","6":"1"} 2 GB Bauline NFLD13 Ulvaria subbifurcata 11:00 4 3 118000 147
139287 2 Andrew Perry pgu Rock gunnel 2023-01-17 17:11:16.204+11 21/07/2021 8 NE Jasmin Schuster 0   2023-01-17 17:11:16.204+11 47.72255 -52.83521 {"5":"1","6":"1","7":"1"} 2 GB Bauline NFLD13 Pholis gunnellus 11:00 3 3 119000 147
139289 2 Andrew Perry dez No Debris found 2023-01-17 17:11:16.204+11 21/07/2021 8 NE Jasmin Schuster 0   2023-01-17 17:11:16.204+11 47.72255 -52.83521 {} 2 GB Bauline NFLD13 Debris - Zero 11:00 0 3 121000 147
139291 0 Andrew Perry Tad Cunner 2023-01-17 17:11:16.204+11 21/07/2021 8 NE Jasmin Schuster 0   2023-01-17 17:11:16.204+11 47.72255 -52.83521 {"6":"5","7":"5"} 0 GB Bauline NFLD13 Tautogolabrus adspersus 11:00 10 3 123000 147
utas-raymondng commented 6 months ago

From the staged job log it clearly indicate only 52 rows inserted

SELECT * FROM nrmn.staged_job_log where staged_job_id = 147 ORDER BY id ASC

id details event_time event_type staged_job_id survey_id summary filter_set
476 RLS CANADA_2021.xlsx uploaded by:antonia.cooper@utas.edu.au 2023-01-17 17:11:15.99+11 UPLOADED 147 NULL NULL NULL
477 Saved 123 rows. Removed 0 invalid rows. 2023-01-17 17:11:16.052+11 STAGED 147 NULL NULL NULL
478 Source file saved to "nrmn-prod/raw-survey/jobid-147.xlsx" 2023-01-17 17:11:16.184+11 STAGED 147 NULL NULL NULL
486 NULL 2023-01-18 12:42:43.127+11 INGESTING 147 NULL NULL NULL
487 52 rows of data4 sites5 surveys12 distinct observable items2 divers 2023-01-18 12:43:11.379+11 INGESTED 147 NULL NULL NULL
utas-raymondng commented 6 months ago

From the above Liz mentioned the survey 923401379 having

visibility = 3.0 site code = NFLD13 depth = 8 program = RLS date = 2021-07-21 survey number = 0

and the loster mentioned are

2 923401379 JMS NFLD13 Bauline 47.72255 -52.83521 21/07/2021 3.0 NE 11:00 8.0 0 0 Homarus americanus American lobster
3 923401379 JMS NFLD13 Bauline 47.72255 -52.83521 21/07/2021 3.0 NE 11:00 8.0 2 1 Homarus americanus American lobster
4 923401379 JMS NFLD13 Bauline 47.72255 -52.83521 21/07/2021 3.0 NE 11:00 8.0 2 2 Homarus americanus American lobster

now the observable item id from this sql is 3680

SELECT * FROM nrmn.observable_item_ref where observable_item_name = 'Homarus americanus' ORDER BY observable_item_id ASC

3680    "Homarus americanus"    1   156134  1   "American lobster"      "Arthropoda"    "Malacostraca"  "Decapoda"  "Nephropidae"   "Homarus"   "americanus"                true    "{""MaxLength"": 64.0}"         4502
utas-raymondng commented 6 months ago

Now if we select the observation audit table with the observable item id of Homarus americanus mentioned above

-- Homarus amaericanus and drive JM and method, block (0,0) (2,1) (2,2) SELECT * FROM nrmn.observation where observable_item_id = 3680 and diver_id = 239 and survey_method_id in (148389,148391,148393) ORDER BY observation_id ASC

observation_id survey_method_id diver_id observable_item_id measure_id measure_value observation_attribute
2453803 148389 239 3680 7 1 NULL
2453812 148391 239 3680 3 1 NULL
2453813 148391 239 3680 4 1 NULL
2453832 148393 239 3680 3 1 NULL

-- Then we select the audit table of the observable_id SELECT * FROM nrmn.observation_aud where observation_id in (2453803, 2453812, 2453813, 2453832) ORDER BY observation_id ASC, rev ASC

observation_id rev revtype measure_value measure_value_mod observation_attribute observation_attribute_mod diver_id diver_mod measure_id measure_mod observable_item_id observable_item_mod survey_method_id survey_method_mod
2453803 2427 0 1 True NULL False 239 True 7 True 3680 True 148389 True
2453812 2427 0 1 True NULL False 239 True 3 True 3680 True 148391 True
2453813 2427 0 1 True NULL False 239 True 4 True 3680 True 148391 True
2453832 2427 0 1 True NULL False 239 True 3 True 3680 True 148393 True

They are all newly added, unfortunately there is no info about where it come from, from ingest or update via UI

bpasquer commented 6 months ago

This query which would list a record if the survey had been corrected returns a null result.


select * from staged_job_log where details like '%923401379%' 

The Homarus records must have been ingested 
utas-raymondng commented 6 months ago
select * from staged_job_log where details like '%923401379%' 

It return empty row for this query

bpasquer commented 6 months ago

Sorry, my message wasn't clear. Here is a summary of the issue and our findings using the Homarus americanus example

These elements indicate that the records were added to the database as part of ingest job 147, yet they are missing in the ingested file

bpasquer commented 2 months ago

The RLS_Canada surveys have been successfully ingested.

Aslo it's been agreed that a new audit_table for staged row will be created to record activity in the UI during ingest/correction.