aodn / nrmn-application

A web application for collation, validation, and storage of all data obtained during surveys conducted by the NRMN
GNU General Public License v3.0

Ingest not staging rows where total = zero #1315

Closed LizziOh closed 6 months ago

LizziOh commented 1 year ago

The ingest staging process is removing rows where total = 0, but I thought this was not supposed to happen. In the QA/QC documentation, the validation for Debris zero states: "If Debris - Zero recorded, then Total and 'Inverts' = '0'".

For example, job 296 currently staged 470 rows, removing 18 rows of debris zero observations, using the file:

RLS_BarkleySound_2023_Claire&Kieran.xlsx

bpasquer commented 1 year ago

It looks like the Debris - Zero rows have been truncated by the end-of-sheet rule (https://github.com/aodn/backlog/issues/3421), although they are present throughout the sheet.

LizziOh commented 1 year ago

Yes, I've noticed this occurring for normal (non-debris) observations where divers have forgotten to enter a number as well, which is more concerning than the debris-zero cases. I pick this up manually (I always cross-check that the correct number of rows stage against the original file before proceeding), but since I recently started ingesting more RLS surveys I've noticed it is frequent in that program's data.

utas-raymondng commented 11 months ago

This change removed the duplicated rows for DEBRIS - ZERO.

It works like this: duplicated rows are removed where total is zero, the count of those rows is > 3, and the row falls under "SURVEY NOT DONE", "DEBRIS - ZERO", or "NO SPECIES FOUND".
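
To make the rule described above concrete, here is a minimal Python sketch. It is one possible reading of the comment, not the application's actual code: the row layout (dicts with `species` and `total` keys), the function names, and the interpretation of "row count > 3" as "more than three identical rows" are all assumptions made for illustration.

```python
from collections import Counter

# The three labels come from the comment above; everything else is hypothetical.
PLACEHOLDER_SPECIES = {"SURVEY NOT DONE", "DEBRIS - ZERO", "NO SPECIES FOUND"}


def is_zero_placeholder(row):
    """True for a placeholder observation whose total is zero."""
    return row["total"] == 0 and row["species"].upper() in PLACEHOLDER_SPECIES


def drop_duplicate_zero_rows(rows, threshold=3):
    """Keep only the first copy of any zero-total placeholder row
    that occurs more than `threshold` times (assumed reading of the rule)."""

    def fingerprint(row):
        # Identical rows share the same sorted (field, value) pairs.
        return tuple(sorted(row.items()))

    counts = Counter(fingerprint(r) for r in rows if is_zero_placeholder(r))
    seen = set()
    kept = []
    for row in rows:
        fp = fingerprint(row)
        if is_zero_placeholder(row) and counts[fp] > threshold:
            if fp in seen:
                continue  # a later duplicate: drop it
            seen.add(fp)
        kept.append(row)
    return kept
```

Note that a rule like this acts on the whole sheet, so it would also remove genuine Debris - Zero rows that repeat in the middle of the data, which matches the behaviour reported in this issue.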

bpasquer commented 11 months ago

@utas-raymondng The rule to eliminate duplicate rows where Debris=0 (which also applies to Survey Not Done and No Species Found) should only be applied at the end of the spreadsheet.

utas-raymondng commented 10 months ago

> @utas-raymondng The rule to eliminate duplicate rows where Debris=0 (which also applies to Survey Not Done and No Species Found) should only be applied at the end of the spreadsheet.

https://github.com/aodn/backlog/issues/5205

LizziOh commented 10 months ago

Hi @utas-raymondng, that link doesn't work for me (I guess because it's in backlog).

I have checked about your blocking rule for Debris zero observations and they should be allowed to be on both block 1 and block 2.

LizziOh commented 10 months ago

For this issue, regarding the rule to truncate the end of the ingest sheet: should the truncate rule check that species_name, method, block, site, date, depth, and diver match the row above but total = zero? There should be no other rules that prevent staging of total = zero.
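
A minimal Python sketch of the end-of-sheet truncation proposed here might look like the following. It is an illustration under stated assumptions, not the application's implementation: rows are assumed to be dicts, the field names are taken from this comment, `truncate_trailing_repeats` is a hypothetical name, and "matches row above" is read as equality on all seven key fields.

```python
# Field names as listed in the comment above; the dict row layout is assumed.
KEY_FIELDS = ("species_name", "method", "block", "site", "date", "depth", "diver")


def truncate_trailing_repeats(rows):
    """Drop rows from the END of the sheet only, while the last row has
    total == 0 and matches the row above it on every key field."""
    end = len(rows)
    while end > 1:
        last, above = rows[end - 1], rows[end - 2]
        same_keys = all(last[f] == above[f] for f in KEY_FIELDS)
        if last["total"] == 0 and same_keys:
            end -= 1  # trailing repeat: trim it and re-check
        else:
            break  # first non-matching row ends the truncation
    return rows[:end]
```

Because the loop stops at the first row that does not match, zero-total rows in the middle of the sheet are never touched, and the first row of a trailing run is kept.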

bpasquer commented 10 months ago

> I have checked about your blocking rule for Debris zero observations and they should be allowed to be on both block 1 and block 2.

Actually @LizziOh , look at this issue: https://github.com/aodn/nrmn-application/issues/1156.

LizziOh commented 10 months ago

> Actually @LizziOh , look at this issue: #1156.

Oh gosh, that was a mistake! I've checked the M12 endpoint: there are 2221 surveys with block 1 for debris and 2133 with block 2, so they are about equal and it makes sense that you can survey debris on both blocks.

bpasquer commented 10 months ago

> Oh gosh, that was a mistake! I've checked the M12 endpoint: there are 2221 surveys with block 1 for debris and 2133 with block 2, so they are about equal and it makes sense that you can survey debris on both blocks.

Yes, I don't know why we decided that, but we must have checked it with Toni. I checked the DB, and there is a mix of records on B1 and B2 up until we implemented that change. I'm looking at meeting minutes to find more details...

bpasquer commented 6 months ago

The issue with staging rows with total = 0 is resolved. The issue with the rule for Debris will be treated separately in https://github.com/aodn/nrmn-application/issues/1346.

Closing