HakaiInstitute / hakai-ctd-qc

Series of tests applied to the Hakai CTD profile data based on the QARTOD tests and other Hakai Specific ones.

Review distance_from_station test thresholds #29

Open JessyBarrette opened 1 year ago

JessyBarrette commented 1 year ago

distance_from_station is used to test whether a drop is within an acceptable range of its reference station. The test is defined within the database views: https://github.com/HakaiInstitute/hakai-database/blob/main/SQL/ctd/ctd-views.sql
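As a rough illustration of what such a range test does, here is a minimal Python sketch. It assumes the 1000 m and 4000 m limits that appear in the query in this issue and the usual QARTOD flag values (1 = pass, 3 = suspect, 4 = fail). The function name and its parameters are hypothetical; the actual test lives in the SQL views linked above and may differ in detail.

```python
# QARTOD-style flag values (assumed): 1 = pass, 3 = suspect, 4 = fail.
GOOD, SUSPECT, FAIL = 1, 3, 4


def flag_distance_from_station(distance_m, suspect_m=1000.0, fail_m=4000.0):
    """Return a flag for a drop located distance_m metres from its station.

    Boundaries mirror the query in this issue:
    suspect is > suspect_m and < fail_m, fail is >= fail_m.
    """
    if distance_m >= fail_m:
        return FAIL
    if distance_m > suspect_m:
        return SUSPECT
    return GOOD


# Hypothetical distances, only to show the three outcomes.
for d in (250.0, 1000.0, 1500.0, 4000.0):
    print(d, flag_distance_from_station(d))
```

Keeping both limits as parameters makes it easy to try tighter thresholds while reviewing them.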

The thresholds currently used are potentially too large.

Let's review those thresholds. The following PostgreSQL query retrieves the number of profiles flagged as suspect or fail with the current range thresholds:

SELECT
    *
FROM
    (
    SELECT
        ORGANIZATION,
        STATION,
        SUM(
        CASE
            WHEN DISTANCE_FROM_STATION >1000 AND DISTANCE_FROM_STATION <4000 THEN 1
            ELSE 0
        END
    ) AS SUSPECT_RANGE,
        SUM(
        CASE
            WHEN DISTANCE_FROM_STATION >= 4000 THEN 1
            ELSE 0
        END
    ) AS FAIL_RANGE,
        COUNT(*) AS TOTAL_PROFILES -- counts every row, as COUNT('HAKAI_ID') on a string literal did
    FROM
        CTD.CTD_FILE_CAST X
    GROUP BY
        ORGANIZATION,
        STATION
) AS X
WHERE
    TOTAL_PROFILES > 2
    AND (SUSPECT_RANGE>0
        OR FAIL_RANGE>0)

With the following results:

organization station suspect_range fail_range total_profiles
HAKAI BU1 0 4 63
HAKAI BU2 0 6 62
HAKAI BU3 0 4 57
HAKAI BU4 0 4 66
HAKAI BU5 0 4 61
HAKAI BU6 0 4 60
HAKAI BU7 0 5 62
HAKAI BU8 0 3 63
HAKAI BUR2 3 0 10
HAKAI BUR3 1 3 17
HAKAI BUR4 0 4 10
HAKAI BUR5 0 4 11
HAKAI BUR6 0 4 10
HAKAI BUR7 0 4 10
HAKAI BUR8 0 3 17
HAKAI D10 4 0 41
HAKAI D27 7 0 45
HAKAI D35 1 0 11
HAKAI DAWSONS 0 1 112
HAKAI DFO1 0 2 99
HAKAI DFO2 0 2 175
HAKAI DFO3 0 2 76
HAKAI DFO4 0 1 52
HAKAI DFO5 0 2 62
HAKAI DI13 1 0 30
HAKAI DI18 2 0 13
HAKAI FZH01 6 0 201
HAKAI FZH04 0 1 20
HAKAI FZH08 1 1 107
HAKAI FZH13 0 1 71
HAKAI FZH14 2 2 56
HAKAI HKP01 0 2 136
HAKAI HKP03 0 1 92
HAKAI HKP04 0 1 106
HAKAI HKP05 2 1 50
HAKAI HKP06 1 2 56
HAKAI J03 1 0 22
HAKAI JS12 1 2 81
HAKAI KC10 0 1 234
HAKAI KC11 0 1 138
HAKAI KC12 0 2 121
HAKAI KC13 2 0 121
HAKAI KC14 0 1 114
HAKAI KC15 0 1 135
HAKAI KC16 0 2 114
HAKAI KC17 0 2 144
HAKAI KC4 3 0 137
HAKAI KC7 0 2 145
HAKAI KC8 0 1 137
HAKAI KELP10 0 1 27
HAKAI KFPS08 1 0 50
HAKAI KWY01 0 3 131
HAKAI KWY02 0 2 104
HAKAI KWY03 8 0 91
HAKAI MACRO1 0 1 49
HAKAI MACRO11 0 2 10
HAKAI MACRO6 0 1 21
HAKAI MACRO8 0 1 8
HAKAI MEA01 3 0 104
HAKAI MEA02 3 0 112
HAKAI MEA03 0 2 110
HAKAI MEA04 0 2 125
HAKAI PRUTH 10 1 1531
HAKAI QCS01 1 1 166
HAKAI QCS07 0 1 45
HAKAI QCS08 4 0 26
HAKAI QSD03 0 1 12
HAKAI QU16 0 1 180
HAKAI QU19 0 1 18
HAKAI QU20 0 1 185
HAKAI QU3 0 2 190
HAKAI QU33 0 2 93
HAKAI QU36 0 3 83
HAKAI QU37 0 1 41
HAKAI QU38 1 2 306
HAKAI QU39 1 8 601
HAKAI QU43 0 5 163
HAKAI QU5 2 1 434
HAKAI QU9 2 0 118
HAKAI RVRS01 0 2 127
HAKAI RVRS02 1 2 59
HAKAI SEA15 1 0 15
HAKAI SEA6 1 0 6
HAKAI SHORE21 0 1 6
HAKAI TO1 0 4 27
HAKAI TO2A 15 4 24
HAKAI TO3A 0 3 22
HAKAI TO4A 0 1 22
HAKAI TO5 0 2 21
HAKAI UBC7 0 2 111
HAKAI UBC8 0 2 79
NATURE TRUST BELC01 2 2 7
NATURE TRUST BELC02 5 0 8
NATURE TRUST BELC03 1 0 7
NATURE TRUST CLUC06 1 1 45
NATURE TRUST COWC02 1 0 10
NATURE TRUST COWC03 2 0 19
NATURE TRUST COWC09 1 0 27
NATURE TRUST COWC14 1 0 31
NATURE TRUST ENGC04 0 1 40
NATURE TRUST ENGC06 0 1 42
NATURE TRUST ENGC10 0 2 39
NATURE TRUST FULC01 0 2 44
NATURE TRUST FULC03 2 0 33
NATURE TRUST FULC05 1 0 32
NATURE TRUST FULC08 1 0 23
NATURE TRUST FULC10 1 0 19
NATURE TRUST FULC12 1 2 29
NATURE TRUST FULC13 0 1 29
NATURE TRUST GLEC01 1 0 26
NATURE TRUST GLEC06 3 0 11
NATURE TRUST GLEC07 1 0 22
NATURE TRUST KAOC01 1 0 19
NATURE TRUST KAOC14 0 1 24
NATURE TRUST KOEC01 3 0 13
NATURE TRUST KUMC01 6 0 8
NATURE TRUST KUMC02 2 0 25
NATURE TRUST MOYC04 0 2 34
NATURE TRUST MOYC05 0 1 30
NATURE TRUST MOYC06 0 1 33
NATURE TRUST MOYC08 0 1 32
NATURE TRUST MOYC09 0 1 33
NATURE TRUST MOYC10 0 1 35
NATURE TRUST MOYC11 2 1 35
NATURE TRUST NADC07 0 1 6
NATURE TRUST NADC08 1 0 7
NATURE TRUST NANC01 2 0 11
NATURE TRUST NANC02 5 0 35
NATURE TRUST NANC03 1 0 39
NATURE TRUST NANC04 1 0 41
NATURE TRUST NANC05 2 0 15
NATURE TRUST NANC06 5 0 36
NATURE TRUST NANC08 1 0 38
NATURE TRUST NANC09 3 1 45
NATURE TRUST NANC13 2 1 44
NATURE TRUST NANC14 2 0 43
NATURE TRUST NANC15 1 0 32
NATURE TRUST NANC16 1 0 38
NATURE TRUST NANC17 3 0 38
NATURE TRUST NANC18 5 0 36
NATURE TRUST NANC19 2 1 47
NATURE TRUST QUAC09 1 0 38
NATURE TRUST SALC04 0 1 31
NATURE TRUST SALC05 0 2 33
NATURE TRUST SALC06 28 0 36
NATURE TRUST SALC07 1 0 29
NATURE TRUST SALC09 0 2 29
NATURE TRUST SALC10 0 2 30
NATURE TRUST SALC11 0 2 31
NATURE TRUST SALC12 0 1 36
NATURE TRUST SALC13 0 1 34
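
To make the table easier to compare across stations, the counts can be turned into a flagged fraction per station. The sketch below is only an illustration: it parses three rows copied from the results above (QU39, TO2A, SALC06) and ranks them. The `parse_row` helper and the `flagged_fraction` field are hypothetical names, not part of the database or the QC package.

```python
# Three rows copied from the results table above.
rows = """\
HAKAI QU39 1 8 601
HAKAI TO2A 15 4 24
NATURE TRUST SALC06 28 0 36
"""


def parse_row(line):
    """Parse one space-delimited result row into a dict."""
    # Split from the right: the last four fields never contain spaces,
    # while the organization name may ("NATURE TRUST").
    organization, station, suspect, fail, total = line.rsplit(None, 4)
    suspect, fail, total = int(suspect), int(fail), int(total)
    return {
        "organization": organization,
        "station": station,
        "suspect": suspect,
        "fail": fail,
        "total": total,
        "flagged_fraction": (suspect + fail) / total,
    }


parsed = [parse_row(line) for line in rows.splitlines() if line.strip()]
ranked = sorted(parsed, key=lambda r: r["flagged_fraction"], reverse=True)

for r in ranked:
    print(f'{r["organization"]} {r["station"]}: {r["flagged_fraction"]:.1%} flagged')
```

Stations where most profiles are flagged may point to an outdated reference position rather than to bad drops, which is a different problem from thresholds that are too loose.
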
JessyBarrette commented 1 year ago

@ah-hakai To highlight the distance-from-station issue in the CTD data, here's a notebook that presents all the QU39 data collected so far, along with the location associated with each profile: https://colab.research.google.com/drive/1rgchY8wXWj-wIA7txv1F6PhQ83pprM_L?usp=sharing

Drops at the station

image

Drops outside the 1 km circle (radius = 1 km)

image

Those thresholds were decided based on the range variability observed at QU39

fostermh commented 1 year ago

To be clear, the inside circle is 1 km and the outside circle is 4 km? It would be interesting to see the same map with the actual points (not clustered). I noticed that at least one point appeared to be at the raft when I looked at the data.

JessyBarrette commented 1 year ago

Yes 4km for the outer and 1km for the inner one.
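
For anyone who wants to check a drop against those two circles outside the notebook, here is a minimal sketch using the haversine great-circle formula. This is one common way to get a distance in metres from two latitude/longitude pairs; it is an assumption that it matches how distance_from_station is computed in the database. The coordinates below are hypothetical, not the real QU39 position.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


# Hypothetical station and drop positions (not real Hakai coordinates).
station = (50.0, -125.0)
drop = (50.01, -125.0)

distance = haversine_m(*station, *drop)
print(f"{distance:.0f} m from station")
print("outside inner 1 km circle:", distance > 1000)
print("outside outer 4 km circle:", distance >= 4000)
```
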

Looks like one drop is at the raft yes ;)

image

You can review the interactive figure in the notebook. Scroll down to the map, then zoom in and out: the clusters open up as you zoom in. https://colab.research.google.com/drive/1rgchY8wXWj-wIA7txv1F6PhQ83pprM_L?usp=sharing

fostermh commented 1 year ago

Thanks, it's interesting to see the distribution.

Screen Shot 2023-04-19 at 2 36 08 PM
Screen Shot 2023-04-19 at 2 37 30 PM

JessyBarrette commented 1 year ago

I added a new notebook that presents all the drops flagged for a bad location, i.e. located more than 1 km from the station: https://colab.research.google.com/github/HakaiInstitute/hakai-profile-qaqc/blob/development/notebooks/review_remote_drops.ipynb

image image
JessyBarrette commented 1 year ago

Most of those drops seem to come from only a few surveys.

image

We can potentially target those surveys and assess with the group how to proceed.
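
A quick way to list which surveys to target is to count flagged drops per survey. The sketch below assumes each flagged drop is available as a record with a survey name. The records, the field names (`hakai_id`, `survey`, `distance_m`) and the survey labels are all made up for illustration and do not come from the database.

```python
from collections import Counter

# Hypothetical flagged drops (distance > 1000 m); not real data.
flagged_drops = [
    {"hakai_id": "drop-001", "survey": "SURVEY_A", "distance_m": 1800.0},
    {"hakai_id": "drop-002", "survey": "SURVEY_A", "distance_m": 5200.0},
    {"hakai_id": "drop-003", "survey": "SURVEY_B", "distance_m": 1200.0},
    {"hakai_id": "drop-004", "survey": "SURVEY_A", "distance_m": 4100.0},
    {"hakai_id": "drop-005", "survey": "SURVEY_C", "distance_m": 1050.0},
]

# Count flagged drops per survey, most affected surveys first.
counts = Counter(drop["survey"] for drop in flagged_drops)

for survey, n in counts.most_common():
    print(f"{survey}: {n} flagged drop(s)")
```
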