dbca-wa / wastdr

An R Client for the WA Sea Turtles and Strandings Database WAStD
https://dbca-wa.github.io/wastdr/

Reconstruct W2 coordinates from location/place code #67

Open florianm opened 2 years ago

florianm commented 2 years ago

Problem

WAMTRAM2 (W2) data are referenced by location code (~ WAStD Locality) and place code (~ WAStD Site) as well as by latitude / longitude. The coordinates have missing values and many typos that place records outside their WAStD Site, which breaks WAStD's georeferencing.

Currently, many W2 records fall outside WAStD sites and would not show up correctly in analyses. Exactly 4581 W2 records have no coordinates but do have a location and place code. If these are not imported into WAStD (as is currently the case), they are missing from case histories and can lead to false heuristics: wrong classification into new tagging / inter-season return / intra-season remigrant etc.

Since we can't manually fix 4581 records plus many typos, we need a systematic approach that is still analytically valid.

Approach

Link WAStD Sites to WAMTRAM location and place code

View both on one map:

map_wastd_wamtram_sites(wastd_data, wamtram_data)

An interactive map shows WAStD sites/localities and WAMTRAM places.

The WAStD areas and sites have an edit link in the popup.

The link lets us annotate the WAStD site with the W2 location and place codes.

Some W2 places have no WAStD Site yet, and some WAStD sites contain multiple W2 places. Some W2 locations relate to several WAStD localities or vice versa. It's a messy situation. We don't need a perfect match; we only need every W2 place with actual data (tagged/stranded/encountered turtles) to be matched to a WAStD site and locality.

We can link one or several W2 places to the WAStD site they fall into (or are closest to) by entering their place code into the WAStD site's field W2 Place Code. We can then match WAStD sites to W2 records by place code.
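The place-code match described above could look roughly like the following. This is a minimal sketch in base R on made-up data: the data frames, the column names `site_id` and `observation_id`, and the codes `XXAA` etc. are all hypothetical, and it assumes that a site's W2 Place Code field holds one or more codes separated by whitespace.

```r
# Hypothetical data: names and codes are illustrative, not the real schema.
sites <- data.frame(
  site_id = c(1, 2),
  w2_place_code = c("XXAA XXBB", "YYAA"),  # assumed: whitespace-separated codes
  stringsAsFactors = FALSE
)
w2_enc <- data.frame(
  observation_id = 101:104,
  place_code = c("XXAA", "YYAA", "XXBB", "ZZAA"),
  stringsAsFactors = FALSE
)

# One row per (site, place code): split the annotation field on whitespace.
codes <- strsplit(trimws(sites$w2_place_code), "\\s+")
lookup <- data.frame(
  site_id = rep(sites$site_id, lengths(codes)),
  place_code = unlist(codes),
  stringsAsFactors = FALSE
)

# Left join: W2 records with an unlinked place code keep site_id = NA.
matched <- merge(w2_enc, lookup, by = "place_code", all.x = TRUE)
matched <- matched[order(matched$observation_id), ]
```

Records that come back with a missing `site_id` (here the one with place code `ZZAA`) are the ones whose place still needs to be linked to a WAStD site.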

The following W2 locations and places need to be linked to WAStD localities and sites:

# Link to existing or create new WAStD localities:
R> c(w2_data$enc$location_code, w2_data$enc_qa$location_code) %>% unique() %>% sort() %>% cat()
AI AR BR BW CD CL DA DH EG EI EM GA KS LA LO MB MI MN MU NI NK NW PB PE PH RI SB SC SE SR SW TH VA WC WK

# Link to existing or create new WAStD sites for:
R> c(w2_data$enc$place_code, w2_data$enc_qa$place_code) %>% unique() %>% sort() %>% cat()
AIBT AIEL AIWR AREI ARMI ARWI ARWM BRWS BWA7 BWBB BWBV BWCB BWDB BWEN BWFB BWIN BWJB BWJW BWMI BWMU BWOB BWSP BWTA BWTB BWTO BWTR BWTT BWWB BWYC BWYN BWYS CDTN CLBB CLCV DADI DAEI DAEL DALI DHCL DHT2 DHT3 DHT4 DHT5 DHTB DHTx DHTX EGBU EGEI EGGB EGGC EGPT EGSP EITB EM20 EM30 EMA5 EMCP EMPD GACB GAGN GASP GAWR GCAI GCBO GCMI INAI INJS INMR KSAI KSBR KSHI KSJI KSLI KSOA KSPI KSSI KSTI LAMI LAWI LOAB LOBI LOBR LOPK LOSA LOSB LOTA MBHW MBNW MBSE MBTI MINN MISS MISW MIWW MNCB MNCW MUEB MULB MUNB MUNE MUNI MUW1 MUW2 MUW3 MUW4 MUW5 MUW6 MUW7 MUW8 MUWB NIBN NIBU NIMB NINB NINC NIRO NISB NKAI NKBI NKCH NKCI NKEM NKGI NKLA NKLI NKMI NKMS NKNB NKPG NKSI NKTI NTCB NTCD NTCI NTEI NTII NTMA NTMI NTML NTNG NTNP NTPB NTRR NTSG NTVI NWBA NWBR NWBS NWCP NWFL NWFM NWGY NWHU NWJC NWJN NWJR NWKC NWLH NWMA NWMS NWSB NWTA NWTB NWTD NWTG NWTR NWTU NWWO PBAI PBBB PBBI PBCL PBLI PBNB PBON PBPT PECO PECS PEFB PEFM PEGI PEHH PELB PEML PEMP PEQN PERI PESB PESR PEWS PEYP PHCB PHPP PHSP QDTS RICB RICE RICN RICW RINB RISx RIW1 RIW2 RIW3 RIW4 RIW5 RIW6 RIW7 RIW8 SBBB SBCB SBCP SBHB SBHP SBLS SBMM SBMP SBNC SBPT SBSP SBST SCAL SCAU SCBB SCCR SCES SCFC SCMA SCPC SCWH SRSI SWBY SWCL SWCN SWMH THEE THNB VAAM VAAN VACB VAHB VAMB VANM VAPB VASM VATB WCAI WCCH WCCV WCDO WCGB WCGN WCJR WCKB WCLM WCLP WCWB WCWI WKBB WKBM WKCA WKCB WKDB WKGB WKJP WKPB WKPI WKRB WKRK WKTB WKYG

Progress - so far we have

R> sites$w2_place_code %>% sort() %>% cat()
BWBB BWJW BWOB BWSP BWTB BWTO BWWB CLBB CLCV DADI EGBU GACB GAGN GARB NIBH NIBN NIBS NICN NIJB NIRO NISP NWBA NWCP NWFM NWHU NWJC NWJN NWLH NWMS NWSB NWTA NWTB NWTD NWTR 
RIB1 RIB2 RIB3 RICE RICN RICW RINB RIW6 RIW7 THEE THNB WKYG

Left to do (above list minus progress)

AIBT AIEL AIWR AREI ARMI ARWI ARWM BRWS
BWA7 BWBV BWCB BWDB BWEN BWFB BWIN BWJB BWMI BWMU BWTA BWTR BWTT BWYC BWYN BWYS

CDTN
DAEI DAEL DALI
DHCL DHT2 DHT3 DHT4 DHT5 DHTB DHTx DHTX

EGEI EGGB EGGC EGPT EGSP EITB

EM20 EM30 EMA5 EMCP EMPD

GASP GAWR GCAI GCBO GCMI INAI INJS INMR KSAI KSBR KSHI KSJI KSLI KSOA KSPI KSSI KSTI LAMI LAWI LOAB LOBI LOBR LOPK LOSA LOSB LOTA MBHW MBNW MBSE MBTI MINN MISS MISW MIWW MNCB MNCW MUEB MULB MUNB MUNE MUNI MUW1 MUW2 MUW3 MUW4 MUW5 MUW6 MUW7 MUW8 MUWB NIBU NIMB NINB NINC NISB NKAI NKBI NKCH NKCI NKEM NKGI NKLA NKLI NKMI NKMS NKNB NKPG NKSI NKTI NTCB NTCD NTCI NTEI NTII NTMA NTMI NTML NTNG NTNP NTPB NTRR NTSG NTVI NWBR NWBS NWFL NWGY NWJR NWKC NWMA NWTG NWTU NWWO PBAI PBBB PBBI PBCL PBLI PBNB PBON PBPT PECO PECS PEFB PEFM PEGI PEHH PELB PEML PEMP PEQN PERI PESB PESR PEWS PEYP

PHCB PHPP PHSP

QDTS

RICB RISx RIW1 RIW2 RIW3 RIW4 RIW5 RIW8

SBBB SBCB SBCP SBHB SBHP SBLS SBMM SBMP SBNC SBPT SBSP SBST SCAL SCAU SCBB SCCR SCES SCFC SCMA SCPC SCWH SRSI

SWBY SWCL SWCN SWMH

VAAM VAAN VACB VAHB VAMB VANM VAPB VASM VATB

WCAI WCCH WCCV WCDO WCGB WCGN WCJR WCKB WCLM WCLP WCWB WCWI

WKBB WKBM WKCA WKCB WKDB WKGB WKJP WKPB WKPI WKRB WKRK WKTB
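Rather than maintaining the "left to do" list by hand, it could be computed as a set difference. The sketch below uses made-up codes; in practice the two inputs would presumably be the combined `place_code` values from `w2_data$enc` and `w2_data$enc_qa`, and `sites$w2_place_code`. The reverse difference is also worth checking, since the progress list above contains codes (GARB, NIBH, NIBS, NICN, NIJB, NISP, RIB1, RIB2, RIB3) that do not appear in the list of W2 place codes.

```r
# Illustrative values only (hypothetical codes).
w2_place_codes <- c("XXAA", "XXBB", "YYAA", "ZZAA")  # codes present in W2 data
linked_codes   <- c("XXBB", "YYAA", "QQAA")          # codes entered on WAStD sites

# W2 places that still need a WAStD site.
todo <- sort(setdiff(w2_place_codes, linked_codes))

# Codes entered on WAStD sites that do not occur in the W2 data,
# e.g. typos or places without any records.
unmatched_links <- sort(setdiff(linked_codes, w2_place_codes))

cat(todo)
```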

Reconstruct missing or misplaced coordinates from corresponding WAStD Sites

Do this in download_w2_data, before obs is split into enc and enc_qa.

Factor the logic into a function that takes the values of the involved columns as data.
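One possible shape for such a function, as a sketch under stated assumptions rather than the actual implementation: `reconstruct_coords()`, its arguments, and the columns of `site_lookup` are hypothetical names. It approximates "inside the site" with a bounding box and falls back to the site centroid; a real version would more likely run a point-in-polygon test against the WAStD site geometry (for example with the sf package).

```r
# Sketch only: function, argument and column names are hypothetical.
# site_lookup: one row per W2 place code, with the bounding box and
# centroid of the WAStD site that the place code is linked to.
reconstruct_coords <- function(lat, lon, place_code, site_lookup) {
  idx <- match(place_code, site_lookup$place_code)
  known <- !is.na(idx)                      # place code is linked to a site
  missing <- is.na(lat) | is.na(lon)
  # Bounding box as a stand-in for a proper point-in-polygon test.
  outside <- known & !missing & (
    lat < site_lookup$lat_min[idx] | lat > site_lookup$lat_max[idx] |
    lon < site_lookup$lon_min[idx] | lon > site_lookup$lon_max[idx]
  )
  fix <- known & (missing | outside)
  lat[fix] <- site_lookup$lat_centroid[idx[fix]]
  lon[fix] <- site_lookup$lon_centroid[idx[fix]]
  # Flag reconstructed rows so analyses can tell them from surveyed positions.
  data.frame(latitude = lat, longitude = lon, coords_reconstructed = fix)
}

# Made-up example: one linked place (XXAA) and one unlinked place (ZZAA).
site_lookup <- data.frame(
  place_code = "XXAA",
  lat_min = -21, lat_max = -20, lon_min = 115, lon_max = 116,
  lat_centroid = -20.5, lon_centroid = 115.5,
  stringsAsFactors = FALSE
)
out <- reconstruct_coords(
  lat = c(-20.4, NA, 20.4, NA),             # third value: sign typo
  lon = c(115.6, NA, 115.6, NA),
  place_code = c("XXAA", "XXAA", "XXAA", "ZZAA"),
  site_lookup = site_lookup
)
```

In this example the first record is left alone, the second (missing) and third (outside the site) are moved to the site centroid, and the fourth stays missing because its place code is not linked to any site. Keeping the `coords_reconstructed` flag is a design choice that lets downstream analyses exclude or down-weight reconstructed positions.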