AquaAuma / FishGlob_data

Database and methods related to the manuscript "An integrated database of fish biodiversity sampled with scientific bottom trawl surveys"
Creative Commons Attribution 4.0 International

summary for Northern GSL #36

Closed AquaAuma closed 8 months ago

AquaAuma commented 9 months ago

the summary is not working, and I cannot seem to find the issue because my R session aborts every time I try to investigate

AquaAuma commented 9 months ago

Same when running the flagging functions in the R code. The problem is at L451 in apply_trimming_method1.R.

AquaAuma commented 9 months ago

With dggridR there is an issue in the lat/long conversion when creating the grid that ends up in the output figures, and probably in the survey files (wrong grid cell codes?)

error @jepa Juliano is getting:

######### Apply trimming per survey_unit method 1
#apply trimming for hex size 7
dat_new_method1_hex7 <- apply_trimming_per_survey_unit_method1(clean_GSLnor, 7)

FATAL ERROR: DgQ2DDtoIConverter::convertTypedAddress():  coordinate out of range: (51, 81)
AquaAuma commented 8 months ago

@jepa this is the code line where we are getting the issue https://github.com/AquaAuma/FishGlob_data/blob/0cd5dff3a2796e33f4e3b3a506ffb91aec34fbd0/functions/apply_trimming_method1.R#L34

I am thinking it might be an R spatial package problem with versions (https://cran.r-project.org/web/packages/dggridR/index.html) + a specific problem with GSL-N

jepa commented 8 months ago

Weird. I didn't have the error at that level. Mine happens on line 55...

https://github.com/AquaAuma/FishGlob_data/blob/0cd5dff3a2796e33f4e3b3a506ffb91aec34fbd0/functions/apply_trimming_method1.R#L55C13-L55C13

Also, how do you make it so that I can see the line!?!?!?

jepa commented 8 months ago

If I interpret the error correctly, it is saying that there is an issue either A) with the coordinates (latitude 51, longitude 81) or B) with latitudes 51 and 81 in the unique_latlon data. I've explored this and I can't find an obvious issue with latitude 51 (there is no longitude or latitude 81). That being said, it has nothing to do with those points in the original map that are not supposed to be there. So I think this is not the actual issue.

Screenshot 2023-11-28 at 10 57 49 AM
jepa commented 8 months ago

Again... these are 6 examples of coordinates that give this error. I don't see any obvious pattern. Regardless, the data is still there after the dggridR::dgGEO_to_SEQNUM function on line 82, so I think we should be fine?

  latitude longitude_s
1 47.85833   -60.03333
2 47.76333   -60.54833
3 47.79833   -60.67167
4 47.87667   -60.68667
5 47.95667   -60.69667
6 48.11667   -61.32167
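One way to probe these six points in isolation is a round trip through the grid: convert each point to a cell with dggridR::dgGEO_to_SEQNUM, convert the cell back to its center with dggridR::dgSEQNUM_to_GEO, and look at how far the center lands from the point. This is only a minimal sketch. The grid built with dggridR::dgconstruct(res = 7) is an assumption (the repo's function may construct it differently), and max_center_offset is a hypothetical helper, not something from the repo. On an affected machine the dggridR calls may abort the R session with the same FATAL ERROR.

```r
# The six coordinates quoted above (GSL-N)
pts <- data.frame(
  latitude    = c(47.85833, 47.76333, 47.79833, 47.87667, 47.95667, 48.11667),
  longitude_s = c(-60.03333, -60.54833, -60.67167, -60.68667, -60.69667, -61.32167)
)

# Hypothetical helper: largest absolute gap (in degrees) between
# each point and the center of the cell it was assigned to
max_center_offset <- function(lat, lon, center_lat, center_lon) {
  max(abs(lat - center_lat), abs(lon - center_lon))
}

# Only run the round trip if dggridR is installed
if (requireNamespace("dggridR", quietly = TRUE)) {
  # Assumption: default ISEA hexagonal grid at resolution 7
  dggs <- dggridR::dgconstruct(res = 7)

  # Point -> cell id
  cells <- dggridR::dgGEO_to_SEQNUM(dggs, pts$longitude_s, pts$latitude)$seqnum

  # Cell id -> cell center (list with lon_deg and lat_deg)
  centers <- dggridR::dgSEQNUM_to_GEO(dggs, cells)

  # A gap of tens of degrees would point at the conversion, not the data
  print(max_center_offset(pts$latitude, pts$longitude_s,
                          centers$lat_deg, centers$lon_deg))
}
```

If the printed gap is small, the points themselves are fine and the problem sits elsewhere in the pipeline.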

Screenshot 2023-11-28 at 11 12 16 AM
AquaAuma commented 8 months ago

I don't know, it just displays for me :) Yes, I think you are right, it's the same error. I don't get it.

AquaAuma commented 8 months ago

Nothing seems wrong with the actual coordinates, right? They're all continuous values.

jepa commented 8 months ago

No, it's all good. I traced the (in)famous problem down to line L62, where it estimates the center of the cells with the dggridR::dgSEQNUM_to_GEO function.

It only shows up at line L540 (#Base map with empty cells). Specifically at L553, because it is selecting cell_center_longitude_s and cell_center_latitude, which output these points... I have no idea why those cell centers are even computed.

  cell_center_longitude_s cell_center_latitude cell nyear
1                   -142.                 23.2 1389    39
2                   -141.                 20.9 1469    38

Screenshot 2023-11-28 at 11 41 00 AM
AquaAuma commented 8 months ago

yeah, so strange. Do you know if the code is assigning these two grid cells to long lats from the survey data that seem to be in the Pacific? I'm wondering if some long lats have an issue with the function

jepa commented 8 months ago

So... I am not familiar with the function, so I did a "dirty fix". Basically, I limited the cell centers to within 1 degree lat/long of the original dataset, as follows on line L65:

# Linking cell centers to unique_latlon ----

  unique_latlon <- unique_latlon %>%
    dplyr::mutate(
      cell_center_longitude_s = cellcenters$lon_deg,
      cell_center_latitude = cellcenters$lat_deg
    ) %>%
    # Quick fix for issue #36
    # Restrict cell centers to within 1 degree of the original survey extent
    # JEPA: November 28
    dplyr::filter(
      cell_center_latitude >= min(latitude) - 1,
      cell_center_latitude <= max(latitude) + 1,
      cell_center_longitude_s >= min(longitude_s) - 1,
      cell_center_longitude_s <= max(longitude_s) + 1
    )

This way, when the dggridR::dgSEQNUM_to_GEO function goes bananas, we remove those cell_center_latitude that are off. Note that I made an arbitrary decision to keep all cell_center_latitude and cell_center_longitude within 1 degree of the max/min of original lat/long, but this can be changed.
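A variant of the same bounding-box check that keeps every row and only flags the suspicious cell centers might look like the sketch below. It uses base R only. flag_offgrid_centers, its margin argument, and the center_in_range column are hypothetical names, and the toy data mixes coordinates quoted in this thread with two made-up, plausible cell centers.

```r
# Hypothetical helper: mark cell centers that fall outside the survey
# extent (plus a margin in degrees) instead of filtering the rows out
flag_offgrid_centers <- function(df, margin = 1) {
  lat_ok <- df$cell_center_latitude >= min(df$latitude) - margin &
    df$cell_center_latitude <= max(df$latitude) + margin
  lon_ok <- df$cell_center_longitude_s >= min(df$longitude_s) - margin &
    df$cell_center_longitude_s <= max(df$longitude_s) + margin
  df$center_in_range <- lat_ok & lon_ok
  df
}

# Toy data: three survey points from this thread; the first two cell
# centers are illustrative values, the third is the Pacific one above
toy <- data.frame(
  latitude                = c(47.85833, 47.76333, 48.11667),
  longitude_s             = c(-60.03333, -60.54833, -61.32167),
  cell_center_latitude    = c(47.9, 48.0, 23.2),
  cell_center_longitude_s = c(-60.5, -61.0, -142)
)

flagged <- flag_offgrid_centers(toy)

# Rows whose cell center looks wrong, kept for inspection
print(flagged[!flagged$center_in_range, ])
```

Flagging makes it possible to count how many hauls are affected before deciding whether to drop or re-grid them.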

Screenshot 2023-11-28 at 11 53 58 AM
jepa commented 8 months ago

This is what the protocol looks like once fixed. Questions? Thoughts? Concerns? "This-is-more-like-a-comment"?

Screenshot 2023-11-28 at 12 09 42 PM
jepa commented 8 months ago

The quick fix is dropping data, so it's no good.

AquaAuma commented 8 months ago

This is what it should look like: GSL-N_hex_res_7_map_per_haul_grid_nyears

jepa commented 8 months ago

Sooooooo... THIS-IS-JUST-GREAT... Turns out the problem with dggridR::dgGEO_to_SEQNUM() is related to dggridR v3.0.0 on Macs, specifically those with M1 or newer chips, which is my case (and I am guessing yours too @AquaAuma). I ran the script on the lab's PC and it worked-just-F-fine... This is the output... hashtagFacePalm

Screenshot 2023-11-28 at 4 07 35 PM
jepa commented 8 months ago

There is also a fix for Mac users, see here. Basically, it fixes the issue if you update the dggridR package to 3.1.0 by installing it from source with remotes::install_github("SebKrantz/dggridR"). I did that and it ran smoothly on my Mac. I've gone ahead and pushed the updated outputs (10a0d62).
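For anyone else landing here, a quick check of whether a machine matches the situation described above could be sketched as follows. needs_dggridr_update is a hypothetical helper. The "arm64" machine string for M1 or newer Macs and the 3.1.0 threshold are assumptions based on this thread, and R running under Rosetta may report a different machine string.

```r
# Hypothetical helper: TRUE when the setup matches the one reported here
# (Apple Silicon Mac with dggridR older than 3.1.0)
needs_dggridr_update <- function(version, sysname, machine) {
  on_apple_silicon <- sysname == "Darwin" && machine == "arm64"
  on_apple_silicon && package_version(version) < package_version("3.1.0")
}

# Only check the installed version if dggridR is present
if (requireNamespace("dggridR", quietly = TRUE)) {
  info <- Sys.info()
  installed <- as.character(utils::packageVersion("dggridR"))

  if (needs_dggridr_update(installed, info[["sysname"]], info[["machine"]])) {
    message("dggridR ", installed, " on Apple Silicon: consider ",
            'remotes::install_github("SebKrantz/dggridR")')
  } else {
    message("dggridR ", installed, " does not match the affected setup")
  }
}
```

The check only prints a suggestion and never installs anything on its own.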

AquaAuma commented 8 months ago

wow ok, thank you so much!! I'm glad we found what the problem was... feel free to push, it looks right