ices-eg / ICES-VMS-and-Logbook-Data-Call

GNU General Public License v3.0

create_trip_id #26

Open einarhjorleifsson opened 6 months ago

einarhjorleifsson commented 6 months ago

I am a bit confused by create_trip_id. As currently defined in 0_global.R, we have:

# Define a function to create a unique trip identifier
create_trip_id <- function(eflalo) {
  paste(eflalo$LE_ID, eflalo$LE_CDAT, sep="-")
}

Now, these variables are associated with a "Log event", with the meaning:

That is, these variables are not associated with trip identification. Should we not be using FT_REF and some of the associated FT** date-time variables?
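A rough sketch of what a trip-level identifier could look like, not tested against real data. It assumes the eflalo data frame carries a vessel reference column VE_REF alongside FT_REF; the function name create_trip_id_ft and the toy values are made up for illustration.

```r
# Hypothetical alternative: identify a trip by vessel and trip reference.
# Assumption: eflalo has VE_REF and FT_REF columns.
create_trip_id_ft <- function(eflalo) {
  paste(eflalo$VE_REF, eflalo$FT_REF, sep = "-")
}

# Toy data (made-up values): two log events from one trip of vessel V1,
# and one log event from a trip of vessel V2 that reuses the same FT_REF.
toy <- data.frame(
  VE_REF = c("V1", "V1", "V2"),
  FT_REF = c("100", "100", "100"),
  LE_ID = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

trip_ids <- create_trip_id_ft(toy)
n_trips <- length(unique(trip_ids))
```

Here the two V1 log events share one trip id, and the V2 record gets its own even though the FT_REF value is the same.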

Keep in mind that the eflalo format is a bit alien to me, so this may actually be a bug in my head.

neilcampbelll commented 5 months ago

That's a good point. The code here is just a reworking of what was in the previous workflow script so that this...

# 2.3.3  Remove non-unique trip numbers -----------------------------------------------------------------------------

eflalo <-
  eflalo[
    !duplicated(paste(eflalo$LE_ID, eflalo$LE_CDAT, sep = "-")),
  ]
remrecsEflalo["duplicated", ] <-
  c(
    nrow(eflalo),
    100 +
      round(
        (nrow(eflalo) - as.numeric(remrecsEflalo["total", 1])) /
          as.numeric(remrecsEflalo["total", 1]) * 100,
        2
      )
  )

is replaced by this

  # Apply the trip ID function to the eflalo data frame
  trip_id <- create_trip_id(eflalo)

  # Remove records with non-unique trip identifiers
  eflalo <- eflalo[!duplicated(trip_id), ]
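To make the behaviour of this step concrete, here is a small toy example with made-up values. It suggests that the de-duplication works at the log-event level (LE_ID plus LE_CDAT), not the trip level: FT_REF plays no part in deciding which rows are dropped.

```r
# Same definition as in 0_global.R
create_trip_id <- function(eflalo) {
  paste(eflalo$LE_ID, eflalo$LE_CDAT, sep = "-")
}

# Toy data (made-up values): rows 1 and 2 repeat the same log event
toy <- data.frame(
  FT_REF = c("100", "100", "101"),
  LE_ID = c("e1", "e1", "e2"),
  LE_CDAT = c("01/02/2023", "01/02/2023", "03/02/2023"),
  stringsAsFactors = FALSE
)

trip_id <- create_trip_id(toy)

# Keep only the first row for each LE_ID / LE_CDAT combination
deduped <- toy[!duplicated(trip_id), ]
```

The second row is removed because it repeats the first row's log event id and catch date.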

I think this is a hangover from very early days of the process, when FT_REF wasn't as unique an identifier as it is supposed to be.
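One way to check whether FT_REF is unique enough in a given dataset is to count how many vessels share each reference. This is only a sketch, assuming a vessel reference column VE_REF is present; the toy values are invented.

```r
# Toy data (made-up values): FT_REF "100" is used by two different vessels
toy <- data.frame(
  VE_REF = c("V1", "V1", "V2", "V3"),
  FT_REF = c("100", "100", "100", "101"),
  stringsAsFactors = FALSE
)

# Number of distinct vessels reporting each FT_REF
vessels_per_ref <- tapply(
  toy$VE_REF, toy$FT_REF,
  function(x) length(unique(x))
)

# FT_REF values that are shared by more than one vessel
shared_refs <- names(vessels_per_ref)[vessels_per_ref > 1]
```

If shared_refs comes back empty on the real data, FT_REF alone would be enough to identify a trip.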