Closed hillarymarler closed 1 month ago
Example data set that will fail this test:
df <- TADA_DataRetrieval(
  startDate = "2006-07-17",
  endDate = "2006-07-18",
  statecode = "DE"
)
Additional data sets that cause the test to fail:
df2 <- TADA_DataRetrieval(
  startDate = "2023-02-14",
  endDate = "2023-02-15",
  statecode = "CO"
)
df3 <- TADA_DataRetrieval(
  startDate = "2010-11-30",
  endDate = "2010-12-01",
  statecode = "AL"
)
@wokenny13 - I think the extra rows are being added in situations where records from the same organization are identified as duplicates in TADA_FindPotentialDuplicatesMultipleOrgs. This appears to be a result of updates made to TADA_FindNearbySites.
I am also looking into this.
I ran TADA_FindPotentialDuplicatesMultipleOrgs and TADA_FindPotentialDuplicatesSingleOrg on the first example (df).
The number of rows increased only for TADA_FindPotentialDuplicatesMultipleOrgs, which identified 25 potential duplicates. That count matches the number of rows added.
TADA_FindPotentialDuplicatesSingleOrg identified 44 results as potential duplicates but did not add any rows for the first example.
I think the issue may be here:
# get rid of results with no site group added - not duplicated spatially
dupsites <- subset(dupsites, !dupsites$TADA.MonitoringLocationIdentifier %in% c("No nearby sites")) %>%
  tidyr::separate_rows(TADA.MonitoringLocationIdentifier, sep = ",")
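For anyone reproducing this, here is a minimal sketch of how the separate_rows step changes row counts. It uses a made-up toy data frame (dupsites_toy and its values are assumptions, not real TADA output) and only shows that one input row with several comma-separated identifiers becomes several rows. Whether this is what produces the extra rows in the failing test is not confirmed here.

```r
library(tidyr)

# Toy stand-in for dupsites; identifiers and values are invented for illustration.
dupsites_toy <- data.frame(
  ResultIdentifier = c("r1", "r2", "r3"),
  TADA.MonitoringLocationIdentifier = c("SiteA,SiteB", "SiteC", "No nearby sites"),
  stringsAsFactors = FALSE
)

# Drop results with no site group, as in the snippet above.
filtered <- subset(
  dupsites_toy,
  !dupsites_toy$TADA.MonitoringLocationIdentifier %in% c("No nearby sites")
)

# Split comma-separated identifiers into one row per identifier.
expanded <- tidyr::separate_rows(
  filtered,
  TADA.MonitoringLocationIdentifier,
  sep = ","
)

nrow(filtered)  # 2 rows: r1 and r2
nrow(expanded)  # 3 rows: r1 now appears once per site
```

If the expanded table is later joined back to the original data on a key that is no longer unique, each repeated key can contribute an extra row, so checking nrow() before and after this step is a cheap diagnostic.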
As a result of changes to TADA_FindNearbySites, TADA.MonitoringLocationIdentifier contains logical NA values in .data, whereas in dupsdat the same column contains the character string "NA".
typeof(dupsdat$TADA.MonitoringLocationIdentifier)
[1] "character"
df_nearby_sites_test <- TADA_FindNearbySites(df_ex)
[1] "No nearby sites detected using input buffer distance."
typeof(df_nearby_sites_test$TADA.MonitoringLocationIdentifier)
[1] "logical"
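The mismatch can be shown in base R without any TADA functions. This is a small sketch with invented vectors (id_logical and id_character are hypothetical names standing in for the two columns). It only demonstrates that a logical NA and the string "NA" are different values that do not match each other.

```r
# Logical NA, standing in for the column returned when no nearby sites are found.
id_logical <- c(NA, NA)

# Character "NA" strings, standing in for the column in dupsdat.
id_character <- c("NA", "NA")

typeof(id_logical)    # "logical"
typeof(id_character)  # "character"

# A missing value is not the same thing as the text "NA".
is.na(id_logical)             # TRUE TRUE
is.na(id_character)           # FALSE FALSE
id_logical %in% id_character  # FALSE FALSE
```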
Inserting the following at line 1278, under the "# connect back to original dataset" comment, may be a solution:
dplyr::mutate(
  TADA.MonitoringLocationIdentifier = ifelse(
    TADA.MonitoringLocationIdentifier %in% NA,
    "NA",
    TADA.MonitoringLocationIdentifier
  )
) %>%
Alternatively, if there is a preferred type for this column, the conversion could be done within TADA_FindNearbySites() itself.
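To check what the proposed mutate step does to an all-NA column, here is a minimal sketch on a toy data frame (toy and its values are assumptions, not TADA output). It applies the same ifelse expression and confirms the column comes out as character "NA". It does not test the change inside TADA_FindPotentialDuplicatesMultipleOrgs.

```r
library(dplyr)

# Toy data frame whose identifier column is all logical NA.
toy <- data.frame(
  ResultIdentifier = c("r1", "r2"),
  TADA.MonitoringLocationIdentifier = c(NA, NA)
)

# Same expression as the proposed fix: replace missing values with the string "NA".
fixed <- toy %>%
  dplyr::mutate(
    TADA.MonitoringLocationIdentifier = ifelse(
      TADA.MonitoringLocationIdentifier %in% NA,
      "NA",
      TADA.MonitoringLocationIdentifier
    )
  )

typeof(toy$TADA.MonitoringLocationIdentifier)    # "logical"
typeof(fixed$TADA.MonitoringLocationIdentifier)  # "character"
```

After this step both tables would hold the column as character, so the types line up when connecting back to the original data set.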
We have noticed occasional failures of this test, although TADA_FindPotentialDuplicatesMultipleOrgs has not been edited recently.
The solution for this issue will require finding example data sets which cause this failure and modifying TADA_FindPotentialDuplicatesMultipleOrgs to address those scenarios.