I believe I understand what you are asking, but it would be more helpful if you could rephrase your inquiry with a sample dataset.
To make it easier for you to assess, I put together a rough draft of the function and updated this repository.
Here is an example:
dt = mtcars
# compute the row index once, before cyl is overwritten
idx = dt$mpg == 21.0 & dt$cyl == 6
dt$cyl[idx] = 1000
dt$hp[idx] = 2000
dt$vs[idx] = dt$hp[idx] * 2
mutate_filter(dt, mpg == 21.0 & cyl == 6, cyl = 1000, hp = 2000, vs = hp * 2)
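For illustration, one way a call like the one above could work is sketched below in base R. This is only a rough sketch under stated assumptions, not the function that was added to the repository: the name mutate_filter_sketch is hypothetical, the condition and the assignments are captured unevaluated with substitute(), and the assignments are applied in order so that vs = hp * 2 sees the updated hp.

```r
# Hypothetical sketch, not the quickcode implementation.
mutate_filter_sketch <- function(data, condition, ...) {
  # Capture the filter condition and the assignments without evaluating them
  cond <- substitute(condition)
  assignments <- as.list(substitute(list(...)))[-1]

  # Evaluate the condition with the data frame's columns in scope
  rows <- eval(cond, data, parent.frame())
  rows <- !is.na(rows) & rows
  if (!any(rows)) return(data)

  # Apply each assignment in order, so later ones see earlier changes
  for (col in names(assignments)) {
    value <- eval(assignments[[col]], data[rows, , drop = FALSE], parent.frame())
    data[rows, col] <- value
  }

  # Return the whole dataset, with only the matching rows modified
  data
}

dt <- mutate_filter_sketch(mtcars, mpg == 21.0 & cyl == 6,
                           cyl = 1000, hp = 2000, vs = hp * 2)
dt[dt$mpg == 21.0, c("mpg", "cyl", "hp", "vs")]
```

The full 32 rows of mtcars come back; only the rows that satisfy the condition are changed.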
The proposed function you have described does not exist, at least in the way you have described it, in R. Given the additional information you have provided, I have crafted what I believe is a function that meets the requirements you have laid out.
FUNCTION NAME: mutate_filter
TOTAL NUMBER OF FUNCTION ARGUMENTS: 6
ARGUMENT NAMES: data, f_arg1, f_arg2, mutcolx, mutcoly, expr
ARGUMENT SUMMARY DESCRIPTION:
OPTIONALITY: Only two arguments are required to execute the function. These arguments are data and f_arg1.
FUNCTION STRUCTURE:
mutate_filter <- function(data, f_arg1, f_arg2, mutcolx, mutcoly, expr) {
  # Filter on one or two conditions supplied as strings
  if (missing(f_arg2)) {
    d1 <- dplyr::filter(data, eval(parse(text = f_arg1)))
  } else {
    d1 <- dplyr::filter(data, eval(parse(text = f_arg1)), eval(parse(text = f_arg2)))
  }
  # Apply the optional column assignments, e.g. "cyl = 1000"
  if (!missing(mutcolx)) {
    eval(parse(text = paste0("d1$", mutcolx)))
  }
  if (!missing(mutcoly)) {
    eval(parse(text = paste0("d1$", mutcoly)))
  }
  if (!missing(expr)) {
    # Evaluate the expression within the data frame context
    calc_fld <- eval(parse(text = expr), envir = d1)
    # Add the new field to the data frame
    d1$calc_fld <- calc_fld
  }
  return(d1)
}
FUNCTION TESTING STATUS: The function has been tested but not extensively. If the function meets the expectations provided by the previous explanation and requirements as outlined in this issue, additional testing should be conducted.
If the function does not work as presented, especially consistent with the examples provided, please reach out and I will send the function syntax again. It is possible that the conversion from the R application to this medium did not capture the code syntax correctly.
FUNCTIONAL UTILITY: The value of the mutcolx and mutcoly arguments is unclear. They were included to match the stated requirements, but overwriting an entire data field with a single value does not seem very useful, and a second argument that replicates this functionality is also questionable. Unless there is a compelling reason to keep them, it is strongly recommended that they be removed. The function would then have four arguments, which together are believed to provide an extraordinary value proposition.
One way to improve the utility of the mutate_filter function would be to replace one of the mutcol arguments with an argument that can control the removal of contiguous or non-contiguous variables from the data frame object in the mutated output.
CODE EXAMPLES:
library(DescTools)
data("d.pizza")
data("mtcars")
data("quakes")
mutate_filter(mtcars, f_arg1 = "mpg == 21.0", f_arg2 = "cyl == 6", mutcolx = "cyl = 1000", mutcoly = "hp = 2000", expr = "hp*2")
mutate_filter(d.pizza[,1:10], f_arg1 = "driver == 'Taylor'", f_arg2 = "area == 'Camden'", expr = "count*price")
mutate_filter(mtcars, f_arg1 = "cyl == 8", expr = "vs+am+gear+carb")
mutate_filter(quakes, f_arg1 = "stations == 10", expr = "round(mag/depth,3)")
Thanks Brice. Actually, I don't think we need the secondary arguments, since one can easily combine conditions, such as "mpg == 21 & cyl < 3"
Does the proposed function I provided meet the requirements you laid out?
From: Obi Obianom @.> Sent: Friday, June 21, 2024 7:48 PM To: oobianom/quickcode @.> Cc: brichard1638 @.>; Comment @.> Subject: Re: [oobianom/quickcode] Function to Mutate a subset of dataset, and reattach it to the dataset (Issue #27)
Based on your latest feedback, I've re-constructed the mutate_filter function in the following ways:
NEW FUNCTION STRUCTURE:
mutate_filter <- function(data, f_arg1, f_arg2, rem = NULL, srtfld = NULL, expr) {
  # Filter on one or two conditions supplied as strings
  if (missing(f_arg2)) {
    d1 <- dplyr::filter(data, eval(parse(text = f_arg1)))
  } else {
    d1 <- dplyr::filter(data, eval(parse(text = f_arg1)), eval(parse(text = f_arg2)))
  }
  # Remove columns by position; drop = FALSE keeps a data frame
  # even when only one column remains
  if (!is.null(rem)) {
    d1 <- d1[, -c(rem), drop = FALSE]
  }
  if (!missing(expr)) {
    # Evaluate the expression within the data frame context
    calc_fld <- eval(parse(text = expr), envir = d1)
    # Add the new field to the data frame
    d1$calc_fld <- calc_fld
  }
  # Sort by every field in srtfld, not only the last one
  if (!is.null(srtfld)) {
    d1 <- dplyr::arrange(d1, !!!rlang::parse_exprs(srtfld))
  }
  return(d1)
}
It is believed that this version of the mutate_filter function possesses a much higher value proposition than its predecessor. As a result, this modified function should be the one selected for inclusion in the quickcode package.
FUNCTION TESTING STATUS: The updated function has been tested but not extensively. If the functional output meets the expectations of the requirements previously outlined in this issue, additional testing should be conducted.
ADDITIONAL NOTES:
CODE EXAMPLES:
library(DescTools)
data("d.pizza")
data("mtcars")
mutate_filter(mtcars, "mpg == 21.0", "cyl == 6", expr = "hp*2")
mutate_filter(d.pizza[,1:10], f_arg1 = "driver == 'Taylor'", f_arg2 = "area == 'Camden'", expr = "count*price")
mutate_filter(mtcars, f_arg1 = "cyl == 8", expr = "vs+am+gear+carb")
mutate_filter(airquality, f_arg1 = "Month == 5", rem = c(3:4), expr = "Ozone/Solar.R")
mutate_filter(d.pizza, f_arg1 = "area == 'Camden'", rem = c(1:4, 15,16), srtfld = "price", expr = "round(count*price,2)")
mutate_filter(mtcars, f_arg1 = "vs == 1", rem = c(2:5, 11), srtfld = "mpg")
mutate_filter(mtcars, f_arg1 = "mpg > 20", rem = 11)
mutate_filter(d.pizza[5:10], f_arg1 = "area == 'Westminster'", srtfld = c("driver", "price"))
CONCLUSION The only thing missing from the code supporting the mutate_filter function is that each argument must be expressly cited or the function will crash. I'm not sure what changes need to be made to the code but argument names when using the function should be optional.
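One possible way to relax this is to give every optional argument a default of NULL and test with is.null(), so that trailing arguments can simply be left out. The sketch below is written in base R and is an assumption, not the code proposed above: the name mutate_filter_defaults is hypothetical, and it keeps the same argument order. Positions still matter when names are omitted, so the third unnamed argument is always read as f_arg2.

```r
# Hypothetical variant with NULL defaults, not the proposed quickcode code.
mutate_filter_defaults <- function(data, f_arg1, f_arg2 = NULL, rem = NULL,
                                   srtfld = NULL, expr = NULL) {
  # Evaluate the filter strings against the data frame's columns
  keep <- eval(parse(text = f_arg1), envir = data)
  if (!is.null(f_arg2)) {
    keep <- keep & eval(parse(text = f_arg2), envir = data)
  }
  d1 <- data[!is.na(keep) & keep, , drop = FALSE]

  # Optionally remove columns by position
  if (!is.null(rem)) {
    d1 <- d1[, -rem, drop = FALSE]
  }

  # Optionally add a calculated field
  if (!is.null(expr)) {
    d1$calc_fld <- eval(parse(text = expr), envir = d1)
  }

  # Optionally sort by one or more column names
  if (!is.null(srtfld)) {
    d1 <- d1[do.call(order, unname(as.list(d1[srtfld]))), , drop = FALSE]
  }
  d1
}

r1 <- mutate_filter_defaults(mtcars, "cyl == 8")
r2 <- mutate_filter_defaults(mtcars, "mpg == 21.0", "cyl == 6", expr = "hp*2")
r3 <- mutate_filter_defaults(mtcars, "vs == 1", rem = c(2:5, 11), srtfld = "mpg")
```

With defaults in place, the first two calls need no argument names at all for the filters, and names are only required when skipping over an argument.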
Thanks Brice!
When are you planning on publishing the next version of quickcode?
From: Obi Obianom @.> Sent: Saturday, June 22, 2024 9:28 PM To: oobianom/quickcode @.> Cc: brichard1638 @.>; Comment @.> Subject: Re: [oobianom/quickcode] Function to Mutate a subset of dataset, and reattach it to the dataset (Issue #27)
Hi Brice, does such a function already exist?
Basically, with dplyr, I can filter and then run all the downstream steps such as group_by, mutate, and so on. But here I need the filtered portion to remain part of the entire dataset after the subset has been manipulated.
Let me know if you understand. Else, I can rephrase.
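To make the request concrete, here is a minimal base R sketch of the "process a subset, then reattach it" pattern. It assumes the processing step keeps the subset's rows, row order, and columns unchanged; the variable names are only illustrative.

```r
dt <- mtcars

# Rows that make up the subset to be manipulated
rows <- dt$mpg == 21.0 & dt$cyl == 6

# Pull the subset out and process it with whatever steps are needed
subset_part <- dt[rows, , drop = FALSE]
subset_part$hp <- 2000
subset_part$vs <- subset_part$hp * 2

# Write the processed subset back into its original positions
dt[rows, ] <- subset_part
```

After the last line, dt still has every row of mtcars, and only the selected rows carry the new values.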