TysonStanley / tidyfast

Fast and efficient alternatives to tidyr functions built on data.table #rdatatable #rstats
https://tysonbarrett.com/tidyfast/
187 stars 4 forks source link

Issues with dt_fill() #24

Closed markfairbanks closed 9 months ago

markfairbanks commented 4 years ago

Found a couple issues with dt_fill().

library(tidyfast)
library(data.table)

df <- data.table::data.table(x = c(1, 1, 2), y = c(1, NA, NA), z = c(1, NA, NA))

## No "id" provided
#  Drops columns
dt_fill(df, y)
#>    y
#> 1: 1
#> 2: 1
#> 3: 1

## When using "id"
# Drops columns
# Renames "id" column
# Reorders columns (Might not be worth addressing? Byproduct of using "by" in data.table)
dt_fill(df, y, id = x)
#>    by  y
#> 1:  1  1
#> 2:  1  1
#> 3:  2 NA

I started to fix the renaming and reordering issues when I ran into the dropping issue.

The dropping issue is an interesting one as using this fix...

dt_[, (dots) := lapply(.SD, fun), .SDcols = dots]

leads to a modify-by-reference of the original data.table.

I'm guessing the current method was used to avoid modify-by-reference so I didn't make a pull request - I wasn't sure how you'd want to deal with that part

TysonStanley commented 4 years ago

Right. I need to give this more thought. The modify-by-reference aspect can sometimes be fantastic but other times a challenge. Thanks for pointing it out.

By the way, I've been really impressed with all your work on tidytable. I think a lot of people find it really helpful.

markfairbanks commented 4 years ago

Thanks, I appreciate it. Slowly but surely it's getting there

TysonStanley commented 9 months ago

Thanks, this is old but it was fixed previously. Closing now.