r-hyperspec / hySpc.dplyr

Interface between hyperSpec and dplyr
https://r-hyperspec.github.io/hySpc.dplyr/
MIT License

Generalizing transmute.R to other columns that contain matrices #12

Closed eoduniyi closed 4 years ago

eoduniyi commented 4 years ago

At the moment, transmute.hyperSpec uses eval to handle arithmetic operations on $spc, but it does not handle other columns in the @data slot that hold a full matrix. The approach for $spc came out of observing:

dplyr::transmute(.data@data, spc = spc*2) # throws an error
.data@data$spc <- .data@data$spc*2 # works as expected!

However, I am not sure a) how one even gets a matrix in a column, or b) how to deal with non-spc columns that contain matrices. Please help.
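On question a), a minimal base-R sketch may help. It uses a plain data frame rather than a hyperSpec object, and the column names `x` and `spc` are chosen only for illustration. Assigning a matrix whose row count matches the data frame stores the whole matrix as a single column:

```r
# Plain data frame with one ordinary column (illustrative names only)
df <- data.frame(x = c(1, 2))

# Assigning a matrix with a matching number of rows stores it as ONE column
df$spc <- matrix(1:6, nrow = 2)

ncol(df)     # 2: `x` and `spc`
dim(df$spc)  # 2 3: the column itself is still a 2 x 3 matrix

# Arithmetic on the matrix column works directly, as in the second line above
df$spc <- df$spc * 2
```

The same direct assignment is what makes `.data@data$spc <- .data@data$spc*2` work.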

eoduniyi commented 4 years ago

This may also be noteworthy: within the dplyr package, transmute is defined in mutate.R. So maybe it makes sense to have both transmute.R and mutate.R, or perhaps just mutate.R, where transmute is also defined.

bryanhanson commented 4 years ago

This diagram helps me a lot (from hyperSpec.pdf vignette). It’s not done too often but yes, R allows a matrix or a lot of things for a “column”. hyperSpec allows any number of additional columns (x and y in the diagram, there can be other columns which give per sample (row) info).

On May 13, 2020, at 6:49 PM, Erick Oduniyi notifications@github.com wrote:

At the moment, transmute.hyperSpec uses eval to handle arithmetic operations on $spc, but other columns in the @data slot with full matrices in the column/row not so much. Initially, the solution for dealing with $spc came out of observing:

dplyr::transmute(.data@data, spc = spc*2) # throws an error
.data@data$spc <- .data@data$spc*2 # works as expected

However, I am a) not sure how one even gets a matrix in a column or b) how to deal with non-spc columns that contain matrices...please help


eoduniyi commented 4 years ago

Okay!

According to our conversation, the generalized mutate/transmute allows and disallows the following:

# Is allowed:
hyperSpec.obj %>%
    mutate(x = y, x2 = y2) # allowed
hyperSpec.obj %>%
    mutate(c = c * 2, c = c * 0) # allowed
hyperSpec.obj %>%
    mutate(y, x, filename, spc2 = spc * 2) # allowed
hyperSpec.obj %>%
    mutate(spc2 = spc * 2) %>%
    mutate(spc2) %>%
    mutate(spc2 * 2) # allowed

# Let a and b be columns with row matrices, then
hyperSpec.obj %>%
    mutate(a = a * 0, a = a * 2, a = a * 3, b) # allowed
hyperSpec.obj %>%
    mutate(a * 0, a * 2, a * 3, b) # allowed

# Is not allowed:
hyperSpec.obj %>%
    mutate(y, x, filename, spc = spc * 2) # not allowed
hyperSpec.obj %>%
    mutate(spc * 2) # not allowed
hyperSpec.obj %>%
    mutate(spc2 = spc * 2) %>%
    mutate(spc) # not allowed

Note: transmute works in these cases as well, except that if $spc is not present in the transmutation, a data frame is returned (e.g., transmute(x, y) # => df).
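A rough sketch of that return-type rule, in base R only. `finish_transmute` and the class `spectra_like` are hypothetical stand-ins invented for this illustration. They are not part of hySpc.dplyr or hyperSpec, and the package's real code may look quite different:

```r
# Hypothetical helper, NOT part of hySpc.dplyr: decide what to return
# after a transmute-like step, based on whether `spc` survived.
finish_transmute <- function(result) {
  if ("spc" %in% names(result)) {
    # Stand-in for rebuilding a hyperSpec object around the data
    structure(list(data = result), class = "spectra_like")
  } else {
    # No spectra matrix left: fall back to a plain data frame
    result
  }
}

df <- data.frame(x = 1:2, y = 3:4)
df$spc <- matrix(1:6, nrow = 2)

with_spc    <- finish_transmute(df[c("x", "spc")])
without_spc <- finish_transmute(df[c("x", "y")])

class(with_spc)     # "spectra_like"
class(without_spc)  # "data.frame"
```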

cbeleites commented 4 years ago

Sorry for the delay. I don't understand why the "not allowed" examples should not be allowed? After all, mutate() will not drop an existing $spc column in these cases.

mutate(-spc) would be different, and I think we may treat that consistently with transmute(), i.e. returning a data frame in that case. (Edit: my error in thinking...)

eoduniyi commented 4 years ago

From my understanding, you should not be able to modify the original $spc, but I see what you're saying: mutate (-spc) or mutate (spc*2) would just add another column.

cbeleites commented 4 years ago

Yes, mutate (-spc) or mutate (spc*2) add new columns, so no problem at all.

mutate (spc = -spc) is also fine, it does a calculation on $spc. One textbook example would be:

spc %>%
   mutate(spc = -log10(spc)) %>%
   setLabels(spc = "A")

for transforming transmission spectra into absorbance.

That's a transformation that is very often needed.
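For illustration, the same calculation on a plain data frame with a matrix column, using base R only. The file names and transmission values are made up, and no hyperSpec object or label handling is involved:

```r
# Made-up transmission values: 2 spectra (rows) x 3 wavelengths (columns)
transmission <- matrix(c(1.0, 0.10, 0.010,
                         0.5, 0.05, 0.005),
                       nrow = 2, byrow = TRUE)

# Hypothetical per-spectrum metadata plus the matrix column
df <- data.frame(filename = c("a.spc", "b.spc"))
df$spc <- transmission

# Absorbance A = -log10(T), applied element-wise to the whole matrix
df$spc <- -log10(df$spc)

df$spc[1, ]  # approximately 0, 1, 2
```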

cbeleites commented 4 years ago

This issue has just become almost a non-issue since dplyr 1.0.0 got published and it natively works with columns containing matrices! 🎉
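A small sketch of what that looks like, assuming dplyr >= 1.0.0 is installed. It uses a plain data frame with illustrative column names, not a hyperSpec object:

```r
library(dplyr)

# Plain data frame with a matrix stored as a single column
df <- data.frame(x = 1:2)
df$spc <- matrix(1:6, nrow = 2)

# With dplyr >= 1.0.0 the matrix column can be used like any other column
out <- df %>%
  mutate(spc = spc * 2)

dim(out$spc)  # still a 2 x 3 matrix
```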

eoduniyi commented 4 years ago


Going to close this then.