Novartis / xgxr

R package for supporting exploratory graphics at http://opensource.nibr.com/xgx

xgx_scale_y_log10 zero values #52

Closed. erbw closed 11 months ago

erbw commented 2 years ago

Describe the bug
Zero values cannot be shown on a log-scale y-axis by passing a pseudo-log transformation to xgx_scale_y_log10 (this can be done with scale_y_continuous). The code generates this error message:
Error in scale_y_continuous(..., trans = log10_trans()) : formal argument "trans" matched by multiple actual arguments

To Reproduce
library(ggplot2)
library(xgxr)

category <- rep(c("a", "b"), 6)
val <- c(0, 0, rlnorm(10))
df <- data.frame(category, val)
ggplot(df, aes(x = category, y = val)) +
  geom_jitter() +
  xgx_scale_y_log10(trans = scales::pseudo_log_trans(base = 10))

Expected behavior
The plot includes the zero values, as it does with scale_y_continuous(trans = scales::pseudo_log_trans(base = 10)).

iamstein commented 2 years ago

If I take your code above and replace xgx_scale_y_log10 with the ggplot2 function scale_y_log10 I get exactly the same error. Therefore, I believe the error is coming from ggplot2 rather than xgxr.

margoal1 commented 2 years ago

The error happens because scale_y_log10 already sets trans to log10 internally, so passing another trans supplies the argument twice. That built-in trans is the whole point of the function, so this is not exactly a bug. I guess the question here is whether we want xgx_scale_y_log10 to have the feature that Edward suggested: plotting zeros in a nice way. We could add a true/false input option for plotting zeros nicely: if false, use the default log10 trans; if true, use the trans Edward suggested.
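The mechanism can be reproduced in base R without ggplot2. This is a minimal sketch: inner_scale and log10_wrapper are hypothetical stand-ins for scale_y_continuous and scale_y_log10, not the real functions.

```r
# Hypothetical stand-in for scale_y_continuous(): takes `trans` by name.
inner_scale <- function(..., trans = "identity") trans

# Hypothetical stand-in for scale_y_log10(): forwards `...` and
# hard-codes its own `trans`.
log10_wrapper <- function(...) inner_scale(..., trans = "log10")

# Works: the wrapper supplies the only `trans`.
ok <- log10_wrapper()

# Fails: `trans` arrives twice, once via `...` and once hard-coded.
msg <- tryCatch(
  log10_wrapper(trans = "pseudo_log"),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Any wrapper that fixes trans this way will reject a user-supplied trans, which is why swapping in the plain ggplot2 function gives the same error.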

iamstein commented 2 years ago

Oh, ok. Hmm. @erbw, could you put code or even just an image of the type of plot you'd like to create, just to help me understand better?

erbw commented 2 years ago

I eventually managed to create the plot I needed with the following steps:

1) Create a function, xgx_breaks_log10_zero, based on xgx_breaks_log10, changing the line

breaks <- labeling::extended(data_min, data_max, n_breaks, Q = preferred_increment)

to

breaks <- labeling::extended(max(data_min,0), data_max, n_breaks, Q = preferred_increment)

2) Use the above function to calculate breaks based on the range of my data:

breaks <- xgx_breaks_log10_zero(range(plotdata$D11, na.rm = TRUE))

3) Replace xgx_scale_y_log10() with:

scale_y_continuous(trans = scales::pseudo_log_trans(base = 10),
                   breaks = breaks,
                   labels = xgx_labels_log10(breaks))
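For readers without the modified break function, here is a self-contained sketch of step 3 using only ggplot2 and scales. The breaks are hand-picked for this simulated data (an assumption, not the output of xgx_breaks_log10_zero), and the default labels are used instead of xgx_labels_log10.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(
  category = rep(c("a", "b"), 6),
  val = c(0, 0, rlnorm(10))
)

# Hand-picked breaks that include zero (illustrative only).
breaks <- c(0, 1, 10)

# Note: recent ggplot2 releases may warn that `trans` is deprecated
# in favour of `transform`; the older name is kept to match the thread.
p <- ggplot(df, aes(x = category, y = val)) +
  geom_jitter(width = 0.2, height = 0) +
  scale_y_continuous(
    trans = scales::pseudo_log_trans(base = 10),
    breaks = breaks
  )

print(p)
```

The zeros are drawn at the bottom of the axis instead of being dropped as non-finite values, which is what a plain log10 scale would do.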

iamstein commented 11 months ago

My understanding is that the pseudo-log transform is most useful when the data can be both positive and negative, can contain extreme values, and you want to compress the range. See https://win-vector.com/2012/03/01/modeling-trick-the-signed-pseudo-logarithm/
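To make that behaviour concrete, here is a small base-R sketch of a signed pseudo-logarithm. The helper pseudo_log is a hypothetical name. Its formula, asinh(x / (2 * sigma)) / log(base), is, as far as I understand, the one behind scales::pseudo_log_trans, but treat that as an assumption.

```r
# Hypothetical helper: signed pseudo-log with scale parameter `sigma`.
pseudo_log <- function(x, sigma = 1, base = 10) {
  asinh(x / (2 * sigma)) / log(base)
}

x <- c(-1000, -1, 0, 1, 1000)
y <- pseudo_log(x)
print(round(y, 3))

# Zero maps to zero, the function is odd (symmetric about the origin),
# and for large |x| it approaches sign(x) * log10(|x|).
print(pseudo_log(0))
print(pseudo_log(1000) - log10(1000))
```

Near zero the transform is roughly linear, so zeros and negative values stay finite, while large magnitudes are compressed much like an ordinary log scale.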

Most assays we use only take positive values or BLOQ, and there are standard ways to handle plotting this data (e.g., plotting BLOQ points at the BLOQ with different colors/symbols), as we do in xGx. Therefore, we don't plan to implement this idea.
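A rough sketch of that convention in plain ggplot2 follows. The limit value lloq, the simulated data, and the column names are all made up for illustration; this is not the xgxr implementation.

```r
library(ggplot2)

lloq <- 0.1  # hypothetical lower limit of quantification

set.seed(1)
df <- data.frame(
  time = rep(c(1, 2, 4, 8), each = 3),
  conc = c(rlnorm(9), 0, 0, 0.05)  # last three values fall below lloq
)

# Flag censored observations and draw them at the limit itself.
df$censored <- df$conc < lloq
df$conc_plot <- pmax(df$conc, lloq)

p <- ggplot(df, aes(x = time, y = conc_plot,
                    color = censored, shape = censored)) +
  geom_point(size = 2) +
  geom_hline(yintercept = lloq, linetype = "dashed") +
  scale_y_log10() +
  labs(y = "Concentration (BLOQ values plotted at the limit)")

print(p)
```

Because every plotted value is strictly positive, an ordinary log10 scale works and no pseudo-log transform is needed.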