Upgrade gating - Githubissues

william-hutchison commented 3 months ago

Update the gate_interactive function to allow colour, shape, size and alpha to be defined by either a vector of values or a fixed value.
Simplify interactive gating backend code.
Improve interactive gating visual interface.
Deprecate superseded gate functions.
Update documentation.

Let me know if there are any issues or if you would like to see any changes. I am satisfied with this as the final iteration of the tidygate upgrade. Please push to CRAN when possible so that I can remove the github remote from tidySpatialExperiment.

william-hutchison commented 3 months ago

In addition to updating the depreciation message, I also switched the gate output to a vector of strings.

william-hutchison commented 3 months ago

gate_programmatic and gate_interactive are no longer exported. They have been replaced with a single function - gate.
For simplicity, additional information about selected points is no longer saved to tidygate_env.

william-hutchison commented 3 months ago

Thank you for the suggestions and spotting these issues.

1 - This is happening because size and alpha coerce the input column to a factor and hp has too many levels for the plot to handle. I have now limited factors for size, shape and alpha to 6 in the documentation and provide an informative error message when outside of this range.

2 - This depreciation warning is caused by my use of an outdated syntax in ggplot2, I have updated the syntax.

3 - I have added a simple message reminding the user that interactively drawn gates are temporarily saved: "tidygate says: interactively drawn gates are temporarily saved to tidygate_env$gates"

4 - I agree that returning a boolean would be more elegant, but it would not be possible to do this separately for each gate within mutate. We could either loose gate information or return some kind of tibble or list of values, but I feel both of these solutions do more harm than good. The currently returned string allows the user to very easy see which points are in which gates, and select points which appear in gates of interest, as demonstrated in the README.

The output is currently "1" if the point is only in gate 1, but expands endlessly to "1,2,4,8" ect capturing each gate the point is in. Following this logic, I think it is best to have a point which appears in no gates as an empty string, as 0 suggests it appears within a gate called 0.

5 - I agree it would be better if the plot launches within the RStudio quadrant interface by default, if running in RStudio. This was strangely the default behaviour for me already so I cannot test, but I have tried to fix this for others. Is the default interface now the RStudio quadrant for you too?

6 - I have added an example of saving and loading gates to the README / vignette.

7 - I have replaced the "gated_programmatic" and "gated_interactive" variable names with "gated" for storing gate function output in the documentation and README / vignette

william-hutchison commented 3 months ago

I have enabled numeric vector input to size and alpha parameters and updated the documentation to match.

Regarding 0s or empty strings, I agree the empty strings do look a bit odd in the tibble, but to me, the 0s are quite confusing. Another option could be to set points appearing in no gates to NA, but this also looks odd in the tibble. To me, the most elegant solution is to leave the strings empty.

stemangiola commented 3 months ago

I have enabled numeric vector input to size and alpha parameters and updated the documentation to match.

Regarding 0s or empty strings, I agree the empty strings do look a bit odd in the tibble, but to me, the 0s are quite confusing. Another option could be to set points appearing in no gates to NA, but this also looks odd in the tibble. To me, the most elegant solution is to leave the strings empty.

I like NA as a solution; it does not look odd and is actually correctly non-applicable, and we don't have any gate for those points.

Plus, if the user converts to numeric, NA preserved it's behaviour, while empty, becomes NA.

william-hutchison commented 3 months ago

Sounds good, I have replaced empty strings with NAs.

stemangiola commented 3 months ago

Thanks,

the command

mtcars_gated <- 
    mtcars |>
    mutate(gated_interactively = gate(x = mpg, y = wt, colour = wt, size =hp))

fails, while it works for the old framework

mtcars_gated <- 
    mtcars |>
    mutate(gated_interactively = gate_chr(.dim1 = mpg,  .dim2 = wt, .color = wt, .size = hp))

Please adapt the old-framework strategy to the argument as much as you can, I think we risk to be reinventing the weal.

Also a comment on the thresholds of the argument (when continuous). ggplot does not impose thresholds for the user, eliminating complexity, and the need to read manual.

tibble(a = 1:10, b = 1:10) |> ggplot(aes(a, b, alpha = a, size = b)) + geom_point()

We should not either. Remember do not expose backend, technical needs to the user. The API should do the heavy lifting not the user.

william-hutchison commented 3 months ago

I mistakenly thought ggplot used the actual values supplied to size and alpha which is why I imposed the limits. I realise now this is not the case and have removed the limits, which caused the error. Thanks for the suggestion. Unfortunately plotly seems to be unable to display the continuous size and alpha legends, so we might have to proceed without these for the moment.

stemangiola commented 3 months ago

I mistakenly thought ggplot used the actual values supplied to size and alpha which is why I imposed the limits. I realise now this is not the case and have removed the limits, which caused the error. Thanks for the suggestion. Unfortunately plotly seems to be unable to display the continuous size and alpha legends, so we might have to proceed without these for the moment.

Sure, legends are not the end of the world.

stemangiola / tidygate

Upgrade gating #27