philips-software / latrend

An R package for clustering longitudinal datasets in a standardized way, providing interfaces to various R packages for longitudinal clustering, and facilitating the rapid implementation and evaluation of new methods
https://philips-software.github.io/latrend/
GNU General Public License v2.0
28 stars, 5 forks

Modifying plot colors #155

Open hichew22 opened 4 months ago

hichew22 commented 4 months ago

Hello,

Previously, when plotting the individual and cluster trajectories, the individual trajectories were black and the cluster trajectories were colored (red, blue, green, etc.), like so:

[image: plot with black individual trajectories and colored cluster trajectories]

However, this is now reversed: the individual trajectories are colored and the cluster trajectories are black (my plot below). How can I revert to the original behavior?

[image: plot with colored individual trajectories and black cluster trajectories]

Thank you!

niekdt commented 4 months ago

I'm thinking of reworking the plotting so that plotClusterTrajectories() consistently draws colored trends. Right now it does different things depending on the options, which is confusing. plot() would then become the adaptive function.

In the meantime, you can use:

library(latrend)
library(ggplot2)

data(latrendData)
method <- lcMethodLMKM(Y ~ Time, id = "Id", time = "Time")
model <- latrend(method, latrendData, nClusters = 3)

# Individual trajectories in the default black, cluster trajectories colored
ggplot() +
    geom_line(data = trajectories(model), aes(x = Time, y = Y, group = Id)) +
    facet_wrap(~ Cluster) +
    geom_line(data = clusterTrajectories(model), aes(x = Time, y = Y, color = Cluster))

image

hichew22 commented 4 months ago

Got it, thank you very much, Niek!

hichew22 commented 3 months ago

Hi Niek, I am trying to add the percentage of each cluster in parentheses in the facet to the plot with black individual trajectories and colored cluster trajectories. What would be the easiest way to do this?

Something like this?

# Extract individual and cluster trajectories
df_kml_model_traj <- trajectories(kml_model_4)
df_kml_model_cluster_traj <- clusterTrajectories(kml_model_4)

# Count each trajectory once (one row per Id), so the percentages
# describe trajectories rather than observations
id_clusters <- unique(df_kml_model_traj[, c("Id", "Cluster")])
cluster_percentages <- prop.table(table(id_clusters$Cluster))
cluster_labels <- sprintf("%s (%.0f%%)", names(cluster_percentages), 100 * as.numeric(cluster_percentages))
# Named vector (cluster name -> facet label) instead of hard-coded "A".."D"
new_labels <- as_labeller(setNames(cluster_labels, names(cluster_percentages)))

ggplot() +
  geom_line(
    data = df_kml_model_traj,
    aes(x = Time, y = Y, group = Id),
    color = "black",
    alpha = 0.5
  ) +
  geom_line(
    data = df_kml_model_cluster_traj,
    aes(x = Time, y = Y, color = Cluster),
    linewidth = 2, # use size = 2 on ggplot2 versions before 3.4.0
    show.legend = FALSE
  ) +
  facet_wrap(~ Cluster,
             labeller = new_labels)

niekdt commented 3 months ago

I'm not familiar with ggplot's labeller. An alternative is to create a new cluster column containing the labels you want (that's how I implemented it in plotClusterTrajectories).

You can use clusLabels = make.clusterPropLabels(clusterNames(kml_model_4), clusterSizes(kml_model_4)) to generate the labels. Then create a new column, remapping the clusters to the labels:

df_kml_model_cluster_traj$ClusterLabel = factor(df_kml_model_cluster_traj$Cluster, levels = clusterNames(kml_model_4), labels = clusLabels)

Then use ClusterLabel as the grouping/facet/color variable in ggplot.

hichew22 commented 3 months ago

When I follow the above, the labels now appear. However, each facet now contains all the individual trajectories. I wonder if this is because df_kml_model_traj does not have the ClusterLabel variable?

niekdt commented 3 months ago

Yes, you'll need to add the identical labels to that data frame too.
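
For anyone landing here later, a minimal sketch of that last step, using toy data frames in place of trajectories(kml_model_4) and clusterTrajectories(kml_model_4). The column names (Id, Time, Y, Cluster) follow the thread; the values and the label format built with sprintf() are made up for illustration. With a real latrend model, the labels would come from make.clusterPropLabels() as suggested above. The key point is that both data frames get a ClusterLabel column built from the same level-to-label mapping, so facet_wrap() can split both layers.

```r
# Toy stand-ins for the two data frames (values are made up for illustration)
traj <- data.frame(
  Id = rep(1:4, each = 3),
  Time = rep(0:2, times = 4),
  Y = c(1, 2, 3, 1.2, 2.1, 3.3, 5, 4, 3, 1.1, 1.9, 2.8),
  Cluster = factor(rep(c("A", "A", "B", "A"), each = 3), levels = c("A", "B"))
)
# Cluster trajectories approximated here as the per-cluster mean at each time
clus_traj <- aggregate(Y ~ Time + Cluster, data = traj, FUN = mean)

# Hypothetical label format: cluster name plus share of trajectories
sizes <- table(unique(traj[, c("Id", "Cluster")])$Cluster)
clus_labels <- sprintf("%s (%.0f%%)", names(sizes), 100 * as.numeric(sizes) / sum(sizes))

# Apply the same level-to-label mapping to BOTH data frames
traj$ClusterLabel <- factor(traj$Cluster, levels = names(sizes), labels = clus_labels)
clus_traj$ClusterLabel <- factor(clus_traj$Cluster, levels = names(sizes), labels = clus_labels)

# Plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot() +
    geom_line(data = traj, aes(x = Time, y = Y, group = Id),
              color = "black", alpha = 0.5) +
    geom_line(data = clus_traj, aes(x = Time, y = Y, color = ClusterLabel),
              show.legend = FALSE) +
    facet_wrap(~ ClusterLabel)
  print(p)
}
```

Because the facet variable exists in both layers, each panel shows only the individual trajectories of its own cluster; a layer whose data lacks the facet variable is repeated in every panel, which matches the behavior reported above.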