corybrunson / ggalluvial

ggplot2 extension for alluvial plots
http://corybrunson.github.io/ggalluvial/
GNU General Public License v3.0

White lines within flow between axes #23

Closed marijn1990 closed 1 year ago

marijn1990 commented 6 years ago

Hi Cory (and others), as discussed on Twitter: I am making alluvial diagrams using ggalluvial but have encountered a problem. Briefly, I am comparing microbiome membership categories per participant between two visits (visit1, visit2). The flows between the two axes, however, show white lines between the different participants, which probably bothers me more than it should ^^. When using tiff at a high resolution (e.g. res=900) this improves a bit, but the lines are still visible on an A4-sized print.

I have attached an example data file for illustration (example_dataset.txt). I used the following code in R:

library(ggplot2)
library(ggalluvial)

# read.delim() already returns a data frame, so no as.data.frame() call is needed
example_dataset <- read.delim("example_dataset.txt", row.names = 1, check.names = FALSE, sep = ",")
value <- c("#000000", "#DC0000", "#FF9600", "#329600", "#000000", "#DC0000", "#FF9600", "#329600")
N <- ggplot(data = example_dataset, aes(axis1 = visit1, axis2 = visit2)) +
     scale_x_discrete(limits=c("visit 1", "visit 2")) +
     geom_alluvium(fill = "darkgrey", na.rm = TRUE) +
     geom_stratum(width = 1/3, fill = value) + 
     labs(y="", x="Event") + 
     theme_light() + theme(panel.grid.major.x = element_blank(), panel.grid.major.y = element_blank(), panel.grid.minor.y =  element_blank())

tiff(filename="example.tiff", width=10, height=8, units="in", res=50)
N
dev.off()

I get this image: [image]

Any idea how to improve this image? As said, increasing the resolution did not improve much. Thank you for your help!

mbojan commented 6 years ago

My intuition is that you have multiple rows such as, say, visit1=orange and visit2=green so that multiple alluvia are drawn.

corybrunson commented 6 years ago

@marijn1990 i've reproduced the behavior, thanks! I don't think the issue is that alluvia are overlapping, which is what i read @mbojan to be suggesting (is that right?). Rather, separate alluvia are being drawn for each row of data, and since they're drawn adjacent to each other a slight gap remains between them.

Using your code, i've found two ways to prevent this. One is to use the aggregate.y parameter in the alluvium layer, i.e. instead of

geom_alluvium(fill = "darkgrey", na.rm = TRUE)

write

geom_alluvium(fill = "darkgrey", na.rm = TRUE, aggregate.y = TRUE)

The other is to make a flow layer instead of an alluvium layer, i.e.

geom_flow(fill = "darkgrey", na.rm = TRUE)

Please give those two changes a try! They have slightly different behavior, but in both cases (for me) the thin white lines go away.
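For anyone who wants to try the flow-layer variant without the attached file, here is a minimal sketch. The `toy` data frame and its values are made up as a stand-in for example_dataset, and the plotting part only runs if ggplot2 and ggalluvial are installed.

```r
# Toy stand-in for example_dataset: one row per participant (values are made up).
toy <- data.frame(
  visit1 = c("A", "A", "A", "B", "B", "B"),
  visit2 = c("A", "A", "B", "B", "B", "A")
)

if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("ggalluvial", quietly = TRUE)) {
  library(ggplot2)
  library(ggalluvial)

  # geom_flow() draws the ribbons between the two axes in place of geom_alluvium()
  p_flow <- ggplot(toy, aes(axis1 = visit1, axis2 = visit2)) +
    scale_x_discrete(limits = c("visit 1", "visit 2")) +
    geom_flow(fill = "darkgrey", na.rm = TRUE) +
    geom_stratum(width = 1/3)
  print(p_flow)
}
```

Swapping `geom_flow()` back to `geom_alluvium()` in the same script should make the difference between the two layers easy to compare side by side.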

mbojan commented 6 years ago

Yep, I meant distinct "parallel" alluvia for separate rows linking the same categories (color rectangles).
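If the per-row alluvia are the cause, another option is to collapse the rows before plotting, so that each visit1/visit2 combination appears once with a count. This is only a sketch under that assumption: `toy`, `counts`, and the column `n` are hypothetical names, and the data are made up.

```r
# Toy stand-in for example_dataset: one row per participant (values are made up).
toy <- data.frame(
  visit1 = c("A", "A", "A", "B", "B", "B"),
  visit2 = c("A", "A", "B", "B", "B", "A")
)

# Give every participant a weight of 1, then sum the weights per combination.
# The result has one row per (visit1, visit2) pair that occurs in the data.
counts <- aggregate(n ~ visit1 + visit2, data = transform(toy, n = 1), FUN = sum)

if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("ggalluvial", quietly = TRUE)) {
  library(ggplot2)
  library(ggalluvial)

  # Map the count to y so each combination is drawn as one alluvium of height n
  p_counts <- ggplot(counts, aes(axis1 = visit1, axis2 = visit2, y = n)) +
    scale_x_discrete(limits = c("visit 1", "visit 2")) +
    geom_alluvium(fill = "darkgrey") +
    geom_stratum(width = 1/3)
  print(p_counts)
}
```

With one row per combination there should be no adjacent per-participant ribbons left to show seams, although the boundaries between different combinations remain.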

marijn1990 commented 6 years ago

That works very well! The only issue that remains is that one line persists at the boundary between visit2=orange and visit2=red, as these are still considered two different alluvia after aggregation, but that can be solved by hand. Thank you very much for your help!

corybrunson commented 6 years ago

@marijn1990 you're right—those are still separate flows, in my implementation. Sorry! I don't have plans to change this, since i personally prefer it, but i'll leave this issue open for now, in case others would support making an option available to change it. (I'd also welcome a pull request.)

marijn1990 commented 6 years ago

@corybrunson I agree, I think there are many cases in which a separation between alluvia is preferable. Thanks again!

modche commented 2 years ago

This helps a lot! Should be in the examples!

corybrunson commented 1 year ago

I lost track of this! Reopening to remind myself to consider a new example.

corybrunson commented 1 year ago

I believe this has been addressed in some stat_alluvium() examples, using the renamed parameter cement.alluvia. Please feel free to reopen this issue if these examples miss something crucial that was resolved here.
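For readers landing here, a minimal sketch of what the renamed parameter might look like in the original call. It assumes a recent ggalluvial version in which cement.alluvia has replaced aggregate.y, and it uses a made-up `toy` data frame as a stand-in for example_dataset.

```r
# Toy stand-in for example_dataset: one row per participant (values are made up).
toy <- data.frame(
  visit1 = c("A", "A", "A", "B", "B", "B"),
  visit2 = c("A", "A", "B", "B", "B", "A")
)

if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("ggalluvial", quietly = TRUE)) {
  library(ggplot2)
  library(ggalluvial)

  # cement.alluvia is passed through geom_alluvium() to stat_alluvium();
  # this assumes a ggalluvial version that already knows the new name.
  p_cement <- ggplot(toy, aes(axis1 = visit1, axis2 = visit2)) +
    scale_x_discrete(limits = c("visit 1", "visit 2")) +
    geom_alluvium(fill = "darkgrey", na.rm = TRUE, cement.alluvia = TRUE) +
    geom_stratum(width = 1/3)
  print(p_cement)
}
```

On an older installation the equivalent call would presumably still use aggregate.y = TRUE, as in the comments above.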