Vindaar / ggplotnim

A port of ggplot2 for Nim
https://vindaar.github.io/ggplotnim
MIT License

[Bug] Missing data points when combining facet_wrap() with both the color and shape aesthetics #180

Closed 18874studentvgu closed 2 weeks ago

18874studentvgu commented 4 weeks ago

How to replicate:

I'm not sure whether the Alpine image is part of the problem, so I'm listing it here. Try adding these steps if you can't reproduce the bug.

Data:

Studentenstatistik_BB_Datensatz.csv (Modified from here)

The problematic code:

import datamancer
import ggplotnim

let df = readCsv("csv/Studentenstatistik_BB_Datensatz.csv", sep = ';')
ggplot(df, aes(y = "Counting", x = "Winter_semester", color = "Subject_group", shape = "Type_of_university")) +
  geom_point() +
  facet_wrap(["Gender"]) +
  ggsave("graph/buggy2.svg", width = 1080.0, height = 720.0)

Output:

[image: buggy2]

Compared to the output of the equivalent plotnine code, the entire green-to-blue color range is gone!

[image: ref2]

Additional information:

I tried to reproduce the bug with other combinations of the three elements, to no avail. The bug only seems to appear when the shape aesthetic, the color aesthetic, and the facet_wrap command are all present in the plotting chain.

I feel like this could need a bit of an investigation.

Vindaar commented 4 weeks ago

Hey!

Thanks for the detailed report. Huh, that is indeed funny. I'll try to look into it soon!

Vindaar commented 4 weeks ago

Ouch, I think it's (at least partially) a nasty bug in Datamancer's arrange for more than 2 columns. Investigating...
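
For anyone who wants to check their installed Datamancer independently of the plot, here is a minimal sketch that sorts a small, made-up table by three columns and compares the result against a plain tuple sort from the standard library. The column names and values are invented for illustration, and it assumes that `toDf`, `arrange` with several column names, and tensor indexing via `df["a", int][i]` are available in your Datamancer version.

```nim
import std/algorithm
import datamancer

# Made-up data: three integer key columns with many ties in the first two,
# so the third column decides the final order.
let a = @[2, 1, 2, 1, 2, 1, 2, 1]
let b = @[1, 1, 0, 0, 1, 1, 0, 0]
let c = @[3, 7, 5, 1, 2, 6, 4, 0]

# Reference result: tuples compare lexicographically, so a plain sort of
# (a, b, c) rows is what a three-column ascending arrange should produce.
var rows: seq[(int, int, int)]
for i in 0 ..< a.len:
  rows.add((a[i], b[i], c[i]))
let expected = rows.sorted()

proc arrangedRows(): seq[(int, int, int)] =
  ## Sorts the same data with Datamancer and reads the rows back out.
  let df = toDf({"a": a, "b": b, "c": c}).arrange("a", "b", "c")
  let ta = df["a", int]
  let tb = df["b", int]
  let tc = df["c", int]
  for i in 0 ..< df.len:
    result.add((ta[i], tb[i], tc[i]))

let got = arrangedRows()
echo "arrange matches reference: ", got == expected
```

If this reports a mismatch, the sorting step is the likely culprit rather than the plotting code.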

Vindaar commented 3 weeks ago

The issue was indeed "just" the arrange bug in Datamancer. Unfortunately right now I'm fighting an ORC / ARC bug, which causes a segfault on (it seems) everything but devel in the new logic... :/

Vindaar commented 3 weeks ago

I'm unsure about what to do right now. I hope that the fix to the regression that is there in Nim 2.0 was already backported and will show up in a new release soon. Fortunately, at least the release of 2.2 should also be very close.

I can merge the Datamancer PR linked above, but given that you are on 2.0.4 it won't actually fix your problem (it'll just make it work by breaking every arrange call). Hmm. If I come up with something better, I'll comment here.

Vindaar commented 3 weeks ago

With Nim 2.0.6 released, I can finally merge the Datamancer PR. Once that is done, please update both Datamancer and Nim (to 2.0.6) and let me know if the plot looks as expected (it does here).

Nim 2.0.6: https://nim-lang.org/blog/2024/06/17/version-206-released.html

Vindaar commented 3 weeks ago

Datamancer version v0.4.5 is tagged now. Please let me know if it all works after updating!

Vindaar commented 2 weeks ago

Here's your plot with Datamancer v0.4.5 and some minor "prettification":

import ggplotnim

const path = "~/CastData/ExternCode/Datamancer/data/Studentenstatistik_BB_Datensatz.csv"

let df = readCsv(path,sep=';')
ggplot(df, aes("Winter_semester", "Counting", color = "Subject_group", shape = "Type_of_university")) +
  geom_point() +
  facet_wrap(["Gender"]) +
  margin(right = 15) +
  scale_x_continuous(breaks = @[2010, 2015]) +
  xlab("Winter semester") +
  discreteLegendWidth(0.5) + discreteLegendHeight(0.5) +
  legendFont(tickFont = font(8.0)) +
  ggsave("/tmp/fixed.pdf", width = 1080, height = 480)

(converted to a PNG for GH):

[image: fixed]

I'm closing the issue for now. Feel free to reopen / open a new issue if you need any help!

Just a minor hint:

If you wish to adjust the way a plot looks, let me point you to the Theme type: https://github.com/Vindaar/ggplotnim/blob/master/src/ggplotnim/ggplot_types.nim#L395-L485

You can either adjust it the way I've done above, by calling individual functions, or you can define a TOML file that contains the Theme adjustments. This example is from a recent poster for which I had to produce some plots (hence the huge fonts and weird colors):

[Theme]
titleFont = "font(18.0)"
labelFont = "font(18.0)"
tickLabelFont = "font(14.0)"
tickLength = 10.0
tickWidth = 2.0
gridLineWidth = 2.0
legendFont = "font(14.0)"
legendTitleFont = "font(18.0, bold = true)"
facetHeaderFont = "font(18.0, alignKind = taCenter)"
baseLabelMargin = 0.5
annotationFont = "font(9.0, family = \"monospace\")"
continuousLegendHeight = 2.2
continuousLegendWidth = 0.5
discreteLegendHeight = 0.6
discreteLegendWidth = 0.6
plotMarginRight = 5.0
plotMarginLeft = 3.0
plotMarginTop = 1.0
plotMarginBottom = 2.5
canvasColor = "#7fa7ce"
baseScale = 1.5

which you then use in your code by passing a:

... + tomlTheme("path_to_file.toml") + ...

into the "plot chain".
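
As a rough end-to-end sketch of that last step: the snippet below writes a small theme file containing a subset of the keys shown above and then uses it in a plot chain. The file names and the data are made up for illustration, and it assumes `tomlTheme` accepts a path relative to the working directory and that PNG output is available in your setup.

```nim
import std/os
import ggplotnim

# Hypothetical file name; the keys are a small subset of the example above.
const themeFile = "poster_theme.toml"
writeFile(themeFile, """
[Theme]
titleFont = "font(18.0)"
tickLength = 10.0
discreteLegendWidth = 0.6
""")

# Made-up data, only there so the plot has something to draw.
let df = toDf({"x": @[1.0, 2.0, 3.0, 4.0],
               "y": @[2.0, 1.0, 4.0, 3.0],
               "kind": @["a", "b", "a", "b"]})

# The theme is just one more element of the plot chain.
ggplot(df, aes("x", "y", color = "kind")) +
  geom_point() +
  ggtitle("Themed plot") +
  tomlTheme(themeFile) +
  ggsave("themed.png")

echo "wrote themed.png: ", fileExists("themed.png")
```

Keeping the theme in a file means the same adjustments can be shared between several plots without repeating the individual function calls.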