oobianom / quickcode

An R package made out of mine and Brice's scrapbook of much needed functions.
https://quickcode.obi.obianom.com

Compare Two Histograms in a Single Plot #9

Closed brichard1638 closed 11 months ago

brichard1638 commented 1 year ago

This function proposes to visually compare two distinct Histograms in a single plot. The idea for this function originated from the following article, found on the R-bloggers site:

https://www.r-bloggers.com/2023/09/histograms-with-two-or-more-variables-in-r/

However, the original function provided in the article has been slightly modified, improving its visual effect. The following information is provided in support of this function:

PROPOSED FUNCTION NAME: compHist

FUNCTION STRUCTURE:
x1 = numeric vector 1
x2 = numeric vector 2
main = "Plot Title Name"

FUNCTION OUTPUT: A Single Plot Consisting of Two Colorized Histogram Plots with an Overlap Color that Joins Them

Ideally, there should be three additional arguments controlling for Histogram color including the overlap color. However, due to the structure of the function, the rgb function used in the col argument ONLY allows for the creation of a single set of colors. Preliminary testing revealed that changing the rgb values completely compromised the ability to control the overlap color that joins the Histograms.

There may be another means by which to control for histogram colorization including the overlap color. The modified skeleton code used in the construction of this function is provided as follows:

set.seed(249)
x1 = rnorm(1000, mean = 0)
x2 = rnorm(1000, mean = 2)

compHist(x1, x2, main = "Histogram of rnorm Distributions With Means 0 & 2")

minx = min(x1, x2)
maxx = max(x1, x2)

hist(x1, main = main, xlab = "", ylab = "", col = rgb(0, 0, 1, alpha = 0.6), xlim = c(minx, maxx))

hist(x2, xlab = "", ylab = "", col = rgb(1, 0, 0, alpha = 0.6), add = TRUE)

legend("topright", legend = c("Mean: 0", "Mean: 2", "Overlap"), fill = c("lightslateblue", "salmon","mediumvioletred"))
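For readers who want to run the skeleton above as a single unit, here is a minimal sketch that wraps it in a function. The name comp_hist_sketch and its return value are assumptions made for illustration; this is not the compHist implementation that ships in quickcode.

```r
# Illustrative wrapper around the skeleton code; NOT the quickcode implementation.
# comp_hist_sketch is a hypothetical name chosen for this example.
comp_hist_sketch <- function(x1, x2, main = "") {
  stopifnot(is.numeric(x1), is.numeric(x2))

  # Shared x-axis range so both histograms fit in the same plot
  xrange <- range(x1, x2)

  # First histogram: semi-transparent blue
  hist(x1, main = main, xlab = "", ylab = "",
       col = rgb(0, 0, 1, alpha = 0.6), xlim = xrange)

  # Second histogram drawn on top: semi-transparent red
  hist(x2, col = rgb(1, 0, 0, alpha = 0.6), add = TRUE)

  # Legend colors are hard-coded approximations of the blended result
  legend("topright",
         legend = c("Mean: 0", "Mean: 2", "Overlap"),
         fill = c("lightslateblue", "salmon", "mediumvioletred"))

  # Return the plotted x-range invisibly so callers can inspect it
  invisible(xrange)
}

set.seed(249)
x1 <- rnorm(1000, mean = 0)
x2 <- rnorm(1000, mean = 2)
res <- comp_hist_sketch(x1, x2,
                        main = "Histogram of rnorm Distributions With Means 0 & 2")
```

Because the legend fill colors are fixed strings, they only match the plot for this particular blue/red pair, which is the limitation described above.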

oobianom commented 1 year ago

Working on development. A bit busy these days, but we will get there.

oobianom commented 1 year ago

I worked on this. Then I started wondering whether it would be good to also give the user the ability to plot the histograms separately before comparing them. What do you think? You can test out the function and let me know. Maybe it defeats the purpose of the function in the first place.

brichard1638 commented 1 year ago

I tested and reviewed the compHist function which I understand is still under development. Adding the separate argument as a binary value really improves the utility of the function. It makes it easier for the user to use and apply in terms of capturing either separate or compared Histograms. Moving forward, I believe this functionality should remain within the function.
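As a rough illustration of what a binary separate switch could do, the sketch below draws the two histograms side by side when separate = TRUE and overlaid otherwise. The function name, the fixed colors, and the returned panel count are assumptions for this example only; the actual compHist signature may differ.

```r
# Hypothetical sketch of a `separate` toggle; not the quickcode implementation.
comp_hist_separate_sketch <- function(x1, x2, separate = FALSE) {
  stopifnot(is.numeric(x1), is.numeric(x2),
            is.logical(separate), length(separate) == 1)

  xrange <- range(x1, x2)

  if (separate) {
    # Two panels side by side; restore the previous layout on exit
    old_par <- par(mfrow = c(1, 2))
    on.exit(par(old_par), add = TRUE)
    hist(x1, main = "x1", col = "blue", xlim = xrange)
    hist(x2, main = "x2", col = "red", xlim = xrange)
  } else {
    # One panel with semi-transparent fills so the overlap stays visible
    hist(x1, main = "x1 vs. x2", xlab = "",
         col = adjustcolor("blue", alpha.f = 0.6), xlim = xrange)
    hist(x2, col = adjustcolor("red", alpha.f = 0.6), add = TRUE)
  }

  # Number of panels drawn, returned invisibly
  invisible(if (separate) 2L else 1L)
}

set.seed(1)
a <- rnorm(200)
b <- rnorm(200, mean = 2)
panels_separate <- comp_hist_separate_sketch(a, b, separate = TRUE)
panels_combined <- comp_hist_separate_sketch(a, b, separate = FALSE)
```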

oobianom commented 1 year ago

Thanks Brice, alright we will leave it this way. I will work further on the documentation for it and you may take a look afterwards.

oobianom commented 1 year ago

This function is all set, Brice! You may take a look before I close the issue

brichard1638 commented 1 year ago

The compHist function has been tested in quickcode version .6

The compHist function passed the plot test referenced by its corresponding documentation.

However, the compHist function failed the plot test when color variables in the color argument are modified. When the colors are changed, only plot legend colors change. The colors defining the Histogram do not change to match the plot's legend colors.

The following code is provided to reproduce the error:

x1 = rnorm(1000, mean = 0)
x2 = rnorm(1000, mean = 2)

library(quickcode)
compHist(
  x1 = x1,
  x2 = x2,
  title = "Histogram of rnorm Distributions with Means 0 & 2",
  color = c("red", "blue", "green")
)

RECOMMENDATION: It may be easier for the user to distinguish the two Histogram colors from the overlap color if they are separate arguments within the function. For example, splitting the color argument into hcol1, hcol2, and overlapCol (or similarly named arguments) would make these color distinctions clearer. It would also constrain how color is used within the function, since providing more than three colors in the original color argument should crash the algorithm.

In addition, providing only one color for each itemized argument provides better configuration guidance for the user.
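A minimal sketch of how one-color-per-argument input could be enforced is shown below. The helper check_single_color is hypothetical and not part of quickcode; it relies only on grDevices::col2rgb, which raises an error for unknown color names.

```r
# Hypothetical validator for arguments such as hcol1 / hcol2; not from quickcode.
check_single_color <- function(col, arg_name) {
  # Exactly one value per color argument
  if (length(col) != 1) {
    stop(sprintf("`%s` must be a single color, got %d values",
                 arg_name, length(col)), call. = FALSE)
  }
  # col2rgb() errors on names or hex strings it cannot interpret
  ok <- tryCatch({
    grDevices::col2rgb(col)
    TRUE
  }, error = function(e) FALSE)
  if (!ok) {
    stop(sprintf("`%s` is not a valid R color: %s",
                 arg_name, as.character(col)), call. = FALSE)
  }
  invisible(col)
}

# Accepted: a single valid color
valid <- check_single_color("red", "hcol1")

# Rejected: three colors passed to one argument
too_many <- tryCatch(check_single_color(c("red", "blue", "green"), "hcol1"),
                     error = function(e) conditionMessage(e))
```

Failing early with a clear message avoids the situation where extra colors silently affect only part of the plot.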

oobianom commented 1 year ago

I have partially addressed this! The overlap color would not be defined by the user, it will be determined based on the overlap between the colors. Take a look when you get a chance.

For the other part, I am leaning towards the recommendation to declare hcol1 and hcol2. However, it may seem like too many arguments for the user to enter.
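One simple way to derive an overlap color from the two histogram colors is to average their RGB channels. The sketch below assumes that approach; mix_two_colors is a hypothetical helper, not the mixing function used inside quickcode, and a plain average only approximates what semi-transparent layers look like on screen.

```r
# Hypothetical helper: average the RGB channels of two colors.
# This is an approximation, not the quickcode implementation.
mix_two_colors <- function(col1, col2) {
  # col2rgb() returns a 3 x 2 integer matrix (rows: red, green, blue)
  rgb_mat <- grDevices::col2rgb(c(col1, col2))
  # Channel-wise mean, rounded to whole 0-255 values
  mixed <- as.integer(round(rowMeans(rgb_mat)))
  sprintf("#%02X%02X%02X", mixed[1], mixed[2], mixed[3])
}

# Example: derive a legend color for the overlap of two histogram colors
overlap_col <- mix_two_colors("#00539C", "#EEA47F")
```

With a helper like this, the legend's overlap entry follows the user's two colors automatically instead of being entered as a third argument.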

brichard1638 commented 1 year ago

I have concluded the tests on the compHist function. This function has passed the unit tests against which it was tested. The following feedback is provided:

oobianom commented 12 months ago

Hey Brice, I separated the color argument. For now, I have left the ylab argument in, but the default is "Frequency" as you suggested above. As for your final point about recommending one to three color pairs, I very much agree. Do you have suggestions for color pairs that we should add?

oobianom commented 12 months ago

I added the following now:

Recommended color pairs
col1 = #00539C (and) col2 = #EEA47F
col1 = brown (and) col2 = beige
col1 = pink (and) col2 = #2F3C7E
col1 = red (and) col2 = yellow
col1 = limegreen (and) col2 = blue
col1 = #990011 (and) col2 = #317773

brichard1638 commented 12 months ago

I successfully retested the compHist plot based on the latest changes made to the quickcode package. The following feedback is provided:

Using the HexToCol function, hex colors referenced in the documentation have been converted to their closest string color in R:

#00539C = dodgerblue4
#EEA47F = darksalmon
#2F3C7E = royalblue4
#990011 = darkred
#317773 = aquamarine4
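For context, a hex-to-name lookup of this kind can be approximated in base R by picking the named color with the smallest squared RGB distance to the hex value. The sketch below assumes that method; nearest_named_color is a hypothetical helper, and its results are not guaranteed to match those of HexToCol.

```r
# Hypothetical nearest-named-color lookup using base R only.
# Not the HexToCol implementation; results may differ from it.
nearest_named_color <- function(hex) {
  named <- grDevices::colors()              # all built-in color names
  named_rgb <- grDevices::col2rgb(named)    # 3 x N matrix of RGB values
  target <- as.vector(grDevices::col2rgb(hex))

  # Squared Euclidean distance in RGB space for every named color
  # (the length-3 target is recycled down each column)
  dist2 <- colSums((named_rgb - target)^2)

  # First name with the smallest distance
  named[which.min(dist2)]
}

# Look up the hex codes from the recommended color pairs
hex_codes <- c("#00539C", "#EEA47F", "#2F3C7E", "#990011", "#317773")
closest <- vapply(hex_codes, nearest_named_color, character(1))
```

Plain RGB distance is a crude measure of perceived similarity, so a perceptual color space may give different matches.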

oobianom commented 11 months ago

Thanks for the review. To your point #1: Yes, I did this on purpose previously, but I can have it not overwrite. To your point #2: Let me see if I can refine the mix.color function further. To your point #3: This is great and I agree, so I will include these changes.

oobianom commented 11 months ago

Updated, Brice!

brichard1638 commented 11 months ago

I tested the compHist function in the latest version of quickcode, version .6. Each of the three bullet points recently cited as issues has been successfully addressed in this latest package version.

The only thing I would add to the documentation for this function, perhaps under the section Some Recommended Color Pairs, is the following statement:

While it's not in your recommended color pair list, the color combination of purple-yellow can also be used to generate a nice plot visualization.

brichard1638 commented 11 months ago

Obi:

Can you publish now? I am looking forward to the .6 version of quickcode!!! Would it be possible to include the archive function assessment in the .7 version of quickcode? I also already have a few new functions ready to be presented for your review in the .7 version.

After .6 is published, I will present them to you.

oobianom commented 11 months ago

Great, Brice! I will add that note. No worries, I will also add purple-yellow.

oobianom commented 11 months ago

Regarding having the archive function assessment in 0.7, that's fine. I eagerly anticipate hearing your ideas! Again, no rush. We will get 0.6 out first sometime next weekend.

Before that, I want to let you know that I also added two functions, 'in.range' and 'seq3', for 0.6. If you have time before next weekend, you can take a look at them both and review. If you have feedback, you may open new issues so we can address them together.