biolab / orange3

🍊 📊 💡 Orange: Interactive data analysis
https://orangedatamining.com

The Box Plot Whiskers Min and Max Do Not Represent Q1 - 1.5 x IQR and Q3 + 1.5 x IQR #6260

Closed rehoyt closed 1 year ago

rehoyt commented 1 year ago

What's wrong?

How can we reproduce the problem?

What's your environment?

[Image: Box Plot JASP, https://user-images.githubusercontent.com/25651922/207962763-a02f234a-c2a8-4d20-af9a-e821dca9d5c0.png]

[Image: BoxPlot LSS]

janezd commented 1 year ago

Should they?

https://en.wikipedia.org/wiki/Box_plot

rehoyt commented 1 year ago

The Wikipedia article does cover box plots that show outliers. In that variant, outliers are the points below Q1 - 1.5 x IQR or above Q3 + 1.5 x IQR, and the whiskers extend only to the most extreme data points inside those limits. I can calculate that manually, but I think it would be useful to have it as part of the Box Plot widget.
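For reference, the manual calculation can be sketched in a few lines of Python. This is only an illustration of the 1.5 x IQR rule, not how the Box Plot widget computes anything. The function name `tukey_fences`, the sample data, and the choice of the "inclusive" quartile method are assumptions made for the example, and other quartile conventions give slightly different limits.

```python
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return (lower_fence, upper_fence, whisker_low, whisker_high, outliers).

    Hypothetical helper for illustration; k=1.5 is the usual multiplier.
    """
    data = sorted(values)
    # Quartiles with linear interpolation; the method is an assumption.
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Whiskers end at the most extreme points that lie inside the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return lower_fence, upper_fence, inside[0], inside[-1], outliers


# Made-up sample with one obviously large value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
lower_fence, upper_fence, whisker_low, whisker_high, outliers = tukey_fences(sample)
print(lower_fence, upper_fence)   # fences: -3.5 14.5
print(whisker_low, whisker_high)  # whisker ends: 1 9
print(outliers)                   # [30]
```

Note that the whisker ends (1 and 9) are actual data points, while the fences (-3.5 and 14.5) are only thresholds, which is why a plot that draws whiskers to the minimum and maximum looks different from one that uses the 1.5 x IQR rule.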

Of course, if you disagree, that's fine.

Bob


--

Robert (Bob) Hoyt MD, FACP, FAMIA, ABPM-CI

Associate Clinical Professor, Department of Internal Medicine

Virginia Commonwealth University

Richmond, VA

CAPT (Ret) USN

InformaticsEducation.org http://InformaticsEducation.org

nocodedatascience.net @.***

Cell: 850-384-5235

QR Code for CV

rehoyt commented 1 year ago

Dave Patrishkoff pointed out that the box plot assumes a normal distribution. https://ai.plainenglish.io/use-adjusted-boxplot-for-skewed-distribution-d1bc0ec25f6d

In the case of the heart disease prediction dataset, the values for males are fairly normally distributed, whereas those for females are not. That is why you would need an "adjusted box plot" to determine the outliers accurately.
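As a rough illustration of what an adjusted box plot changes, the sketch below follows the Hubert and Vandervieren formulation, which rescales the two fences using the medcouple, a robust measure of skewness. It is not necessarily what the linked article or any particular tool implements. The names `medcouple` and `adjusted_fences`, the naive O(n^2) algorithm, the quartile method, and the sample data are all assumptions made for this example.

```python
from math import exp
from statistics import median, quantiles


def medcouple(values):
    """Naive O(n^2) medcouple: robust skewness in [-1, 1], 0 if symmetric."""
    data = sorted(values)
    med = median(data)
    ties = sum(1 for x in data if x == med)
    lows = [x for x in data if x < med] + [med] * ties
    highs = [med] * ties + [x for x in data if x > med]

    kernel = []
    for lo in lows:
        for hi in highs:
            if hi != lo:  # skip pairs where both points equal the median
                kernel.append(((hi - med) - (med - lo)) / (hi - lo))
    # Pairs where both points equal the median get -1, 0 or +1 by index.
    for a in range(ties):
        for b in range(ties):
            s = a + b - (ties - 1)
            kernel.append((s > 0) - (s < 0))
    return median(kernel)


def adjusted_fences(values, k=1.5):
    """Skewness-adjusted fences; equal to the usual ones when medcouple is 0."""
    data = sorted(values)
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    mc = medcouple(data)
    if mc >= 0:  # right-skewed: pull the lower fence in, push the upper out
        low_scale, high_scale = exp(-4 * mc), exp(3 * mc)
    else:        # left-skewed: the mirror image
        low_scale, high_scale = exp(-3 * mc), exp(4 * mc)
    return q1 - k * low_scale * iqr, q3 + k * high_scale * iqr


# Made-up right-skewed sample.
skewed = [1, 2, 4, 8, 16, 32, 64]
mc = medcouple(skewed)
adj_low, adj_high = adjusted_fences(skewed)
print(mc, adj_low, adj_high)
```

For this sample the standard rule gives an upper fence of 55.5 and flags 64 as an outlier, whereas the adjusted upper fence lies well above 64, so the same point is treated as part of the long right tail.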

Bob
