deneb-viz / deneb

Deneb is a custom visual for Microsoft Power BI, which allows developers to use the declarative JSON syntax of the Vega or Vega-Lite languages to create their own data visualizations.
https://deneb-viz.github.io
MIT License

Dataset losing rows #271

Closed gdeckler closed 1 year ago

gdeckler commented 1 year ago

Noticed this while working with the DS - Penguins dataset. If you only include Island, Species, Beak Depth (mm) and Body Mass (g) in your Deneb visual, rows are dropped from the dataset available to Vega/Vega-Lite. For example, there are 2 penguins on the island of Dream that are species Adelie with a Body Mass (g) of 3400 and a Beak Depth (mm) of 17.1. Only 1 of these rows is available in the Deneb visual; the other duplicate row is dropped. This seems very reminiscent of the behavior of Python and R visuals, which essentially de-duplicate rows. Is this intended behavior, or something that is out of your control? In this circumstance it can be avoided by including Index within the visual, but I wanted to check.

dm-p commented 1 year ago

Hi Greg - yes, this is due to how Power BI aggregates to distinct values, and it is out of a visual's control to alter. It's alluded to in the doc, but probably could be better stated. The solution is as you suggest, but it's also one of the neat features of Deneb/R/Python: you can add columns to affect the grain of the dataset, and they don't necessarily need to be "used" by the generated visual specification.
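To make the grain point concrete, here is a rough Python sketch of the effect described above. It is only an analogy and not how Power BI or Deneb is implemented: `distinct_rows` is a hypothetical helper, the `Index` values and the third (Biscoe/Gentoo) row are made up for illustration, and only the two Dream/Adelie rows reflect the values reported in this issue.

```python
# Illustrative rows. Only the Dream/Adelie values come from the issue report;
# the Index values and the Biscoe/Gentoo row are hypothetical.
rows = [
    {"Index": 1, "Island": "Dream", "Species": "Adelie",
     "Beak Depth (mm)": 17.1, "Body Mass (g)": 3400},
    {"Index": 2, "Island": "Dream", "Species": "Adelie",
     "Beak Depth (mm)": 17.1, "Body Mass (g)": 3400},
    {"Index": 3, "Island": "Biscoe", "Species": "Gentoo",
     "Beak Depth (mm)": 15.0, "Body Mass (g)": 5000},
]


def distinct_rows(rows, columns):
    """Keep one row per unique combination of values in `columns`.

    Hypothetical helper that mimics the "distinct values" behaviour
    described in this thread; first-seen order is preserved.
    """
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append({c: row[c] for c in columns})
    return result


measures = ["Island", "Species", "Beak Depth (mm)", "Body Mass (g)"]

# Without Index, the two identical Dream/Adelie rows collapse into one.
without_index = distinct_rows(rows, measures)

# With Index, every row is unique, so nothing is dropped.
with_index = distinct_rows(rows, ["Index"] + measures)

print(len(without_index))  # 2
print(len(with_index))     # 3
```

Adding `Index` changes the grain even if the specification never references it, which matches the workaround mentioned in the original report.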

I'll be reviewing and updating most of the documentation as part of v2, so I'll add this to my list of things to flesh out (or you're welcome to submit a PR against the doc site if you wish to add something suitable in the meantime).

gdeckler commented 1 year ago

Yep, had a feeling that is what it was. It would be nice if there was an option in Power BI to not deduplicate data. Thanks for confirming!
