visdesignlab / upset2

UpSet - Visualizing Intersecting Sets
https://upset.multinet.app/
BSD 3-Clause "New" or "Revised" License

'elementName' property of visible sets, visible subsets and all sets holds slightly different values #279

Closed elizaan closed 8 months ago

elizaan commented 8 months ago

1. "rawData": { "sets": { "Set_Blue_Hair": { "id": "Set_Blue_Hair", "elementName": "Blue Hair", "items": [ "simpsons/40726828", "simpsons/40726834", "simpsons/40726842" ], ...... Here, elementName is "Blue Hair". But,

2. "processedData": { "values": { "Subset_Blue_Hair": { "id": "Subset_Blue_Hair", "elementName": "Blue_Hair", ... Here, elementName is "Blue_Hair".

3. "Subset_School_Blue_Hair Male": { "id": "Subset_School_Blue_Hair Male", "elementName": "School Blue_Hair Male", .... Here, elementName is "School Blue_Hair Male".

4. "accessibleProcessedData": { "values": { "Subset_Blue_Hair": { "elementName": "Just Blue_Hair", "type": "Subset", ...... Here, elementName is "Just Blue_Hair".

5. "accessibleProcessedData": { "values": { "Subset_School_Blue_Hair Male": { "elementName": "School & Blue_Hair & Male", ...... Here, elementName is "School & Blue_Hair & Male".

In the 1st example, "Blue Hair" has no '_' in it. Could that form be kept in every elementName? In the 2nd example, Subset_Blue_Hair is, I believe, the "just Blue Hair" subset from the 4th item; 'Just' is not mentioned there, and that looks fine as the element name, though it could be "Blue Hair".
For 3, elementName could be "School, Blue Hair & Male".
For 4, elementName could be "Blue Hair".
For 5, elementName could be "School, Blue Hair & Male".

elizaan commented 8 months ago

At different stages of the code, we need to read data from all of these keys, and the inconsistency in elementName often leads to wrong statistical information or wrong context being calculated.

Blue Hair is just an example that I found while calculating what percentage of Blue Hair (the smallest set) is present across all of the intersections. The result was 0% because of the name mismatch, which is not the correct value. Overall, I think elementName should be kept consistent for all the other elements as well, so that the alternative text side does not need this kind of extra data processing and can read data easily from the data JSON.
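The mismatch described above could be worked around on the consumer side by comparing names in a canonical form. The following is only a sketch under assumptions: `canonicalSetName` and `subsetContainsSet` are hypothetical helpers, not part of upset2, and they assume that accessible names use " & " as the separator and "Just " as the single-set prefix, as in the excerpts in this issue.

```typescript
// Hypothetical helper (not an upset2 API): reduce a set name to one
// comparable form by dropping a leading "Just " and turning '_' into ' '.
// Caveat: a set whose real name starts with "Just " would be mangled.
function canonicalSetName(name: string): string {
  return name.trim().replace(/^Just\s+/, "").replace(/_/g, " ").trim();
}

// Hypothetical helper: does an accessible elementName such as
// "School & Blue_Hair & Male" include the raw set name "Blue Hair"?
function subsetContainsSet(accessibleElementName: string, rawSetName: string): boolean {
  const target = canonicalSetName(rawSetName);
  return accessibleElementName
    .split("&")
    .map(canonicalSetName)
    .some((n) => n === target);
}
```

With this, "Blue Hair" from rawData matches both "Just Blue_Hair" and "School & Blue_Hair & Male", so a percentage computed over the intersections would no longer come out as 0% purely because of the naming difference.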

JakeWags commented 8 months ago

One thing to note is that any single-set subset in the accessible data should be "Just Blue Hair"; "Just" should always be added to it.
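As a rough illustration of that rule, the accessible name could be built from a subset's member set names as below. This is a sketch under assumptions, not the actual upset2 code: `formatAccessibleName` is a hypothetical function, the " & " separator is taken from the fifth excerpt in this issue, and replacing '_' with a space follows the proposal in this thread rather than confirmed behavior.

```typescript
// Hypothetical formatter (not an upset2 API).
// One member set  -> "Just <name>", per the rule stated above.
// Several members -> names joined with " & ", as in the accessible data excerpt.
// An empty list yields "", since the thread does not say how it should be named.
function formatAccessibleName(setNames: string[]): string {
  // Assumption: underscores are shown as spaces, as proposed in this issue.
  const names = setNames.map((n) => n.replace(/_/g, " "));
  if (names.length === 1) {
    return `Just ${names[0]}`;
  }
  return names.join(" & ");
}
```

For example, `formatAccessibleName(["Blue_Hair"])` gives "Just Blue Hair" and `formatAccessibleName(["School", "Blue_Hair", "Male"])` gives "School & Blue Hair & Male".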