Explanation of "Aggregation of religious groups"

tutebatti commented 2 years ago

In the info text of the map, I find the following sentences:

If in all shown map glyphs no more than four nodes of a lower part in the religion hierarchy would be present, the data is aggregated on that lower level. For example, if the map would only show two glyphs; one with the Latin Church, the Coptic Church, and the Georgian Orthodox Church; and the other with the Latin Church and the Rabbanites; each of these religious denominations could be represented by an individual circle. This will only happen if it is possible in all glyphs, as doing otherwise would skew the perceived variety of religions.

I hardly understand this. Is there something wrong, e.g. with the first phrase in the <em> tag?

mfranke93 commented 2 years ago

Is there something wrong, e.g. with the first phrase in the <em> tag?

Not that I can see, no. But I know how this was designed and was present for all discussions that led to the way it is designed now, so maybe I am "too close" to see the issue. If you can't understand it based on the text, that is a good indication that we should explain it further.

A side note to start with is that the current behavior is based on many iterations, each of which we discussed at length with Dorothea and Ralph. I will try to lay out the timeline of how this evolved into the current version so that you can better understand why it works the way it does now.

The functionality here is based on what the hierarchy of religions looks like:

There is one main rule for the map glyphs: They never have more than four circles!
Originally, this was even more specific: There is one circle for Christianity, one for Islam, one for Judaism. If no religions from one category are part of that glyph, the respective circle is not there. Then, the "other religions" category was added, and there was one more circle: up to four.
However, this then led to less data being shown than was possible: If, for example, only two children of Christianity (e.g., ARM and LAT) were shown, every glyph had only one circle. And that made it much harder to see which of the two, or both, was represented in each glyph. The question came up: Why not show one or two circles in such cases, one for ARM, one for LAT? Of course, the solution needed to be more generic. So in the next iteration, it looked like this: If no more than four individual religions are shown in the glyph, show them as individual circles. Else, aggregate to the parent religions.
However, this in turn led to behavior that gave the wrong impression about the data in some cases: With certain filters, glyphs with 2-4 religions would show these individually (and have 2-4 circles), while glyphs with more than four would aggregate. The extreme case was when, for example, all religions belonging to the Christianity subtree were enabled in the religion filter: glyphs in modern Turkey with only SYR and MELK, or {SYR, ARM, MELK}, would show 2 or 3 circles. Aleppo, where all religions were present, would show only one. It would look like there was less diversity where there was actually more.

This is what led to the current functionality, and what is meant by "If in all shown map glyphs": Instead of deciding individually for each glyph whether it should be aggregated, the decision is made once, for all glyphs together. So, if every glyph can represent its contained religions by ≤4 circles individually, it does so. It does not have to be the same 4 religions in each glyph. However, if even one glyph (like Aleppo) cannot, all glyphs need to aggregate, and the level of aggregation is then the same in each glyph.

The way the aggregation works is that we "cut off" the lowest leaves of the hierarchy (those furthest to the right in the religion hierarchy view), and the religions that are cut off are all collected under their parent, until no more than four religions with data remain in any glyph. Some examples:
1. We have two glyphs in total. One (A) has ARM, SYR, and RAB. The other (B) has ARM, RAB, MELK, and HIN. In sum, there are more than four religions, but in each glyph there are four or fewer. So, A has three circles, B has four, each with the individual religion.
2. We have two glyphs again. A has ARM, SYR, and 12S. B has ARM, RAB, MELK, 5S, and 12S. This time, if all religions were represented individually, B would have five circles. So, the lowest level of hierarchy is eliminated. In this case, 5S and 12S are summarized under SHIA. The other religions are already on that level. Now, A has ARM, SYR, and SHIA. B has ARM, RAB, MELK, and SHIA. So, all glyphs have ≤4 circles, and this is how they are now represented.
3. We have two glyphs again. A has ARM, SYR, and 12S. B has ARM, RAB, MELK, 12S, and SUN. So, in B, there are again more than four. In the first step, we reduce the lowest level of hierarchy, so the Shiites are again aggregated. A now has ARM, SYR, and SHIA. B now has ARM, RAB, MELK, SHIA, and SUN. In this case, B still has more than four religions. So, the next level of hierarchy is reduced as well: A now has CHR and ISL. B now has CHR, ISL, and JEW. And we are now on the most aggregated level, where there cannot be more than four groups.
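
The three examples above can be sketched in a few lines of Python. This is only an illustration of the rule as described, not the actual implementation: the `PARENT` table is reconstructed from the abbreviations used in the examples, and `OTH` (for "other religions"), the placement of HIN under it, and all function names are assumptions.

```python
# Hypothetical parent links, reconstructed from the abbreviations in the
# examples above. OTH ("other religions") and HIN below it are assumptions.
PARENT = {
    "CHR": None, "ISL": None, "JEW": None, "OTH": None,
    "ARM": "CHR", "SYR": "CHR", "MELK": "CHR", "LAT": "CHR",
    "SUN": "ISL", "SHIA": "ISL", "5S": "SHIA", "12S": "SHIA",
    "RAB": "JEW", "HIN": "OTH",
}
MAX_CIRCLES = 4


def depth(religion):
    """Depth of a node in the hierarchy; top-level nodes have depth 1."""
    d = 1
    while PARENT[religion] is not None:
        religion = PARENT[religion]
        d += 1
    return d


def ancestor_at(religion, max_depth):
    """Walk up the hierarchy until the node is at most max_depth deep."""
    while depth(religion) > max_depth:
        religion = PARENT[religion]
    return religion


def aggregate(glyphs):
    """Cut off hierarchy levels from the bottom until every glyph fits.

    The decision is made once for all glyphs, so every glyph ends up on
    the same aggregation level.
    """
    deepest = max((depth(r) for rels in glyphs.values() for r in rels), default=1)
    cut = {}
    for level in range(deepest, 0, -1):
        cut = {
            name: {ancestor_at(r, level) for r in rels}
            for name, rels in glyphs.items()
        }
        if all(len(rels) <= MAX_CIRCLES for rels in cut.values()):
            break
    return cut


# Example 3 from above: two levels have to be cut off.
example_3 = aggregate({
    "A": {"ARM", "SYR", "12S"},
    "B": {"ARM", "RAB", "MELK", "12S", "SUN"},
})
```

Because the loop only looks at hierarchy depth, the same sketch would work unchanged with four or five levels of hierarchy.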

In summary: the objective is to show as much detail as possible, while still being consistent with the level of aggregation.
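
To make the consistency point concrete, here is a small sketch comparing the earlier per-glyph rule with the current all-glyphs rule. It assumes a two-level hierarchy with only Christian groups; `COP` and `GEO` are made-up abbreviations for the Coptic and Georgian Orthodox Church, and the glyph contents are invented for illustration.

```python
MAX_CIRCLES = 4

# Hypothetical two-level hierarchy: every group below is a child of CHR.
# COP and GEO are made-up abbreviations, not taken from the data.
PARENT = {r: "CHR" for r in ("ARM", "SYR", "MELK", "LAT", "COP", "GEO")}


def circles_per_glyph(glyphs):
    """Earlier iteration: each glyph decides on its own whether to aggregate."""
    return {
        name: len(rels) if len(rels) <= MAX_CIRCLES else len({PARENT[r] for r in rels})
        for name, rels in glyphs.items()
    }


def circles_global(glyphs):
    """Current behavior: one decision that applies to all glyphs."""
    fits = all(len(rels) <= MAX_CIRCLES for rels in glyphs.values())
    return {
        name: len(rels) if fits else len({PARENT[r] for r in rels})
        for name, rels in glyphs.items()
    }


glyphs = {
    "Aleppo": {"ARM", "SYR", "MELK", "LAT", "COP", "GEO"},
    "smaller place": {"SYR", "ARM", "MELK"},
}
per_glyph = circles_per_glyph(glyphs)  # smaller place gets more circles than Aleppo
global_rule = circles_global(glyphs)   # both glyphs are on the same level
```

With the per-glyph rule, the place with fewer religions ends up with more circles than Aleppo, which is exactly the misleading impression the global rule avoids.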

I hope that clears up the question, and that you can reformulate the passage to bring it across more clearly to a wider audience.

tutebatti commented 2 years ago

I'm baffled. I probably never used any filters that would lead to that behavior (or I didn't pay attention?). I thought clustering always leads to all religious groups belonging to one general religious affiliation being represented by one circle and one circle only.

I will need to double check the rest of the texts involving clustering to see if the behavior - which you explained nicely - should be mentioned.

mfranke93 commented 2 years ago

Try just filtering by two religious groups of Christianity, and nothing else ;)

tutebatti commented 2 years ago

I could reproduce the behavior easily once I understood what is meant. But it seems I never encountered multiple circles with a cross as part of one glyph...

tutebatti commented 2 years ago

Since the three children of SHIA will be removed, I think it becomes easier to talk of "religious groups" and "general religious affiliation" only, although this is less generic. Cf.

a lower part in the religion hierarchy would be present

mfranke93 commented 2 years ago

Since the three children of SHIA will be removed, I think it becomes easier to talk of "religious groups" and "general religious affiliation" only, although this is less generic. Cf.

a lower part in the religion hierarchy would be present

Well. This is something I did not know would happen, and which is currently not planned as part of the export to DaRUS (and hence, the public version). Maybe this should be discussed with @rpbarczok also, maybe in #3.

Second: The whole functionality is generic to the data, and it would be good to keep the description of the behavior generic as well. Just because in the case of the public version at HU, there are only two hierarchy levels, that doesn't mean this will always be the case. So the description should not be too specific about there only being two levels.

In case this relates to my mention of "groups" here in 4.iii.: What I meant here by "group" is all religions that are collected into one circle.

mfranke93 commented 2 years ago

See also issue 180 @ TIK for previous discussions. As far as I see it, the solution we thought about there is to "hide" the exact affiliation of Shiite evidence for the public version. As far as I know, this is not planned for DaRUS (@rpbarczok ?). So this would be a minor difference that actually exists between these two data versions.

tutebatti commented 2 years ago

This is something I did not know would happen, and which is currently not planned as part of the export to DaRUS (and hence, the public version).

I might be mistaken, but I think their removal is explained in the Prolegomena which should, of course, match the data in DaRUS and the public version. Again, what does @rpbarczok say? :smile:

What I meant here by "group" is all religions that are collected into one circle.

So the behavior is the same for all nodes below the highest level? It is hard to test because there are so few pieces of evidence with the three subgroups of SHIA.

The problem is that the "average humanist" is not used to thinking in categories of multiple hierarchies in this case. At any rate, I suggest that I try to rephrase and then we can correct that if it is not generic enough.

mfranke93 commented 2 years ago

So the behavior is the same for all nodes below the highest level? It is hard to test because there are so few pieces of evidence with the three subgroups of SHIA.

The problem is that the "average humanist" is not used to thinking in categories of multiple hierarchies in this case. At any rate, I suggest that I try to rephrase and then we can correct that if it is not generic enough.

Yes, exactly. It would even work with four, five, ... levels of hierarchy. And yes. This behavior was more obvious (and maybe more needed) when we still had the differentiation between Chalcedon and Non-Chalcedon churches. Old screenshot:

tutebatti commented 2 years ago

See, even in the field of software development, historical inquiry is fundamental to understanding! :stuck_out_tongue_winking_eye:

Joking aside, I have yet another question: As far as I understood, the clustering of map glyphs changes with the zoom level (which makes total sense). Is it correct, though, that the aggregation of religious groups into circles is independent of zoom level? Hence, I get the following despite what we have just discussed: [screenshot]

mfranke93 commented 2 years ago

Is it correct, though, that the aggregation of religious groups into circles is independent of zoom level?

No. The rules I mentioned are re-evaluated on each zoom level. Basically, we can break this down to a hypothetical scenario in which only one map glyph and the religions it contains are solely responsible for the aggregation happening. In reality, there might be more, but the principle stays the same. Then, there are three possibilities:

1a: The glyph represents one place. Zooming in further does not affect which religions are in the glyph, and the overall aggregation level of religions in the map does not change.
1b: The glyph represents multiple places, but one of these has all the religions that are responsible for aggregation. In this case, zooming in so far that the glyph splits up and the responsible place is represented as a single glyph does not affect the end result.
2: The glyph represents multiple places that trigger aggregation if combined, but not if separate. In that case, zooming in so far that the glyph is split up would change the aggregation level. Take a look at this state for an example of this. As soon as you zoom out once in the map, glyphs with five and six religions contained form, and the aggregation level changes.
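
Cases 1b and 2 can be illustrated with a short sketch. It assumes that a glyph simply holds the union of the religions of the places it represents; the place contents and the function names are invented for illustration.

```python
MAX_CIRCLES = 4


def triggers_aggregation(glyphs):
    """True if at least one glyph holds more religions than fit into circles."""
    return any(len(religions) > MAX_CIRCLES for religions in glyphs)


def merge(*places):
    """A glyph for several places holds the union of their religions."""
    return set().union(*places)


# Case 1b: one place alone already exceeds the limit.
big = {"ARM", "SYR", "MELK", "LAT", "RAB"}
small = {"ARM"}
case_1b_zoomed_out = triggers_aggregation([merge(big, small)])
case_1b_zoomed_in = triggers_aggregation([big, small])

# Case 2: the places only exceed the limit when combined into one glyph.
p1 = {"ARM", "SYR", "MELK"}
p2 = {"LAT", "RAB", "SUN"}
case_2_zoomed_out = triggers_aggregation([merge(p1, p2)])
case_2_zoomed_in = triggers_aggregation([p1, p2])
```

In case 1b the result is the same before and after the glyph splits; in case 2 it flips, which is why the aggregation level can change with the zoom level.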

tutebatti commented 2 years ago

Take a look at this state for an example of this.

This link does not work for me, somehow. If I click on it, I'm shown this page: [screenshot]

If I then click on "Erlauben" ("Allow"), this is shown: [screenshot]

At any rate, regarding your explanation and my screenshot from my previous comment, I don't understand why the second map glyph from the left is aggregated. None of the glyphs would have more than four circles. Pardon me, if I'm still not understanding correctly.

mfranke93 commented 2 years ago

It seems like you were not yet logged in. In that case, try uploading the state directly instead: JSON

At any rate, regarding your explanation and my screenshot from my previous comment, I don't understand why the second map glyph from the left is aggregated. None of the glyphs would have more than four circles. Pardon me, if I'm still not understanding correctly.

Can you, in turn, give me a reproducible state for your example? Just the JSON from Settings > Persist State > Save visualization state.

tutebatti commented 2 years ago

It seems like you were not yet logged in.

I was, but maybe there's something wrong with my cookies or browser or whatever... The JSON worked.

Here is my persistent state file (I had to change the file extension to txt because GitHub does not accept json files - how did you attach yours?): filter-for-issue-77_tutebatti.txt

mfranke93 commented 2 years ago

how did you attach yours?

I didn't, for that exact reason. I created a gist and linked to it.

tutebatti commented 2 years ago

Thanks... learning something new every day. :)

mfranke93 commented 2 years ago

As far as I understood, the clustering of map glyphs changes with the zoom level (which makes total sense). Is it correct, though, that the aggregation of religious groups into circles is independent of zoom level? Hence, I get the following despite what we have just discussed:

I see, and I think I understand where the disconnect is now. And this is something fundamental that should probably be explained to visitors as well: This behavior is not limited to the glyphs within the visible area of the map, but applies to all glyphs on that zoom level. So, everything east, west, north and south of the currently visible part of the map is also already populated with glyphs. That is necessary so that they show up when you pan the map, and so that glyph grouping doesn't suddenly change just because another city comes within the map bounds when panning.
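
As a rough sketch of that distinction (the longitudes, glyph contents, and function names below are invented for illustration, and only an east-west range stands in for the map bounds):

```python
MAX_CIRCLES = 4


def must_aggregate(glyphs):
    """Decision over a set of glyphs: does any of them exceed the limit?"""
    return any(len(g["religions"]) > MAX_CIRCLES for g in glyphs)


def visible(glyphs, west, east):
    """Glyphs inside the current map bounds (longitude only, for brevity)."""
    return [g for g in glyphs if west <= g["lon"] <= east]


# Hypothetical glyphs on one zoom level.
glyphs = [
    {"lon": 30.0, "religions": {"SYR", "MELK"}},
    {"lon": 33.0, "religions": {"ARM", "SYR", "MELK"}},
    # Outside the visible area, but still part of the same zoom level:
    {"lon": 60.0, "religions": {"ARM", "SYR", "MELK", "LAT", "RAB", "SUN"}},
]

on_screen = visible(glyphs, west=25.0, east=40.0)
decided_on_all = must_aggregate(glyphs)              # what the map does
decided_on_visible_only = must_aggregate(on_screen)  # what one might expect
```

Deciding on all glyphs means the result does not depend on where the map is currently panned.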

In your example, I could find at least one glyph with 6 religions:

tutebatti commented 2 years ago

I knew there was something missing. I was unclear (not only in how I expressed it, but also in my own thinking about the problem) about "zooming": of course zooming shows a smaller area of the map, but that area can be changed by panning...

tutebatti commented 2 years ago

The good thing about all this is that my ignorance anticipates other newbies to the visualization. :smiley:

tutebatti commented 2 years ago

I'm sorry I have to open this issue up again... but I'm struggling with distinguishing map glyphs "currently shown", i.e. rendered at all but off the current section, and "currently shown", i.e. visible on the screen. Any ideas?

One solution could be to have a separate <p> explaining this behavior, i.e. that the map glyphs are rendered based on zoom and active filters across the whole map.

mfranke93 commented 2 years ago

I would avoid the term "rendered" for everything that is not currently visible. That is not what happens. Rendering happens exactly for those glyphs that are actually visible.

My suggestion would be to, yes, put that in a separate <p>, and to separate out the two parts of the process:

All places are clustered into clusters of evidence.
These clusters are each represented by one glyph. The glyphs are only visible (or "rendered", if you want to use the term, although IMO that is too technical a detail) when they are within the visible area of the map (of course), but their position and content (i.e., also how they look) are predetermined after the clustering is completed, because that is when the aggregation of religions is decided once.

This might be too much detail for visitors, but for your understanding, this is the InfoVis Visualization Pipeline, which is something students learn very early on in our InfoVis lecture. There are multiple steps here, and different processes apply in different parts of the pipeline. What might be relevant here are the last two steps of mapping and rendering. Mapping is where the final representation of the data is decided, and rendering is then only the act of painting that data onto the screen. In InfoVis, we differentiate between geometric zooming and semantic zooming for that exact reason: The former only happens in the last step (the transformation or rendering step), so shapes might get larger or smaller. But semantic zooming can also change the way the data is represented, and therefore also applies to the mapping step. A typical example from the lecture is zooming into an image (geometric), or in Google Maps (semantic, we see more and different detail). Or in our case: semantic zooming, because the evidence is clustered differently, and the glyphs might use different aggregation levels.
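
The difference between the two kinds of zooming can be sketched as follows. The grid-based clustering is only a simple stand-in chosen for this example and is not a claim about how the map actually clusters places; coordinates, radii, and function names are likewise assumptions.

```python
def geometric_zoom(glyphs, factor):
    """Rendering step only: glyphs are drawn larger, their content is untouched."""
    return [{**g, "radius": g["radius"] * factor} for g in glyphs]


def semantic_zoom(places, cell_size):
    """Mapping step: places are re-clustered, so glyph content can change.

    Places falling into the same grid cell are merged into one glyph.
    The grid is a stand-in for whatever clustering is really used.
    """
    clusters = {}
    for p in places:
        key = (int(p["x"] // cell_size), int(p["y"] // cell_size))
        clusters.setdefault(key, set()).update(p["religions"])
    return [
        {"cell": key, "religions": rels, "radius": 10.0}
        for key, rels in sorted(clusters.items())
    ]


# Two hypothetical places close to each other.
places = [
    {"x": 1.0, "y": 1.0, "religions": {"ARM"}},
    {"x": 3.0, "y": 1.0, "religions": {"SYR"}},
]

zoomed_out = semantic_zoom(places, cell_size=10.0)  # both places share one glyph
zoomed_in = semantic_zoom(places, cell_size=2.0)    # each place gets its own glyph
enlarged = geometric_zoom(zoomed_out, factor=2.0)   # same content, drawn larger
```

Only `semantic_zoom` changes which religions end up together in a glyph, which is why the aggregation level has to be re-evaluated on each zoom level.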

tutebatti commented 2 years ago

Very nice explanation! As you can see, I added a commit with several new phrases. I used "to populate" to describe what I understand as mapping in the pipeline you shared. So depending on zoom level and active data, the map is populated with glyphs (i.e., "the glyphs are mapped to the map"?).

mfranke93 commented 2 years ago

As you can see, I added a commit with several new phrases.

Yes. Looks good on a brief overview. I left you two comments :)

UniStuttgart-VISUS / damast

Explanation of "Aggregation of religious groups" #77