Interpretation of results

caimeng2 commented 2 months ago

I am having a difficult time understanding the visualization. Specifically,

1. `partitioner.run()` generates a few plots, but not all of them have a title. Is the first red plot just showing the street network?
2. The second plot "Superblock size rank": If I understand correctly, it is a rank of all the identified Superblocks or areas for potential Superblocks (?). But the size of a block is typically measured in square units, how does one measure size by length in meters? Does the y axis mean perimeter? Does each dot represent a Superblock? According to the plot below, does it mean Okemos has about 210 superblocks? The smallest superblock is about 5 meters in length? That doesn't make sense.
3. The third plot needs to have a title and a legend to explain the colors and the dots.
4. What is the difference between "on full graph" and "for the partitioning"? I tried with four cities and feel the two look the same. What does the "graph" in the title represent? And what does a "node" represent in the right column?
5. "travel distance increase in superblocks." It's "travel distance" from where to where? "increase" from what base? How would you interpret the plot below? Red means more isolated?
6. Similar to 5, could you please interpret the plot below to help me understand?
7. In general, the code runs fine, and I can see the package being useful in many cases. For example, aside from the two use cases listed in the paper, will one be able to compare street layouts of different cities using this package? However, I think the documentation could be more user-friendly. It would be great if you could explain the visualization without jargon. Otherwise, the target audience will need to be "urban planners, researchers in urban studies, data scientists interested in urban data, and policymakers involved in urban development" with knowledge of graph theory.

Reference: https://github.com/openjournals/joss-reviews/issues/6798

cbueth commented 2 months ago

Thank you for your comments and reminders. We have integrated suggestions 1.-6. in commit 26f0ed02b0c2942713b80e8a189ed546c0ea1deb and improved the Usage page in 02c183e2db6f3706909add5a464cf6a31d76b981. Our responses, one by one:

1. `partitioner.run()` generates a few plots, but not all of them have a title. Is the first red plot just showing the street network?

Correct guess. We have added the plot title `f"Street network for {partitioner.name}"`.

2. The second plot "Superblock size rank": If I understand correctly, it is a rank of all the identified Superblocks or areas for potential Superblocks (?). But the size of a block is typically measured in square units, how does one measure size by length in meters? Does the y axis mean perimeter? Does each dot represent a Superblock? According to the plot below, does it mean Okemos has about 210 superblocks? The smallest superblock is about 5 meters in length? That doesn't make sense.

As specified in the paper, this plot concerns the generated Superblock blueprint, which is shown in the following plot. We specified the unit to be "street length (m)" by default. Measuring the Superblock size in number of edges/streets or nodes/intersections is also possible manually, but for urban planners the total street length is most important. The estimated area can also be extracted from the geopackage export using the tessellation.

[Screenshot: Usage page of the superblockify 1.0.0rc10 documentation]

Yes, by this plot Darmstadt has about 210 Superblocks, each represented by one black dot. Some have only a few meters of street because they consist of single streets, e.g. dead ends, connected to the sparse (black) street network. You can see some of them in the plot you gave in (5.). As we are working with a given street network, this is a natural effect. Sometimes similarly small (5-100 m) Superblocks are single streets connecting two parts of the sparse network. They would act as shortcuts for traffic, but are originally urban streets that should not be used for through-traffic.
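
To make the "size by street length" idea concrete, here is a minimal sketch on toy data. The street list, the Superblock labels and the `size_rank` helper are hypothetical and are not part of superblockify's API; the sketch only shows how summing street lengths per Superblock yields a rank plot in which a single dead end appears as a very small Superblock.

```python
# Toy data, not superblockify output: each street (edge) has a length in
# metres and is assigned to one hypothetical Superblock.
streets = [
    ("A", 120.0),
    ("A", 80.0),
    ("B", 5.0),    # a single dead-end street forms a tiny Superblock
    ("C", 300.0),
    ("C", 150.0),
    ("C", 50.0),
]


def size_rank(streets):
    """Return (rank, superblock, total street length), largest first."""
    totals = {}
    for superblock, length_m in streets:
        totals[superblock] = totals.get(superblock, 0.0) + length_m
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(rank, name, total)
            for rank, (name, total) in enumerate(ordered, start=1)]


ranking = size_rank(streets)
for rank, name, total in ranking:
    print(f"rank {rank}: Superblock {name}, {total:.0f} m of streets")
```

Each tuple corresponds to one dot in a rank plot: the x position is the rank and the y value is the total street length.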

3. The third plot needs to have a title and a legend to explain the colors and the dots.

For this plot we added a heading and a legend describing the plot's structure.

[Screenshot: Usage page of the superblockify 1.0.0rc10 documentation]

4. What is the difference between "on full graph" and "for the partitioning"? I tried with four cities and feel the two look the same. What does the "graph" in the title represent? And what does a "node" represent in the right column?

The street network is represented as a "graph", and "nodes" are the intersections of the streets on that graph. To measure the impact of the generated Superblock blueprint, we compare the travel distances (usually time; this can also be distance or the number of intersections passed) before and after imposing the travel restrictions. The histograms show these two distances between all pairs of intersections: first the shortest distances on the full graph, later denoted by $d_S(i,j)$, and then the distances under the restriction that one cannot cross Superblocks other than those containing the start or the destination, later denoted by $d_N(i,j)$. The histograms are expected to be somewhat similar, and would be exactly the same if the restricted distances equalled the unrestricted ones. Only a slight difference means that detouring over the sparse graph adds little, which is a good sign. We selected this format for showing the distances because it is compact and retains the most important information.
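
As an illustration only, the difference between the two distances can be sketched on a toy network in plain Python. The graph, the node names, the Superblock labels and the helper functions below are all hypothetical; this is not how superblockify computes its distances internally, just a small example of "shortest path on the full graph" versus "shortest path that may only use the sparse network and the Superblocks of the start and the destination".

```python
import heapq

# Toy street network (hypothetical): undirected edges with lengths in metres.
edges = [
    ("a1", "b1", 50.0),
    ("b1", "c1", 50.0),
    ("a1", "s1", 40.0),
    ("s1", "s2", 50.0),
    ("s2", "c1", 40.0),
]
# Superblock of each intersection; None marks the sparse network.
superblock = {"a1": "A", "b1": "B", "c1": "C", "s1": None, "s2": None}


def build_adjacency(edges):
    adjacency = {}
    for u, v, length in edges:
        adjacency.setdefault(u, []).append((v, length))
        adjacency.setdefault(v, []).append((u, length))
    return adjacency


def shortest_distance(adjacency, source, target, allowed=None):
    """Dijkstra; if `allowed` is given, only those nodes may be entered."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, length in adjacency.get(node, []):
            if allowed is not None and neighbour not in allowed:
                continue
            candidate = dist + length
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return float("inf")


def restricted_nodes(superblock, source, target):
    """Sparse-network nodes plus the Superblocks of start and destination."""
    own = {superblock[source], superblock[target]} - {None}
    return {n for n, sb in superblock.items() if sb is None or sb in own}


adjacency = build_adjacency(edges)
d_S = shortest_distance(adjacency, "a1", "c1")  # full graph
d_N = shortest_distance(
    adjacency, "a1", "c1", restricted_nodes(superblock, "a1", "c1")
)  # may not cut through Superblock B
print(d_S, d_N, d_N / d_S)
```

In this toy case the unrestricted route cuts through Superblock B, while the restricted route has to detour over the sparse nodes `s1` and `s2`, so $d_N$ is larger than $d_S$ for this pair.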

5. "travel distance increase in superblocks." It's "travel distance" from where to where? "increase" from what base? How would you interpret the plot below? Red means more isolated?

As described before, we measure $d_S(i,j)$ and $d_N(i,j)$ from every place on the graph to every other place on the graph. Taking the ratio $d_N(i,j)/d_S(i,j)$ then gives the factor between the two distances (1.05 means a 5% increase in distance/time). Finally, this plot shows the mean of this value for each Superblock. A redder Superblock has longer distances to the rest of the streets than before. In your example the bottom right Superblock has the highest increase, because the sparse network is not connected in the center south, and the restricted distances $d_N(i,j)$ increase since shortcutting through the neighboring Superblock is no longer allowed. A similar effect can be seen for the large yellowish Superblock. Most importantly, green indicates that for most Superblocks the restrictions imposed by the Superblock blueprint barely affect the shortest path distances, or add only a slight 5% to the travel time.
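
A small worked example may help with reading the colors. The distance values, node names and the `mean_increase_per_superblock` helper below are hypothetical, and averaging over all pairs that start in a Superblock is an assumption made for this sketch; the package's exact aggregation may differ.

```python
# Toy pairwise distances in metres (hypothetical values).
# Keys are (start, destination) intersections; values are (d_S, d_N).
distances = {
    ("a1", "c1"): (100.0, 130.0),
    ("a1", "c2"): (200.0, 210.0),
    ("b1", "c1"): (50.0, 50.0),
    ("b1", "c2"): (80.0, 100.0),
}
# Superblock of each intersection (hypothetical labels).
superblock = {"a1": "A", "b1": "B", "c1": "C", "c2": "C"}


def mean_increase_per_superblock(distances, superblock):
    """Average d_N / d_S over all pairs that start in each Superblock."""
    ratios = {}
    for (start, _destination), (d_s, d_n) in distances.items():
        ratios.setdefault(superblock[start], []).append(d_n / d_s)
    return {sb: sum(values) / len(values) for sb, values in ratios.items()}


increase = mean_increase_per_superblock(distances, superblock)
for name, factor in sorted(increase.items()):
    print(f"Superblock {name}: mean factor {factor:.3f}")
```

A factor close to 1 would be drawn green (almost no detour), while larger factors would be drawn towards red.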

6. Similar to 5, could you please interpret the plot below to help me understand?

This plot works similarly to (5.), but instead of showing the mean of $d_N(i,j)/d_S(i,j)$ for each Superblock, it shows it for each street. This can be important, as $d_N(i,j)/d_S(i,j)$ is not evenly distributed inside a Superblock. In a large Superblock, one side may be strongly affected while another side is hardly affected. This is due to the topological properties of the surrounding sparse network.

7. In general, the code runs fine, and I can see the package being useful in many cases. For example, aside from the two use cases listed in the paper, will one be able to compare street layouts of different cities using this package? However, I think the documentation could be more user-friendly. It would be great if you could explain the visualization without jargon. Otherwise, the target audience will need to be "urban planners, researchers in urban studies, data scientists interested in urban data, and policymakers involved in urban development" with knowledge of graph theory.

Thank you again for the valuable feedback! We have used this package in a large-scale analysis for the thesis cited in the paper, where we applied this tool to 180 cities. A direct feature for comparing two cities is not integrated, but one can compare the generated Superblock blueprints and measures between two cities/street layouts, or compare separate algorithms. Following the API documentation, it is rather simple to add a further approach for generating a blueprint, and the package will then calculate all relevant measures.

To improve the accessibility of superblockify, we added comments on how to read the default output plots to the Usage page, tried to avoid jargon, and added links to other reference pages for the details. We believe this makes the package easier to use.

We hope this addresses your suggestions. Version 1.0.0rc11 has been uploaded, and we will presumably respond to #90 tomorrow.

caimeng2 commented 1 month ago

Thank you for the explanation. Now I understand the package much better.

NERDSITU / superblockify

Interpretation of results #93