Open mRaffill opened 1 year ago
So the main thing I'm still not sure about is how this deals with people who cross through an intersection but in the perpendicular direction to the project. I tried to make a diagram of this (each color is one unique person, traveling through multiple intersections). It seems like people going through only one intersection because they're "crossing" the project, not traveling through it, would be undercounted. It seems to work when there is only one direction. It seems to work for any directions selected, but not for the directions left unselected.
^ Each time, the number of people in the un-selected direction is divided by three from what it really is - because those people are actually each only showing up once in the selected intersections, but then they're divided by three to remove double-counting
Basically, it seems like if there are people going in a direction which isn't selected in the project, they aren't actually counted multiple times, so dividing by the average # of intersections traveled will result in them being undercounted.
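A tiny made-up example of the undercounting described above. The three-intersection corridor, the head counts, and the variable names are all hypothetical and are not taken from the tool:

```python
# Hypothetical corridor project with 3 selected intersections.
# Each tuple is (number of people, intersections each person passes inside the project).
along_project = (4, 3)   # 4 people walk the whole corridor: counted 3 times each
crossing_only = (6, 1)   # 6 people cross perpendicular to it: counted once each

counted_volume = sum(people * hits for people, hits in (along_project, crossing_only))
true_unique = along_project[0] + crossing_only[0]

# Dividing ALL of the counted volume by the average intersections traveled
# (assumed here to be 3, the corridor length) also shrinks the crossing-only group.
avg_intersections = 3
estimated_unique = counted_volume / avg_intersections

print(counted_volume)    # 18
print(true_unique)       # 10
print(estimated_unique)  # 6.0
```

The 6 crossing-only people end up contributing only 2 to the estimate, which is the "divided by three" effect from the diagrams.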
But is the tool already doing something to take this into account? I seem to remember something about people crossing intersections in different directions or something. It doesn't seem to be in the volume-to-miles code, but maybe in the technical documentation somewhere??
So there is something which estimates some proportion of the volume goes through the intersection vs turns in different directions, which is used to add bicycle volumes to intersections and pedestrian volumes to segments:
Bicycle volumes on roadways (links) and pedestrian crossing volumes at intersections (nodes) are estimated directly from the models of existing active travel (see Section 4). Since bicycling volumes are predicted on roadways (links), bicycling volumes at intersections need to be interpolated. The tool assumes that each bicyclist travels through the adjoining intersections and since turn directions are unknown, it assumes half of the roadway volume is expected to cross through the adjoining intersections (i.e., each bicyclist passes through and is counted on two adjoining roadways).
Pedestrian volume is predicted at intersections (nodes). This prediction is of all intersection crossing volumes (but not right turns since pedestrians do not cross the intersection to turn right). Leaving out right turns at intersections may be appropriate for walking since right turning pedestrians have little traffic exposure risk. However, when interpolating pedestrian volume from intersections to adjoining roadways, the tool assumes all pedestrians use two adjoining roadways and so doubles the volume and distributes that volume equally across adjoining roadways.
(from the technical documentation)
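One possible reading of the two interpolation rules quoted above, as a rough sketch. The function names and the sample volumes are made up, and the tool's actual implementation may differ (for example in how a roadway between two intersections combines the contributions from each end):

```python
def bike_volume_at_intersection(adjoining_link_volumes):
    """Half of the summed roadway volumes, on the assumption that each
    bicyclist is counted on two adjoining roadways."""
    return 0.5 * sum(adjoining_link_volumes)


def ped_volume_per_roadway(intersection_volume, n_adjoining_roadways):
    """Double the crossing volume (each pedestrian is assumed to use two
    adjoining roadways) and spread it equally across those roadways."""
    return 2 * intersection_volume / n_adjoining_roadways


# Hypothetical 4-way intersection
print(bike_volume_at_intersection([40, 60, 20, 80]))  # 100.0
print(ped_volume_per_roadway(120, 4))                 # 60.0
```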
So it looks like there are already some assumptions about how people turn at intersections and what proportion of bicyclists/pedestrians travel a certain direction. But I'm not sure whether this is incorporated into the inputs of the volume -> miles calculations?
The volume->miles uses demand, not exposure, which I think means it divides the pedestrian volumes to take into account people crossing an intersection multiple times. But even after that, it seems like there should still be different calculations for the proportion of people who are crossing the intersection and staying in the project vs the proportion who only cross through at that one intersection and then travel outside of the project. Maybe it would be possible to use similar assumptions that all pedestrians cross through two of the connected segments, and then split up the volume based on how many of the connected segments are selected in the project?
(I haven't looked at how this works for segments at all yet - come back to that later)
Ok, I found a document from 2022 in the Box folder "Final_Comments_BC_Tool.docx" which I think clarifies this a bit further
I'm struggling to find where exactly this calculation is done (so I can see what data field it outputs to) - the closest thing I could find was:
I do seem to remember seeing this calculation somewhere either in the benefits documentation or the Box at some point. So maybe I just haven't looked in the right place yet.
Anyway, I can probably figure that out later. The issue I am thinking about more has to do with the miles distribution that this calculation uses to remove double-counted people. The distribution includes distances, but not directions. But it seems like it should only be used to divide the fraction of people who move along the same path as the project and are then counted multiple times.
So that would be something like
Concerns:
But I guess people who go in a different direction would also be walking less in the project (e.g. if you cross at one point on the project and don't travel along it at all, that would be only 1 intersection traveled in the project), so fewer miles traveled? Unless the miles traveled in that direction are also influenced by this new infrastructure and should be counted somehow?
Trying to plot this out, but I'm getting even more confused now. What if people start in the middle of the project? Wouldn't the number of intersections to travel through the entire project then be smaller? (even without looking at directions) It all makes sense until some people start going past the project boundaries in some way or another, and I'm not sure how to capture all the different ways that could happen.
Maybe use the "double and distribute" volume - then find what percentage of volume ends up on un-selected ways. That would give something like:
Intersection 1: 50% exit project
Intersection 2: 33% exit project
Intersection 3: 25% exit project
So this would give the percentage of people who might exit the project at one specific intersection. But there could still be people who travel multiple intersections in the project and then turn in a different direction - so they would still be counted multiple times, just less than expected. It seems like ideally there would be some kind of "exiting project" distribution of what percentage of people travel 1 intersection before turning, 2 intersections before turning, etc., which could then be combined with the overall intersections traveled distribution.
The percentage of people who travel 1 specific intersection and leave at the next specific intersection would be: percentage of people who don't turn at intersection 1 * percentage of people who do turn at intersection 2 ... But how could I get that for all of the intersections and then combine them to an overall percentage?
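One way to chain those per-intersection percentages together is a running product of "didn't leave yet" probabilities. This is only a sketch under strong assumptions that are not in the tool: everyone enters at intersection 1, moves in one direction, and leaves independently at each intersection with the 50% / 33% / 25% chances from the example above. The function name is made up.

```python
from fractions import Fraction


def exit_distribution(exit_probs):
    """Return P(leaving at intersection k) for k = 1..n, plus the probability
    of still being in the project after the last intersection."""
    still_in = Fraction(1)
    leave_at = []
    for p in exit_probs:
        leave_at.append(still_in * p)   # stayed so far, then leaves here
        still_in *= 1 - p               # stayed through this one too
    return leave_at, still_in


exit_probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]
leave_at, stay = exit_distribution(exit_probs)

# Expected intersections traveled in the project; people who never leave
# are counted at all 3 intersections.
expected = sum(k * p for k, p in enumerate(leave_at, start=1)) + len(exit_probs) * stay

print([str(p) for p in leave_at], str(stay))  # ['1/2', '1/6', '1/12'] 1/4
print(expected)                               # 11/6
```

The list is the kind of "exiting project" distribution mentioned above (share of people traveling 1, 2, 3 intersections before turning), and the probabilities sum to 1 together with the share that never leaves.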
There's probably some math or CS technique to solve this, I just don't know what it would be. Maybe I should look online a bit.
Potential solution:
Add a new weight: proportion of unselected ways/total ways
Multiply average intersections by the new weight and then do everything else the same
Compare long, linear projects (less double-counting) vs dense, connected projects (more double-counting)
Select all intersections/ways that touch the project for demand calculations only
Changes how these parameters are calculated in the backend, but it should hopefully not change the results too much
First, I want to experiment with some more hypothetical/made-up situations, to see if this logically makes sense in a few cases where I know the real "answer." I tried this yesterday with two long, linear projects that I made up. I set it up such that everyone only traveled in a straight line and no one turned in a different direction, just to make things simpler to begin with (I do want to try this with people turning in different directions at some point, but that starts getting very complicated for this very simple model).
~~I also tried another method where the percentage of selected links at each intersection is multiplied by intersection volume and added up, then only that percentage of the volume is divided by intersections traveled while the remaining people are not divided by anything. Percentage of people “in project”: (⅓)(5) + (1)(3) + (1)(2) + (1)(2) = 8.67. “Unique”: ((⅓)(5) + (3) + (2) + (2))/3 intersections = 2.89. Other people not double-counted: 12 - ((⅓)(5) + (3) + (2) + (2)) = 3.33. Total: 2.89 + 3.33 = 6.22 people~~ But this doesn’t make sense because then you would also need the different percentages of people who travel partially outside the project but then partially in the project for 1 intersection, 2 intersections, etc., which gets complicated.
The real average intersections that are traveled within the project boundaries (number of intersections in the project * fraction of people): 1(11/14) + 4(1/14) + 3(1/14) + 2(1/14) = 10/7 = 1.428 intersections. So the “weight” multiplied by the original average should have really been (10/7)/(20/7) = 1/2 (seems like a very nice number - I wonder if that is something important)
Maybe this is because only using the percentage in the project treats the percentage out of the project as 0 intersections. But people crossing through will still travel at least 1 intersection in the project (or else they wouldn’t be counted at all). So maybe multiply that by 1 and add to the average.
Percentage of ways in vs out of project: 4/14
Weighted average intersections: (4/14)(20/7) + (10/14)(1) = 1.530 (real average in-project? 1(11/14) + 4(1/14) + 3(1/14) + 2(1/14) = 1.428 intersections)
Unique people: 13.066 ~ 14
This is closer!
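For checking the arithmetic, here is the same straight-line example with the existing average, both adjustment ideas, and the hand-computed real value side by side. The total counted volume of 20 is not stated directly; it is inferred from the 13.066 figure (and from 14 people × 10/7), so treat it as an assumption. Variable names are made up.

```python
from fractions import Fraction

A = Fraction(20, 7)        # existing average intersections traveled
r = Fraction(4, 14)        # share of adjacent ways that are selected
counted_volume = 20        # assumed total count across selected intersections

method_1 = A * r                 # out-of-project share treated as 0 intersections
method_2 = A * r + 1 * (1 - r)   # out-of-project share treated as 1 intersection
real_avg = Fraction(10, 7)       # in-project average worked out by hand

for name, avg in [("existing", A), ("method 1", method_1),
                  ("method 2", method_2), ("real", real_avg)]:
    unique = counted_volume / avg
    print(f"{name}: avg {float(avg):.3f}, unique people {float(unique):.2f}")
# existing: avg 2.857, unique people 7.00
# method 1: avg 0.816, unique people 24.50
# method 2: avg 1.531, unique people 13.07
# real: avg 1.429, unique people 14.00
```

In this example the existing average undercounts, method 1 overshoots because its average drops below 1, and method 2 lands closest to the 14 real people.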
The main issue I see with this is again, people may travel partially outside of the project but then also travel more than 1 intersection in the project... I need to test it with some examples where people do really turn and travel a combination of multiple intersections inside and outside of the project.
Trying for sample 1:
The "real" average traveled inside the project in this case should be: 1(3/6) + 4(1/6)+3(1/6)+2(1/6) = 12/6 = 2 intersections So the “weight” multiplied by the original average should have been: 2/3 (again, very nice round number...?)
With people turning/changing directions (lines showing their paths)
Existing method (direct average):
Method 1 (just multiply by the % of selected links):
Method 2 (multiply average by the % of selected links + 1 * % of deselected links):
The "real" average traveled inside the project in this case should be: 1(8/14) + 4(1/14)+3(1/14)+2(4/14) = 23/14 = 1.643 intersections. Seems fairly close to the result from method 2, just slightly larger because some people are traveling outside of the project but then traveling more than 1 intersection within the project. So the “weight” multiplied by the original average should have been: (23/14)/(20/7) = 0.575 (not as much of a nice round number this time)
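The "real" averages in these made-up samples are all the same calculation: intersections traveled inside the project, weighted by how many people make that trip. A small helper makes it easy to re-check them. The helper and the sample labels are hypothetical names, and the trip counts are copied from the sums written out above:

```python
from fractions import Fraction


def true_in_project_average(trips):
    """trips: list of (intersections traveled inside the project, people)."""
    total_people = sum(people for _, people in trips)
    return sum(Fraction(n * people, total_people) for n, people in trips)


straight_line = [(1, 11), (4, 1), (3, 1), (2, 1)]   # 14 people, nobody turns
sample_1 = [(1, 3), (4, 1), (3, 1), (2, 1)]         # 6 people
with_turns = [(1, 8), (4, 1), (3, 1), (2, 4)]       # 14 people, some turn

print(true_in_project_average(straight_line))  # 10/7
print(true_in_project_average(sample_1))       # 2
print(true_in_project_average(with_turns))     # 23/14

# Implied "weight" relative to the original average of 20/7
print(true_in_project_average(with_turns) / Fraction(20, 7))  # 23/40
```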
For testing with the real project data:
But I can start testing a few projects by manually counting the number of adjacent segments:
1) 64b0406741e08c5dff327a1f Adjacent ways selected: 79/106 = 0.745 (but not including adjacent to intersections that aren't selected) Avg intersections: 5.128 Adjusted avg intersections: 4.077
2) 64921e2f1930d10600997fd9 This one actually doesn't have pedestrian volume for some reason, even though there are some intersections selected??? Anyway, I can at least still look at the average intersections. Adjacent ways selected: 28/41 = 0.682 Avg intersections: 2.776 Adjusted avg intersections: 2.213
1) 645582371c8b985d1be43a07 Adjacent ways selected: 19/41 = 0.463 Avg intersections: 3.546 Adjusted avg intersections: 2.180
2) 64dfb34bcb0d64389a5ea11e Adjacent ways selected: 14/28 = 0.5 Avg intersections: 3.979 Adjusted avg intersections: 2.489
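The adjusted values for these four projects can be reproduced with a few lines. This is a sketch of the manual calculation only (method 2 from the made-up samples), not code from the tool, and the function name is hypothetical:

```python
def adjusted_average(avg_intersections, selected_ways, total_ways):
    """Blend the existing average with 1 intersection, weighted by the
    share of adjacent ways that are selected vs not selected."""
    share_selected = selected_ways / total_ways
    return avg_intersections * share_selected + 1 * (1 - share_selected)


# project id: (avg intersections, adjacent ways selected, total adjacent ways)
projects = {
    "64b0406741e08c5dff327a1f": (5.128, 79, 106),
    "64921e2f1930d10600997fd9": (2.776, 28, 41),
    "645582371c8b985d1be43a07": (3.546, 19, 41),
    "64dfb34bcb0d64389a5ea11e": (3.979, 14, 28),
}

for project_id, (avg, selected, total) in projects.items():
    print(project_id, round(adjusted_average(avg, selected, total), 3))
```

The printed values should match the adjusted averages listed above to within rounding in the last digit.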
Weirdly, the long corridor projects I chose have a higher ratio of selected adjacent ways?
Maybe that's because they have fewer turns and dead ends? Or it could be because these are two-way roads so it is more likely to still be in the project when crossing (crossing to the other side of the road is included as a selected way)? Also these big corridors might have fewer full 4-way intersections, and just fewer intersections in general. Maybe it is more likely that people stay on them because they don't really have opportunities to turn anywhere else?
Maybe what I really need to compare is a similar kind of connected street grid with only one road selected vs an entire connected grid-like project. These differences could be affected by other factors.
Eg. there are many intersections like this:
Where the two-way road means there are 5/7 or 4/6 ways selected, while there would be 2/4 on a road represented by only 1 way. That might be causing the higher ratios for the linear projects (and might actually make sense in that case?).
More (better) projects to test: Some of these have 0 intersections selected (so 0 pedestrian volume) but I can at least test what the new adjustment factors would look like, assuming all of the intersections in the project were selected. (Which is more consistent anyway)
651b01899a0c762a2b50accf / 64addb0641e08c5dff327a16 (basically full grid) 49/67 = 0.73
65454061b5bfdcb11540ec4f (part of grid) 31/55 = 0.56

65284187772d22a2108ddc4c (only one road) 44/78 = 0.56
650c95319a0c762a2b50acbd /130
651714169a0c762a2b50acc7 /__
(Fill in these numbers as I count them)
So the main question this is trying to test is whether the simple ratio of selected ways to total ways is a good (or reasonable) estimate of how "connected" the project is, or how likely people are to stay in the project for most of their travel.
I think it does seem to work for the extreme cases (full grid vs only one road):
I'm not sure how well it works for all the other types of project layouts in between. And there are definitely other issues like:
64de9d4fcb0d64389a5ea118 - disconnected/multiple pieces in the project. People might exit the project in one area and then enter again somewhere else in a different part.
65284648772d22a2108ddc4d - two parallel streets. People crossing through one side might also be likely to cross through the other side as well, but just taking the simple ratio of selected/unselected ways wouldn't capture that.
But it seems to make intuitive sense at least as a kind of very basic "chance of leaving the project"?? And it is not supposed to be a very thorough solution anyway so it seems decent given the simplicity.
Summary of requested change:
The idea is something like: $A \cdot \frac{I_s}{I} + \left(1 - \frac{I_s}{I}\right)$ where
$A$ is the current calculated average
$I_s$ is the number of selected adjacent ways
$I$ is the number of total adjacent ways
The rest of the calculations are all the same, just adjust the average intersections traveled based on the ratio of ways selected in the project.
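Rearranging the expression above (assuming nothing beyond $0 \le I_s \le I$) shows that it is just a weighted average of $A$ and 1:

$$
A \cdot \frac{I_s}{I} + \left(1 - \frac{I_s}{I}\right) = 1 + (A - 1)\,\frac{I_s}{I}
$$

So the adjusted value equals $A$ when every adjacent way is selected ($I_s = I$), equals 1 when none are, and stays between 1 and $A$ whenever $A \ge 1$. An adjusted average below 1 can therefore only come out of this formula if $A$ itself is below 1.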
For bicyclists:
Other than that, does it make sense?
Couldn't you also multiply length of segment * number of bicyclists? That would be the total miles traveled within the project? And the same thing for pedestrians after distributing pedestrian volume to segments?
Oh but that wouldn't work because it wouldn't give the number of miles people travel in total (including outside of the project). So yes we do need to get unique people and then multiply by the original miles traveled distribution, which will include travel outside of the project.
Going into more detail about how this would work in combination with the tool automatically selecting the adjacent ways and intersections:
Pedestrian
Bicycle
(A visual I made to mentally process this. Green is auto-selected ways, orange is auto-selected intersections, yellow is original user-selected)
Before going on vacation I had tested a few projects manually with this new adjustment factor. I just finished setting that up and pushing to github now. Some quick notes from testing the first project (may still be weirdnesses) The adjustment factor here seems to have a huge effect on the average intersections traveled, which might be concerning...? Also the average intersections traveled is ending up as less than 1, which means that the weighted volume is greater than the inputted volume. Is this ok? I need to think about this further. Anyway, it seems like the volume -> miles calculation is not fixed yet.
Currently, the demand (number of people at each segment/intersection) is converted to miles traveled in the project using https://github.com/gautama-bharadwaj/volume_to_miles
I'm not really sure how this works and the explanation in the technical documentation is kind of confusing, so I'm going through the calculations slowly and trying to figure out what is going on.
Trying with some random numbers:
say there is an average of 10 intersections per mile (0.1 miles per intersection) and the distribution is
0.2 miles - 2 intersections - 10% of people
0.3 - 3 - 10%
0.4 - 4 - 50%
0.5 - 5 - 20%
0.6 - 6 - 10%

So then sum of (percentage of people * number of intersections) would be 0.2+0.3+2+1+0.6 = 4.1

So this looks like it gets the weighted average number of intersections each person walks through, weighted by how likely they would be to walk that far.
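The weighted average from these made-up numbers, written out as code. This only mirrors the hand calculation; it is not taken from the volume_to_miles repository, and the variable names are assumptions:

```python
# (miles traveled, intersections crossed, share of people) - made-up distribution
distribution = [
    (0.2, 2, 0.10),
    (0.3, 3, 0.10),
    (0.4, 4, 0.50),
    (0.5, 5, 0.20),
    (0.6, 6, 0.10),
]

# The shares should cover everyone
assert abs(sum(share for _, _, share in distribution) - 1.0) < 1e-9

# Weighted average number of intersections each person passes through
avg_intersections = sum(n * share for _, n, share in distribution)
print(round(avg_intersections, 2))  # 4.1
```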
say there are 100 people total across all of the intersections
100 people / 4.1 intersections ≈ 24.4, so roughly 25
Oh, so then this means that because each person traveled an average of about 4 intersections, they were counted at about 4 intersections in the total volume. That must be what the docs mean: there are really only about 25 "unique" people, each counted an average of about 4 times.
Then use this new volume and the distribution of travel to find the total miles traveled.
(25 people × 10% of people × 0.2 miles traveled) + (25 × 10% × 0.3) + (25 × 50% × 0.4) + (25 × 20% × 0.5) + (25 × 10% × 0.6) = 10.25 miles
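Putting the steps together as one sketch, using the same made-up distribution and 100 counted people. The function name is hypothetical and this is not the code from the volume_to_miles repository, just the hand calculation above without rounding the unique-people figure:

```python
def sketch_volume_to_miles(total_volume, distribution):
    """distribution: list of (miles traveled, intersections crossed, share of people)."""
    avg_intersections = sum(n * share for _, n, share in distribution)
    unique_people = total_volume / avg_intersections    # remove double-counting
    avg_miles = sum(miles * share for miles, _, share in distribution)
    return unique_people, unique_people * avg_miles


distribution = [
    (0.2, 2, 0.10),
    (0.3, 3, 0.10),
    (0.4, 4, 0.50),
    (0.5, 5, 0.20),
    (0.6, 6, 0.10),
]

unique_people, total_miles = sketch_volume_to_miles(100, distribution)
print(round(unique_people, 2))  # 24.39
print(round(total_miles, 2))    # 10.0
```

With the unrounded unique-people figure the total comes out to about 10 miles, which is also 100 counted crossings × 0.1 miles per intersection. That makes a handy sanity check on the made-up numbers.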
My questions:
But overall, this seems like it actually does make sense, it just divides the total volume by how many times each person will be counted to remove double-counting, then uses this adjusted volume to find the total miles traveled. I'll try to write a better explanation to add to the technical documentation.