Test new travel weighting calculations

bicyclingplus / atp-bc-tool-analysis

Analyzing inputs/outputs from the CTC Active Transportation Benefits/Costs tool, to identify and investigate potential issues (analysis notebooks only - input/output data is not included)


Test new travel weighting calculations #7

Open mRaffill opened 7 months ago

mRaffill commented 7 months ago

We just changed travel weighting calculations so that:

Bicycling uses the same equation as walking, instead of just doubling the bicycling volumes
It includes a factor for the proportion of units (ways/intersections) connected to the project that are inside the project versus outside of it, to try to account for people who travel partly outside of the project and then cross or turn onto it

Issues/questions:

The bicycle miles traveled appear very low. Is this a more accurate estimate than before or is it undercounting?
How much of an effect does the second change (the adjacent ways/intersections factor) have? Does this seem like a reasonable representation of people turning/crossing through the project?
Create a more consistent, accurate, automated way (a test dataset) to verify these changes and any future changes
- (Beyond the scope of this issue)

mRaffill commented 7 months ago

So far, I've tested the modified equations over the data I have from Matt, but without the new adjustment factor (I need to ask Matt for those numbers separately).

Notes and graphs:

BMT is definitely lower than with the previous method. WMT is the same, as expected. But BMT has always been significantly less than WMT, I think mostly because the initial volume/demand numbers are already much higher for pedestrians than bicyclists (look at the scale on the two graphs below).

Overall results, bicycling and walking (graphs not included). New Weighted Existing Travel Miles is the new equation, Weighted Existing Travel is the old equation, and Existing Travel is the inputted demand numbers.

The BMT isn't always that low (this might change with the adjustment factor though). There are still projects with BMT in the 200 - 1500 range which is at least not as low as I had expected.

New BMT and WMT, bicycling and walking (graphs not included).

The ratio between the input (volume/demand) and outputted miles traveled looks (almost) identical between modes. Previously, the ratio for BMT was fixed at two (because the calculation was just to double demand), but even with the different travel distribution tables for bicycling and walking, the final ratios appear to end up being the same (not sure yet exactly why). Maybe this means the low BMT is more about the original demand numbers being low to begin with, if the amount the equation scales them by is the same between modes?

Ratio between inputs and outputs, new and old equations, bicycling and walking (graphs not included).

mRaffill commented 7 months ago

I think next I need to ask Matt if he can send the adjacent ways/intersections percentage for each project so I can include that as well. In the meantime, I think I'll experiment with setting the factor to a range of different numbers (probably with one individual project first, and maybe find a way to run through a bunch of values automatically).

mRaffill commented 7 months ago

I already tested one project a bit while setting up the code and noticed a few other things in all the sub-steps I printed out. With respect to:

The ratio between the input (volume/demand) and outputted miles traveled looks (almost) identical between modes.

I'm not sure why the ratio turns out almost exactly equal, but what I did notice is that the pedestrian intersections traveled looks like ~1/2 of the bicycling intersections traveled, and the pedestrian unique people looks like ~2x the bicycling unique people, so maybe these cancel out to the same ratio in the end?

Example (setting an equal initial volume = 200 for both)

Bicycling:

distance per intersection 0.15536037886606563
intersections traveled 7.767662024210243
original volume is 200
unique people is 25.747773187947693

Walking:

distance per intersection 0.15536037886606563
intersections traveled 3.546592792973366
original volume is 200
unique people is 56.39215203849932

These both have the same output = 31.07 miles traveled

mRaffill commented 7 months ago

Actually, reading this again it makes total sense why they're equal. The unique people is just people divided by intersections traveled. And then the unique people is multiplied by total distance, which is just intersections traveled * distance/intersection. I wonder if the equation should use project segments instead of project intersections for bicyclists (because that's where bicyclists are counted), and how that will change the results. That might even be what we said in the requested changes. If so I should update the code and re-test accordingly. Although that may also depend on the changes where we include some adjacent intersections/ways in the project totals.
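A quick numeric check of this cancellation, using the values printed in the example above. The function name and structure are only illustrative (not the tool's code):

```python
def miles_traveled(volume, units_traveled, distance_per_unit):
    """Volume -> unique people -> miles, following the steps described above."""
    # Each person is counted once per intersection they pass through
    unique_people = volume / units_traveled
    # Average miles per person = intersections traveled * distance/intersection
    miles_per_person = units_traveled * distance_per_unit
    # units_traveled cancels, leaving volume * distance_per_unit
    return unique_people * miles_per_person

DISTANCE_PER_INTERSECTION = 0.15536037886606563

bike = miles_traveled(200, 7.767662024210243, DISTANCE_PER_INTERSECTION)
walk = miles_traveled(200, 3.546592792973366, DISTANCE_PER_INTERSECTION)

print(round(bike, 2), round(walk, 2))
```

Because the average intersections traveled appears once in the denominator and once in the numerator, any difference between the bicycling and walking travel distributions drops out when both modes share the same distance per unit.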

dtfitch commented 7 months ago

looks like you are definitely working through this. Thanks! I'm not sure I have anything to add now

(Sent as an email reply to mRaffill's comment of Tue, Jan 23, 2024, 9:45 PM, quoted above.)

mRaffill commented 7 months ago

I wonder if the equation should use project segments instead of project intersections for bicyclists

Checked again, and yes, this is in fact what we said in the requested changes. The segments count looks like it is definitely different (higher in general) than the intersections count for most projects. This ended up changing the results significantly: almost all of the projects dropped to a consistent and very small BMT, except a few projects which still had a few thousand miles (graph not included).

This looks more similar to the current outputs from the tool (this is from the data Matt sent for Peter), although the scale on that graph is different (graph not included).

mRaffill commented 7 months ago

Working step-by-step through one example to see how bicycling and walking results differ (the same example as https://github.com/mRaffill/atp-bc-tool-analysis/issues/7#issuecomment-1907413663).

First, setting both the existing volumes to be equal, so that I can see what the differences in the equation itself are:

Bicycling:

distance per intersection/way 0.10357358591071042
average intersections/ways traveled is 11.651493036315365
adjusted average intersections/ways traveled is 11.651493036315365
original volume is 200
unique people is 17.16518212529846

miles traveled = 20.71

Walking:

distance per intersection/way 0.15536037886606563
average intersections/ways traveled is 3.546592792973366
adjusted average intersections/ways traveled is 3.546592792973366
original volume is 200
unique people is 56.39215203849932

miles traveled = 31.07

The only difference from https://github.com/mRaffill/atp-bc-tool-analysis/issues/7#issuecomment-1907413663 is that now, the distance per unit is lower for bicycling. In general most projects seem to have more segments than intersections, so they have a smaller distance/segment than distance/intersection. As a result, the average units traveled per person is higher and the number of unique people is lower for bicycling.

But then this number of unique people is multiplied by (percentage of people * miles traveled by that percentage), then summed up. This is basically the same thing as the weighted average intersections/ways that we use to find unique people from total volume, except with miles instead of intersections/ways. And the average miles traveled per person is just miles/unit * average number of units traveled per person (again, basically https://github.com/mRaffill/atp-bc-tool-analysis/issues/7#issuecomment-1907416829).

So the final miles = (total people)/(average units) * average units * miles/unit. So the difference between bicycling and walking caused by the equation (without taking into account differences in existing volume) is the miles/unit being different for segments and intersections.

In this example, yes, the ratio between average units traveled for bicycling and walking and the ratio between unique people for bicycling and walking cancel out (3.285 and 1/3.285). The only difference is that the miles/unit for walking is 1.5x as large as for bicycling.

mRaffill commented 7 months ago

But a much bigger cause just seems to be that the actual existing volume for this project is 856 for walking and only 6.90591 for bicycling. Taking into account the existing volumes, the miles traveled is 0.72 for bicycling, 133 for walking.

This ratio between existing volume for bicycling and walking is 123.95 - much higher than the 1.5 from the segments/intersections difference.
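To put the two effects side by side, here is a minimal sketch using the numbers from this example. It relies on the simplification described above (miles = volume * miles/unit, without the adjustment factor); the names are illustrative only:

```python
MILES_PER_SEGMENT = 0.10357358591071042       # bicycling (segments)
MILES_PER_INTERSECTION = 0.15536037886606563  # walking (intersections)

def miles_traveled(volume, miles_per_unit):
    # Average units traveled cancels out, leaving volume * miles/unit
    return volume * miles_per_unit

bike_volume, walk_volume = 6.90591, 856

bike_miles = miles_traveled(bike_volume, MILES_PER_SEGMENT)
walk_miles = miles_traveled(walk_volume, MILES_PER_INTERSECTION)

# Effect of the equation alone vs. effect of the input volumes
equation_ratio = MILES_PER_INTERSECTION / MILES_PER_SEGMENT
volume_ratio = walk_volume / bike_volume

print(f"bike {bike_miles:.2f} mi, walk {walk_miles:.2f} mi")
print(f"equation ratio {equation_ratio:.2f}, volume ratio {volume_ratio:.2f}")
```

The gap between the two modes is then the product of the two ratios, and the volume ratio dominates by roughly two orders of magnitude.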

mRaffill commented 7 months ago

Continuing this example, taking into account the adjustment factor (with constant volume of 200):

I don't have the real adjustment factor (still have to email Matt) but I ran through adjustment factors of 0.1, 0.2, 0.3... to 1.0:

| adjustment factor | Bicycling | Walking |
|---|---|---|
| 0.1 | 116.87 | 87.83 |
| 0.2 | 77.1 | 73.01 |
| 0.3 | 57.53 | 62.47 |
| 0.4 | 45.88 | 54.59 |
| 0.5 | 38.15 | 48.48 |
| 0.6 | 32.66 | 43.59 |
| 0.7 | 28.54 | 39.6 |
| 0.8 | 25.35 | 36.28 |
| 0.9 | 22.8 | 33.48 |
| 1.0 | 20.71 | 31.07 |

So the results do seem to vary pretty significantly based on the adjustment factor. Looks like this is more of an effect for bicycling than walking, I wonder why?

From counting manually, the share of adjacent segments that are selected is 23/52 = 0.44 for bicycling, and 14/31 = 0.45 for walking (about the same).

So with this adjustment factor and a fixed constant volume of 200, the miles traveled becomes:

Bicycling: 42.26 miles
Walking: 51.25 miles

With this adjustment factor and the volume from the tool (about 6.9 bicyclist counts and 856 pedestrian counts), the miles traveled becomes:

Bicycling: 1.46 miles
Walking: 219.37 miles
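The sweep over adjustment factors can be reproduced with a short sketch. It assumes the adjusted average units is `avg_units * in_project + (1 - in_project)`, which matches the numbers reported in this comment; the function and variable names are illustrative, not the tool's:

```python
def adjusted_miles(volume, avg_units, miles_per_unit, in_project):
    """Miles traveled with the in-project adjustment factor applied."""
    # Average number of times each person is counted inside the project
    adjusted_units = avg_units * in_project + (1 - in_project)
    unique_people = volume / adjusted_units
    # Each unique person still travels the full (unadjusted) average distance
    return unique_people * avg_units * miles_per_unit

BIKE = dict(avg_units=11.651493036315365, miles_per_unit=0.10357358591071042)
WALK = dict(avg_units=3.546592792973366, miles_per_unit=0.15536037886606563)

factors = [round(0.1 * i, 1) for i in range(1, 11)]
bike_sweep = [adjusted_miles(200, in_project=f, **BIKE) for f in factors]
walk_sweep = [adjusted_miles(200, in_project=f, **WALK) for f in factors]

for f, b, w in zip(factors, bike_sweep, walk_sweep):
    print(f"{f:.1f}  bike {b:7.2f}  walk {w:7.2f}")
```

Under this formula, the result at a factor of 0.1 relative to the result at 1.0 is `u / (0.1 * u + 0.9)`, where `u` is the average units traveled. That ratio grows with `u`, which may be why bicycling (about 11.7 units) is more sensitive to the factor than walking (about 3.5 units).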

mRaffill commented 7 months ago

Ok, I think I've gotten enough data and graphs to maybe have a better idea of what's going on:

I think the main reason that bike miles traveled is so low is just because the initial volume is very low, and that this number decreases when converting to miles instead of increasing like it used to

Graphs for bicycling and walking (not included)

(Note that the pedestrian miles traveled is also significantly less than the pedestrian existing volume; the difference is the scale of the existing volume.)

With the original equation, bike miles traveled would always be an increase (2x as large) from the existing volume. But with the weighting equation, miles traveled is almost always less than existing volume (for both modes).
So now the already small-ish bike volumes are becoming even smaller when converted to miles, while they used to become somewhat larger

mRaffill commented 7 months ago

Questions/other things I noticed while working through this:

Should the bicycle travel equation use intersections or segments for the distance per unit? The distance per intersection and distance per segment seem to be slightly different because most projects have more segments than intersections (so distance/segment is smaller). And when I switched from intersections to segments the BMT seemed to drop dramatically. I think segments are more accurate to the concept of what the equation is trying to do, but it is probably contributing to the low BMT...

More graphs
Miles per unit for segments and intersections:

| segments | intersections |
|---|---|
| ![image](https://github.com/mRaffill/atp-bc-tool-analysis/assets/53536584/bb571d33-6b26-46d4-9499-87c9d0f7b465) | ![image](https://github.com/mRaffill/atp-bc-tool-analysis/assets/53536584/da2f3b7a-4cad-4f1a-99b4-449bc8574760) |

They're both around the same range (3-3.5) but there are a lot of projects with very low average miles per segment, which doesn't seem to be the case for intersections.
When the average units traveled is adjusted, should we use that adjustment when finding the distance traveled by all of the unique people (at the people percentage distance step)? Basically, do we only want to count the miles that are actually within the project boundaries, or the total distance traveled by everyone who passes through the project?
- If the answer is "total distance," then would it make sense to change this part of the code as well? If the distance bin is greater than the project distance (some percentage of people travel farther than the project boundaries) it currently gets cut off:
```
if (float(dist)>proj_distance):
    distance += proj_distance*miles_distribution[dist]*people
else:
    distance += float(dist)*miles_distribution[dist]*people
```
Does all of this sound reasonable conceptually? I think it all makes sense in terms of numbers and I don't think there are any miscalculations. But does it make sense to apply the same overall process to bikes: calculating the average number of segments traveled, adjust based on how many adjacent segments are in the project, divide by how many times each person is counted in the project, multiply by distance traveled? Or are there any other considerations which may require counting bicyclists and pedestrians differently?
And is it reasonable to have relatively low bike miles traveled, given that the existing bicycle volumes are already low?
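Returning to the cut-off question above (in-project miles only vs. total distance traveled), the two options can be compared with a small sketch. It wraps the same loop in a function with a hypothetical `cap` flag; the distribution here is made up for illustration and is not the tool's table:

```python
def total_distance(people, miles_distribution, proj_distance, cap=True):
    """Sum people-miles over the trip-length bins.

    miles_distribution maps a trip length in miles (string key, as in the
    snippet above) to the share of people making trips of that length.
    """
    distance = 0.0
    for dist, share in miles_distribution.items():
        trip_miles = float(dist)
        if cap:
            # Count only the miles that can fit inside the project
            trip_miles = min(trip_miles, proj_distance)
        distance += trip_miles * share * people
    return distance

# Made-up distribution, for illustration only
example_distribution = {"0.25": 0.4, "0.5": 0.3, "1.0": 0.2, "3.0": 0.1}

capped = total_distance(100, example_distribution, proj_distance=0.75)
uncapped = total_distance(100, example_distribution, proj_distance=0.75, cap=False)
print(capped, uncapped)
```

With `cap=True` the longer trips are truncated at the project length, so the result is an in-project estimate; with `cap=False` it is the total distance traveled by everyone who passes through the project.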

mRaffill commented 7 months ago

I think these are the main general questions/potential options of what to do about the bike miles traveled:

Do we say that it's OK to have this low bike miles traveled if all of the steps to get there are correct?
Do we change the volume -> miles traveled equation for bicycling, if it seems unreasonable to use the same process as walking for bicycling, or if there are important bicycling-specific factors that might be missing?
Do we look into/try to change the demand model where the existing volume (the input to this equation) comes from, if the existing bicycle volumes themselves seem unreasonably low?

mRaffill commented 7 months ago

The simplest form of the equation should be:

average total miles traveled * (counts / (average miles traveled in-project * units per mile))

or

average total miles traveled * (counts * miles per unit / average miles traveled in project)

(these are equivalent)

We can find average total miles, counts, and miles per unit, but the main issue is figuring out how much of the total miles traveled is actually within the project on average.

Initial thoughts/brainstorm: What factors need to be taken into account?

Distribution of miles per trip by mode
People may be on the project for any completely unknown proportion of their total trip. They could start off of the project, end off of the project, turn on to the project for a few blocks and then turn in a different direction, stay fully on the project, cross at only one intersection, etc. There is not really any indication of where each person is traveling just based on trip length.
This seems like it depends on the extent/size of the project - the larger the project is and the more directions it includes, it seems like a larger proportion of people's travel would be within the project
- (I guess this is kind of like the "area" it covers - a longer linear project will mean that the people traveling along that road may spend more of their trip in the project, but it won't necessarily capture turning or crossing in other directions)
- But at the same time no matter how large the project is, some of the people counted could have spent most of the trip outside of the project and only started or ended their trip at the very edges of the project.
Maybe some way to turn all of the counted people in the project into a bunch of trips distributed across the project and surrounding area, then find the average in project?
- But this would require figuring out how to convert people counted into a number of unique trips, which seems like the same problem
I don't know if there are existing models of travel behavior or route choice that could help with this?

mRaffill commented 7 months ago

Miles

I tried arranging a few trip paths with different lengths randomly onto a project. I could make the length within the project be really any value less than the total project length or the total trip length. It does seem like it might depend on the project network, how many segments are selected and how many connected but unselected segments it has

If there were fewer side streets connected to the project, then seems like more of the trips would have to stay on the project (because there would be less opportunities for people to turn off or onto it)
If parts of the side streets were also included in the project, then the average miles traveled within the project for these existing trips would increase
If more of the street the project is on was included in the project, then the average miles traveled for these trips would also increase
But selecting more segments/intersections could also add new trips which could have much lower miles traveled within the project

mRaffill commented 7 months ago

But regardless of how many adjacent streets there are and potential routes to turn on to there are, we don't know how far people travel before entering the project. Even if the project is just one road with no side streets connected to it, people could have started their trip anywhere on that road and travel any number of miles on the project. It would change the possibilities of where trips could be distributed but I'm not sure if it can actually help find anything about the distance traveled in the project.

dtfitch commented 7 months ago

The discussion today in the BicyclingPlus meeting centered around the difference between a recreational path where most of trip distances are on the path, versus scattered treatments in a neighborhood where only a small percent of trip distances are on the new infrastructure. Kari suggested first classifying projects and applying a percent of in-project travel that is feasible. I'm thinking this is something for the future tool. For now, I wonder if we should just use a reasonable share like 25% across the board right now. I realize we don't have much to justify that share, but can we at least implement it as a parameter and play around with the share until we feel like we get reasonable numbers. Then we can flag it for update in the next tool.

mRaffill commented 7 months ago

Thanks for the update!

Ok, so with this approach, the process would be

Trip distance distribution * 25% (or whatever number) -> Distribution of distance in-project -> find weighted average distance in project
Then use the equation (sum of (people * link length)) / (average in-project miles) * average total miles to find the total miles traveled
If any of the average miles in-project are greater than the project length, I guess they should be set at the project length, as the tool currently does for the total miles traveled?

Notes/questions:

With average in-project miles per person = 0.25 * (average total miles per person), the equation will simplify to 4 * (sum of people * link length). Does that sound reasonable? For any other fraction, the equation would be the reciprocal of that fraction times (sum of people * link length).
- Although with the adjustment that caps miles traveled in-project at the project length, it wouldn't be exactly 4x. The ratio would end up larger than 4 as the average miles traveled in-project is reduced. This is more likely for bicycles because they have longer distances in the distance distribution, which are more likely to be over the project length.
Should the percentage of travel in-project be different between modes? Or given that we don't yet have any justification for this, will it make things too complicated?

dtfitch commented 7 months ago

Hmmmm. Surprising that it simplifies to a simple multiplier. Let's walk through this today. So if we change the parameter to 0.5, would that be equivalent to doubling the count*link_len, which is what we have been doing right now? Funny if we end up right back where we started. If that is true, at least we have a little better conceptual understanding of why we are here.

mRaffill commented 7 months ago

Yeah, if the miles traveled in-project depends directly on the total miles traveled, it seems like these cancel out and leave only the constant multiplier. If we say 1/n is the fraction of all travel that is happening on the project, then the total travel, including outside of the project, would be n times the miles traveled in the project.
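A minimal sketch of why the share collapses to a constant multiplier, and of when the project-length cap pushes it above 1/share. The distribution is made up for illustration and the function names are hypothetical:

```python
def average_miles(distribution, share=1.0, proj_length=None):
    """Weighted average trip miles, optionally scaled to an in-project share
    and capped at the project length (all lengths in miles)."""
    total = 0.0
    for dist, pct in distribution.items():
        miles = float(dist) * share
        if proj_length is not None:
            miles = min(miles, proj_length)
        total += pct * miles
    return total

def multiplier(distribution, share, proj_length):
    """Factor applied to sum(people * link length)."""
    in_project = average_miles(distribution, share, proj_length)
    return average_miles(distribution) / in_project

# Made-up distribution, for illustration only
example_distribution = {"0.25": 0.4, "0.5": 0.3, "1.0": 0.2, "3.0": 0.1}

long_project = multiplier(example_distribution, share=0.25, proj_length=10.0)
half_share = multiplier(example_distribution, share=0.5, proj_length=10.0)
short_project = multiplier(example_distribution, share=0.25, proj_length=0.5)
print(long_project, half_share, short_project)
```

When the project is long enough that no bin is capped, the multiplier is just 1/share (4 for 25%, 2 for 50%, which is the original doubling). When the cap applies to some bins, the in-project average shrinks and the multiplier rises above 1/share.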

mRaffill commented 6 months ago

High-level concept

bicycle: average total miles * (counts * link length) / (factor * average total miles, less than project length)
pedestrian: average total miles * (counts * average adjacent link length) / (factor * average total miles, less than project length)

Potentially different bike and ped factor - try 25% bike, 30% walk for now

The general idea of the equation is: average total miles traveled per person * total miles traveled in project/average miles traveled in project per person

Average total miles traveled per person - This is almost exactly the same as the tool currently calculates, a weighted average of % of trips * length for each term of the miles traveled distribution, but without setting any terms in the miles traveled distribution that are greater than the project length to be equal to the project length. This is a constant for all projects- about 0.551 miles for pedestrians, and about 1.941 miles for bicyclists.
Total miles traveled in the project - For each way, multiply the bicycle demand count by the length of the way (the distance that each person counted on that way would travel), then sum across all ways. For each intersection, multiply the pedestrian demand by the average length of adjacent selected ways (the average distance that each person counted at that intersection would travel, within the project), then sum across all intersections.
Average miles traveled in project per person - Also similar to the average miles traveled that the tool currently calculates, but first multiplying each term in the miles traveled distribution by a constant factor (currently experimenting with 25% for bike and 30% for walk). Then if this length is greater than the project length, set it equal to the project length, as the tool currently does. Then calculate the weighted average of this modified miles traveled distribution, the sum of the % of trips * the new adjusted length.

mRaffill commented 6 months ago

This is the code I used to try testing this with the project data I have:

walk_average = 0
bicycle_average = 0

for dist in walk_distribution:
    walk_average += walk_distribution[dist]*float(dist)
for dist in bicycle_distribution:
    bicycle_average += bicycle_distribution[dist]*float(dist)

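def weighted_miles_bike(project_id):
    # Ways (segments) selected for this project
    segments = segments_n[segments_n["Project ID"] == project_id]
    # Combine both conditions in one mask (chained boolean indexing triggers a pandas warning)
    network = reach[(reach["Project ID"] == project_id) & (reach["Type"] == "network")]
    # Project length in miles; assumes one "network" row per project and lengths in feet (hence /5280)
    proj_length = float(network["Total length of segments"].iloc[0]) / 5280
    # Total in-project miles: demand on each way * way length, summed across ways
    total_miles = (segments["Bicycle demand"] * segments["Length"] / 5280).sum()
    # Average in-project miles per person: 25% of each trip length, capped at the project length
    bicycle_average_in_proj = 0
    for dist in bicycle_distribution:
        in_project = min(float(dist) * 0.25, proj_length)
        bicycle_average_in_proj += bicycle_distribution[dist] * in_project
    weighted_miles = bicycle_average * total_miles / bicycle_average_in_proj
    return weighted_miles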

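def weighted_miles_pedestrian(project_id):
    # Filter intersections with their own mask (the original used segments_n here by mistake)
    intersections = intersections_n[intersections_n["Project ID"] == project_id]
    # Combine both conditions in one mask (chained boolean indexing triggers a pandas warning)
    network = reach[(reach["Project ID"] == project_id) & (reach["Type"] == "network")]
    # Project length in miles; assumes one "network" row per project and lengths in feet (hence /5280)
    proj_length = float(network["Total length of segments"].iloc[0]) / 5280
    # Total in-project miles: demand at each intersection * average adjacent selected way length
    total_miles = (intersections["Pedestrian demand"] * (adjacent_selected_ways_average_length / 5280)).sum()
    # Average in-project miles per person: 30% of each trip length, capped at the project length
    walk_average_in_proj = 0
    for dist in walk_distribution:
        in_project = min(float(dist) * 0.30, proj_length)
        walk_average_in_proj += walk_distribution[dist] * in_project
    print("unique people:", total_miles / walk_average_in_proj)
    weighted_miles = walk_average * total_miles / walk_average_in_proj
    return weighted_miles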

(I don't actually have the length of adjacent ways for each intersection so "adjacent_selected_ways_average_length" is just a placeholder)

mRaffill commented 6 months ago

What is the difference from what the tool previously did? How much did the outputs change? Why did they change so much?

Original:

bike: total miles traveled in project * 2
- (where total miles = way demand * way length summed across all ways)
pedestrian: (total demand * average way length) / (weighted average miles traveled / average way length, less than project length) * (weighted average miles traveled / average way length, less than project length) = total demand * average way length
- (where average way length = total project length/number of intersections)
- less than project = any terms in the distribution greater than the project length are set equal to the project length

Iteration 2 (January):

bike: (total miles traveled in project * average way length) / ((weighted average miles traveled / average way length, less than project) * in_project + 1 - in_project) * (weighted average miles traveled / average way length, less than project)
- note: I now notice that the existing (unweighted) demand is already the sum of way length * demand for all ways for bicyclists (which is why I wrote it as total miles traveled in project, not total demand). So multiplying again by average way length was redundant and maybe another reason why the bike miles traveled became so low in this January update.
- (where average way length = total project length/number of ways, and in_project = total selected segments/(total selected + adjacent segments))
pedestrian: (total demand * average way length) / ((weighted average miles traveled / average way length, less than project) * in_project + 1 - in_project) * (weighted average miles traveled / average way length, less than project)
- (where average way length = total project length/number of intersections)

Current version (March):

bike: total miles traveled in project / (weighted average miles traveled * 0.25, less than project length) * weighted average miles traveled
- (where total miles = way demand * way length summed across all ways)
- the output should always be greater than the total miles traveled in project, because weighted average miles traveled > weighted average miles traveled * 0.25 less than project length, so the ratio should always be greater than 1
pedestrian: total miles traveled in project / (weighted average miles traveled * 0.3, less than project length) * weighted average miles traveled
- (where total miles = intersection demand * average adjacent way length summed across all intersections)
- again, the output should always be greater than the total miles traveled in project, because weighted average miles traveled > weighted average miles traveled * 0.3 less than project length, so the ratio should always be greater than 1

Differences (January vs. March):

For bikes - total miles (sum of demand * way length) was multiplied again by the average way length in the Jan version
- effect: Removing this extra multiplication should have increased BMT because the average way length is almost always less than 1.
For pedestrians - total miles is the sum of (volume * average adjacent way length) for all intersections, instead of the sum of volume for all intersections * (project length/number of intersections)
- effect: I think I need to look at some specific projects, but given that many projects have more ways than intersections, the average actual way length may be smaller than the total project length divided by the number of intersections. So in that case, this probably decreased WMT (not sure how significant of a decrease though).
For both modes: removed the in_project adjustments, replaced with a constant multiple instead (0.25 or 0.3)
- So the denominator was originally ((weighted average miles traveled / average way length, less than project) * in_project) + 1 - in_project; now it is weighted average miles traveled * 0.25 or 0.3, less than project
- The original expression is pretty complicated so I'm not sure exactly how this change might affect the outputs. I think I need to try it with an example first.
For both modes: made the weighted miles traveled in the numerator not be limited by the project length
- Effect: for projects with a project length less than 3.125 miles for bicycles, or 0.54 miles for pedestrians, the denominator weighted miles traveled should become smaller than the numerator, so this should increase miles traveled for both modes.

The pedestrian miles traveled has clearly decreased a lot in the most recent iteration from both the original tool outputs and the January modified outputs. The change in the average way length used for pedestrian calculations, and the change in how miles traveled in-project are calculated seem like the only factors that might cause a decrease, everything else seems like it should cause an increase in miles traveled. I probably need to try stepping through the calculations for a few sample projects to see what is really changing.

mRaffill commented 6 months ago

So I tried doing the calculations step-by-step for both versions of the calculation for one sample project.... but weirdly, I got different results than the tool outputted? I got higher pedestrian miles and lower bike miles compared to the tool in both cases (bike seems to be much closer though).

The only difference I could think of is that I haven't added in the volume on the adjacent ways to intersections or adjacent intersections to ways. That might increase the volumes, which could explain higher bike miles, and might also change the adjustment factor for the January version. Or it could be that there was some miscommunication in the implementation for the January version, but the March version got the expected output for the unit test I made so why does it not work here? Or I guess I must have made a mistake in my calculations somewhere?

January

Pedestrian

distance per intersection/way 0.1665484961647727
average intersections/ways traveled is 3.3083456932260438
adjusted average intersections/ways traveled is 1.9646817822437197
original volume is 7452.0
unique people is 2252.4852875134047
adjusted unique people is 3792.9806584197136

miles traveled = 2089.93

actual tool output (WMT): 636.603025

Bicycle

distance per intersection/way 0.07137792692775974
average intersections/ways traveled is 16.256805203524873
adjusted average intersections/ways traveled is 7.375978294010395
original volume is 15.489016
unique people is 0.9527712121839045
adjusted unique people is 2.099927003930819

2.44

actual tool output (BMT): 3.446937

March Pedestrian

total counted people: 7457.692649882349
total miles: 423.0851677714977
unique people: 2559.4988975892184
average miles in-project: 0.16529999999999997
average miles total: 0.5510000000000002

1410.2838925716596

actual tool output (WMT): 153.306127

Bicycle

total counted people: 227.15328890927412
total miles: 16.67117838724351
unique people: 38.368765776231434
average miles in-project: 0.43449868792941265
average miles total: 0.43449868792941265

74.46418218022117

actual tool output (BMT):128.922906
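The printed pedestrian values are consistent with the relationships sketched below. This is only a reconstruction: the formulas are inferred from the printed numbers, not taken from the tool's source, and all variable names are hypothetical.

```python
# January version, pedestrian values copied from the printout above.
jan_distance_per_unit = 0.1665484961647727
jan_avg_units = 3.3083456932260438
jan_adj_avg_units = 1.9646817822437197
jan_volume = 7452.0

# Assumed relationship: each person is counted once per unit they pass through,
# so dividing the volume by the average units traveled gives unique people.
jan_unique = jan_volume / jan_avg_units
jan_adj_unique = jan_volume / jan_adj_avg_units

# Assumed relationship: miles = adjusted unique people * total distance per person.
jan_miles = jan_adj_unique * jan_avg_units * jan_distance_per_unit

# March version, pedestrian values copied from the printout above.
mar_total_miles = 423.0851677714977
mar_avg_miles_in_project = 0.16529999999999997
mar_avg_miles_total = 0.5510000000000002

# Assumed relationship: unique people = total miles / average in-project miles.
mar_unique = mar_total_miles / mar_avg_miles_in_project
mar_miles = mar_unique * mar_avg_miles_total

print(f"January: unique={jan_unique:.2f}, adjusted unique={jan_adj_unique:.2f}, miles={jan_miles:.2f}")
print(f"March:   unique={mar_unique:.2f}, miles={mar_miles:.2f}")
```

Under these assumptions the pedestrian figures for both versions can be reproduced from the printed intermediate values alone, which makes it easier to see which input drives the difference.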

mRaffill commented 6 months ago

Regardless, the pedestrian miles traveled did decrease between these iterations even with my calculation results. I think the reason may be something like:

- The Jan version uses total project length / number of intersections as the distance/unit, while the March version uses the average adjacent segment length. The Jan value turns out to be much larger for this project: the average distance/unit is ~0.1665 miles for Jan and ~0.0567 miles for March, so Jan is about 3x as large as March.
- In the Jan version, the adjusted average units traveled in-project is about 2/3 of the original average units traveled. The March version multiplies the average miles traveled per person by 0.3 (about 1/3) to get the in-project miles. So the miles traveled in-project per person are about 2x as large in the Jan version as in the March version, which would decrease total miles traveled in the Jan version compared to the March version.
  - And projects with a smaller in-project distance in the Jan version than the March version would have even larger miles traveled in the Jan version. I would guess this would mostly be projects in places with very interconnected street networks and many ways/intersections adjacent to the project that are unselected.

Both equations can be simplified to approximately:

(volume × (distance/unit) / average distance in-project) × average distance total

Ignoring the terms that are the same in both, the Jan version is (3x larger distance per unit) / (2x larger distance in-project), so it should end up with miles traveled about 3/2 as large as the March version. That is about what happened (in my calculations, not the tool outputs): the March result (~1410) is very close to 2/3 of the Jan result (~2089).
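As a rough check of that ratio, the simplified form can be evaluated with the pedestrian values printed earlier in this thread. This is a sketch only: `approx_miles` is a hypothetical helper, and treating the March distance/unit as total miles / total counted people is an assumption inferred from the printed numbers.

```python
def approx_miles(volume, distance_per_unit, in_project_distance, total_distance):
    """Simplified form: (volume * distance/unit / in-project distance) * total distance."""
    return volume * distance_per_unit / in_project_distance * total_distance


# January: in-project distance = adjusted average units * distance per unit.
jan_distance_per_unit = 0.1665484961647727
jan_in_project = 1.9646817822437197 * jan_distance_per_unit
jan_total = 3.3083456932260438 * jan_distance_per_unit
jan = approx_miles(7452.0, jan_distance_per_unit, jan_in_project, jan_total)

# March: distance/unit assumed to be total miles / total counted people (~0.0567).
mar_volume = 7457.692649882349
mar_distance_per_unit = 423.0851677714977 / mar_volume
mar = approx_miles(mar_volume, mar_distance_per_unit, 0.16529999999999997, 0.5510000000000002)

distance_ratio = jan_distance_per_unit / mar_distance_per_unit   # roughly 3x
in_project_ratio = jan_in_project / 0.16529999999999997          # roughly 2x

print(f"Jan ~ {jan:.1f}, March ~ {mar:.1f}, March/Jan = {mar / jan:.3f}")
print(f"distance/unit ratio = {distance_ratio:.2f}, in-project ratio = {in_project_ratio:.2f}")
```

The March/Jan ratio lands close to 2/3, in line with the (3x) / (2x) reasoning.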

Projects where the # of intersections is much smaller than the # of segments would be likely to have a higher distance/unit when it is calculated as total distance/number of intersections compared to the average segment length. Which looks like a lot of projects:

mRaffill commented 6 months ago

I think this current calculation with the average distance of adjacent ways instead of total project distance/number of intersections is much more accurate. It makes more sense that people counted at one intersection are going to travel at least the distance of one of the ways connecting to the intersection, and not necessarily travel the distance between one intersection and the next (when there are more unselected intersections in between). I'm not sure about the in-project ratio, I don't really have a way to calculate that for every project so I can't really see how much of an impact that had. But I don't think the new estimate for in-project is more unreasonable than the Jan version (and it is definitely way simpler).

So basically I think it is fine and probably makes sense that the pedestrian miles decreased, the main reasons I can see are changes in the way we are calculating average distance/unit and distance traveled in-project.

What is way more concerning to me is that I am getting very different results from the tool outputs for some reason?? The change in miles traveled for pedestrians also looks much more dramatic in the tool outputs than in my calculations. (the change for bicycles is still pretty dramatic)

mRaffill commented 6 months ago

I tried looking at the crash outputs as well for this example project, to see if it's only the miles traveled that are different or if it is all of the calculations. They are at least on the same order of magnitude (around 1-10) but I still don't seem to get the same result as the tool outputs. I'm also not really sure what the column titles mean, so it may not be the same calculation as I think it is. The columns I am looking at are "BIKE Saftey: combined Crashes" and "Walk Saftey: combined Crashes". Does "combined" here mean something about the combined mode, or does it mean intersection crashes and roadway crashes combined together? This project has user-inputted crashes for 11 years (so it would not use the split user/model calculation).

The March output has:

- "BIKE Saftey: combined Crashes": 4.801145
- "Walk Saftey: combined Crashes": 1.354022

If this means the existing crashes, then the numbers I have (just directly from the user input) are

- Bike crashes (user input but not divided by the number of years, otherwise it becomes way too small) = 3
- Walk crashes = 6

Or

- Combined crashes, intersection = 2
- Combined crashes, roadway = 7

It may also be the new crashes or change in crashes. I just realized that I hadn't included the infrastructure travel increases when calculating the new crashes, so I'll re-calculate that and then compare the results.

It would probably help to know what values these columns are actually showing. And hopefully once Matt is able to compare against the unit tests, that will help us figure out if there's a difference in the calculations/what it is.

mRaffill commented 6 months ago

While I was trying to figure this out, I finally realized that the "BMT (Total)"/"WMT (Total)" numbers are probably the change in miles traveled and the "BMT (Per Capita/ Projected+Increase)"/"WMT (Per Capita/ Projected+Increase)" are the existing miles traveled + change in miles traveled.

| | bike miles traveled | walk miles traveled |
|---|---|---|
| March outputs: projected + increase - total | 74.464182 | 1350.049949 |
| my calculation | 74.46418218022117 | 1410.2838925716596 |

WMT is still slightly lower but this definitely looks very close. This makes so much more sense now!
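To put a number on "very close", the relative difference between the two rows can be computed directly. A minimal sketch using only the values from the table above; the dictionary names are hypothetical.

```python
# Existing miles traveled recovered from the tool as (projected + increase) - total.
tool_existing = {"bike": 74.464182, "walk": 1350.049949}

# Existing miles traveled from the step-by-step calculation.
my_calculation = {"bike": 74.46418218022117, "walk": 1410.2838925716596}

# Relative difference of the tool output against the manual calculation.
rel_diff = {
    mode: (tool_existing[mode] - my_calculation[mode]) / my_calculation[mode]
    for mode in tool_existing
}

for mode, diff in rel_diff.items():
    print(f"{mode}: {diff:+.4%}")
```

The bike values agree to within rounding of the tool output, while the walk value from the tool is a little over 4% lower than the manual calculation.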

However, I didn't include the volume from adjacent ways/intersections in my calculation (don't have that data in the debug tables), so I would actually expect the tool outputs to be higher than what I calculated. But at least I know the entire calculation isn't completely off, which is a big relief!

mRaffill commented 6 months ago

Here's what that looks like across all of the projects:

[Graphs: bike miles traveled and walk miles traveled, for the March and January versions]

(Note: "none" is the existing miles traveled.)

So yes, the change in WMT is fairly small (in both versions), but the total existing WMT is still decently large in most projects. Given that the tool outputs do seem to line up with my calculations, what I said earlier about the decrease in WMT being primarily a result of the new unit length calculation is probably still true. It looks like BMT is substantially larger in the March version; I might want to look into why that happened as well.

The change in WMT is also smaller, but it is the same in proportion to the smaller existing WMT. The ratio between change in WMT and existing WMT is equal in the Jan and March versions (this is just based on the infrastructure percentages which didn't change at all): This is also true for the change in BMT (except for some projects which seem to have missing data in the Jan version - maybe newly inputted projects?): (orange is Jan, blue is March)

mRaffill commented 5 months ago

Since I don't think there is anything urgent pending right now, I'm going to wrap up some comments from earlier....

https://github.com/mRaffill/atp-bc-tool-analysis/issues/7#issuecomment-1992880472

> I tried looking at the crash outputs as well for this example project, to see if it's only the miles traveled that are different or if it is all of the calculations. They are at least on the same order of magnitude (around 1-10) but I still don't seem to get the same result as the tool outputs. I'm also not really sure what the column titles mean, so it may not be the same calculation as I think it is.

Turns out that these are the change in crashes, over the 20 year project timeframe with the 4% discount rate. Taking that 20 year timeframe into account, I now get the same results as the tool for bicycling and walking! :D

| | My calculation | Most recent tool output (March 26th) |
|---|---|---|
| Pedestrian | 1.3540210887408997 | 1.354022 |
| Bicycle | 4.80114603733721 | 4.801145 |
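For reference, here is one way a 20-year total with a 4% discount rate could be computed. This is a sketch under stated assumptions, not the tool's confirmed method: it assumes a constant annual change and discounting that starts in year 1, and `discounted_total` is a hypothetical helper.

```python
def discounted_total(annual_change, years=20, rate=0.04):
    """Sum a constant annual change over `years`, discounting year t by (1 + rate)**t.

    Assumption: the first year is discounted once (t starts at 1). The tool may
    use a different convention, e.g. leaving the first year undiscounted.
    """
    return sum(annual_change / (1 + rate) ** t for t in range(1, years + 1))


# Discount factor for a hypothetical change of 1 crash per year.
factor = discounted_total(1.0)
print(f"20-year factor at 4%: {factor:.4f}")
```

Under this convention the 20-year total is about 13.6 times the annual change, rather than 20 times as it would be without discounting.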

But I couldn't figure out the combined crashes column. The tool output is 0.680857. I believe this is combined crash reduction + bicycle crash reduction + pedestrian crash reduction, so the combined crash reduction only = -5.47431.

Attempts so far:

- apply the volume increase ratios from both bicycling and walking to the combined existing crashes, then apply crash reduction factor: 1.4995996586013516
- apply the volume increase ratios separately to the existing bicycling crashes and walking crashes, then add them together to get combined crashes, then apply crash reduction factor: -1.6413245929521587
- don't apply any volume increases (since there is no infrastructure with "combined mode" travel increase), then apply crash reduction factor: -5.828343205075886
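The comparison between these attempts and the tool output can be laid out as a short calculation. A sketch only: it relies on the assumption stated above that the combined column is the sum of the combined, bicycle, and pedestrian reductions, and the dictionary labels are hypothetical shorthand for the three attempts.

```python
# Tool outputs (March 26th).
tool_combined_column = 0.680857
bike_reduction = 4.801145
walk_reduction = 1.354022

# Assumption: combined column = combined-only + bike + walk reductions.
combined_only = tool_combined_column - bike_reduction - walk_reduction

attempts = {
    "both volume increases applied to combined crashes": 1.4995996586013516,
    "volume increases applied per mode, then summed": -1.6413245929521587,
    "no volume increases": -5.828343205075886,
}

# Absolute gap between each attempt and the inferred combined-only value.
gaps = {name: abs(value - combined_only) for name, value in attempts.items()}
closest = min(gaps, key=gaps.get)

print(f"inferred combined-only reduction: {combined_only:.5f}")
for name, gap in gaps.items():
    print(f"{name}: off by {gap:.3f}")
print(f"closest: {closest}")
```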

This last one is the closest so far, but still slightly different than the tool output. If the tool is currently doing something along these lines, that means the combined crashes don't include the bicycle and pedestrian volume increases. I don't think that's what is supposed to happen. Based on the Benefits Calculations notes for the original equations:

> For the combined mode, we sum the projected volume for bicycling and the projected volume for walking.

The new equations no longer have the step of calculating projected volume separately, they just multiply by the percentage increase in volume instead. But for combined mode, there isn't one direct percent increase in volume so it would be a bit more complicated. I realize that I didn't specify at all what to do about the combined mode in the requested changes, and I don't think I've thought about how to deal with it yet either.

To do next:

- need to confirm what calculation the tool is actually using for combined mode right now
- if it is true that the combined mode isn't including the volume increases, I should figure out how to rewrite the new equation to fix that, then update the requested changes
- I should also go back and check the unit tests for the crashes and check what I had calculated for the combined mode. I don't think I fully understood how the combined mode calculations worked then, and I may have calculated it in some other (probably incorrect) way.

dtfitch commented 5 months ago

Looks good! I forgot about the combined mode issue too. I liked your idea in a prior string about making the new version of this something like this:

- bike = bike + combined
- walk = walk + combined

This seems intuitive to me. It simplifies the calculations and is easier to understand. I can't remember why we didn't do this from the start. Is there a reason you can think of for not doing it this way?

mRaffill commented 4 months ago

Hmm, I'm not sure what the reasoning behind the current implementation is. It was already like this in the original calculations, before any of the recent changes.

Maybe an issue is that it might be confusing to "double count" the combined benefits for both bicycling and walking? If the combined mode reduction in crashes is a combination of some reduction in bicycle crashes and some reduction in pedestrian crashes, then it might not make sense to add the total combined reduction to both the bicycle and pedestrian columns. It seems like that would be saying the combined benefits were doubled?

If the project infrastructure prevents 5 bike crashes, 5 pedestrian crashes, and 5 "combined" crashes, then those "combined" crashes could be split something like 3 bike crashes + 2 pedestrian crashes prevented (but we don't actually know what that split is). In that case it seems potentially confusing to say the reduction in bike crashes is 5 bike crashes + 5 combined crashes and the reduction in pedestrian crashes is also 5 pedestrian crashes + 5 combined crashes, when only part of the combined crash reduction is from each mode.
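The double-counting concern in that example can be made concrete with a few lines of arithmetic. A sketch using the illustrative numbers from the paragraph above (not real project data); the 3/2 split of combined crashes is the hypothetical split mentioned there.

```python
# Illustrative numbers from the example: crashes prevented by mode.
bike, walk, combined = 5, 5, 5
total_prevented = bike + walk + combined

# Option discussed: add the full combined reduction to each mode's column.
bike_column = bike + combined
walk_column = walk + combined
overcount = (bike_column + walk_column) - total_prevented

# Alternative if the (unknown) split were 3 bike + 2 pedestrian combined crashes.
combined_bike_share, combined_walk_share = 3, 2
bike_split = bike + combined_bike_share
walk_split = walk + combined_walk_share

print(f"bike + combined = {bike_column}, walk + combined = {walk_column}")
print(f"sum of columns exceeds actual total by {overcount}")
print(f"with a 3/2 split: bike = {bike_split}, walk = {walk_split}, sum = {bike_split + walk_split}")
```

Summing the two columns counts the combined reduction twice, whereas a split keeps the column totals equal to the actual number of crashes prevented; the catch, as noted, is that the split is not actually known.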

This may be the wrong way to think about it though. If the combined crash reduction represents that the infrastructure with "combined" benefits is reducing crashes equally for both modes, then I think it does make more sense to say bike = bike + combined and so on.

After we decide, I think it may also be helpful to eventually add a note or something to the tool UI to clarify what the combined mode means, because all of the options seem like they could be confusing and probably not intuitive to everyone!