ActivitySim / activitysim

An Open Platform for Activity-Based Travel Modeling
https://activitysim.github.io
BSD 3-Clause "New" or "Revised" License

Bitwidth expectations #547

Open jpn-- opened 2 years ago

jpn-- commented 2 years ago

The current ActivitySim implementation generally does not explicitly define bitwidths for numerical values, but instead allows numpy to self-select the numerical precision. This generally results in double precision floats almost everywhere that floats are used.

As we migrate to sharrow, we have the burden and opportunity to be more intentional about numerical precision. We can continue to use float64, but switching to float32 in most places can reduce memory usage, with an imperceptible impact on results.
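As a rough illustration of the tradeoff, the sketch below compares the memory footprint and machine epsilon of the two dtypes using plain numpy. The array size is arbitrary and the variable names are made up for this example; nothing here is taken from ActivitySim or sharrow.

```python
import numpy as np

n = 1_000_000  # arbitrary array size, chosen only for illustration

# With no dtype given, numpy picks float64 for floating point data.
default_array = np.zeros(n)
single_array = np.zeros(n, dtype=np.float32)

print(default_array.dtype, default_array.nbytes)  # float64 8000000
print(single_array.dtype, single_array.nbytes)    # float32 4000000

# Machine epsilon: the relative spacing of representable values near 1.0.
eps64 = np.finfo(np.float64).eps  # about 2.2e-16
eps32 = np.finfo(np.float32).eps  # about 1.2e-07
print(eps64, eps32)
```

Halving the bytes per value is where the memory saving comes from; the cost is roughly 7 significant decimal digits instead of roughly 16.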

Some care will be required to avoid accidental underflow. For example, in the ARC model, a calibration parameter of -100 was added to the utility of certain eatout schedule alternatives. This made them effectively unavailable (their exponentiated utility is scaled by exp(-100)) whenever any non-penalized alternative was available. If all the possible schedule alternatives were similarly penalized, choice tradeoffs could still be made among the penalized alternatives, but only when the utility is float64. When we adopt float32, these terms underflow and evaluate to exactly zero, generating errors for the scheduling model.
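A minimal sketch of this failure mode, using plain numpy rather than ActivitySim or sharrow code. The utilities are hypothetical: every alternative carries the -100 penalty plus a small made-up negative term, so that all of them fall below the float32 underflow threshold. The max-subtraction variant at the end is a standard softmax stabilization shown only for comparison; whether it suits ActivitySim's utility and logsum handling is not established here.

```python
import numpy as np

# Hypothetical utilities: three alternatives that all carry the -100 penalty,
# plus made-up smaller terms of -5, -6, and -7.
utilities = np.array([-105.0, -106.0, -107.0])

def naive_probabilities(u):
    """Compute exp(u) / sum(exp(u)) in the dtype of u."""
    # Silence floating point warnings so the underflow result can be inspected.
    with np.errstate(divide="ignore", invalid="ignore", under="ignore"):
        expu = np.exp(u)
        return expu / expu.sum()

def shifted_probabilities(u):
    """Same probabilities, but subtract the max utility before exponentiating."""
    expu = np.exp(u - u.max())  # the largest term becomes exp(0) == 1
    return expu / expu.sum()

p64 = naive_probabilities(utilities.astype(np.float64))
p32 = naive_probabilities(utilities.astype(np.float32))
p32_shifted = shifted_probabilities(utilities.astype(np.float32))

print(p64)          # valid probabilities that sum to 1
print(p32)          # all nan: every exp() underflowed to 0, so this is 0 / 0
print(p32_shifted)  # matches p64 to float32 precision
```

In float64 the tradeoffs among the penalized alternatives survive; in float32 the naive form divides zero by zero.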

guyrousseau commented 2 years ago

Thanks Jeff for bringing this to our attention. I can confirm that our ARC trip scheduling choice specification (see attached trip_scheduling_choice.csv, rows 81-82) includes a calibration parameter of -100, as you noted. As for how ActivitySim deals with numerical precision more generally, that’s definitely something to look into further as we move forward with this project.

guyrousseau commented 2 years ago

One more note: in the way probabilities are resolved in UECs, we use -999 to signal a condition that makes an alternative unavailable. I’ve also seen -99 and other similarly large negative values. It looks like this convention was ported over to the ActivitySim implementation.

Given the implications of numerical precision, it looks like we’ll need either to be more intentional about the value used, or to find another way to indicate that a choice is unavailable. As @jpn-- rightly points out, we should be more intentional about when we use single or double precision. Some utilities can be relatively large in magnitude and still represent valid choices. In any case, this is something we (as Consortium Partners) will need to get in front of and handle properly before we can finalize any deployment, this year or next.
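One possible way to indicate unavailability without a magic number, sketched here purely as an illustration: carry an explicit boolean availability mask alongside the utilities. The function name `choice_probabilities` and its arguments are hypothetical and are not part of ActivitySim or the UEC format.

```python
import numpy as np

def choice_probabilities(utilities, available):
    """Hypothetical helper: logit probabilities with an explicit availability mask.

    Unavailable alternatives get probability exactly 0 regardless of their
    utility, so no sentinel such as -999 is needed.
    """
    utilities = np.asarray(utilities, dtype=np.float32)
    available = np.asarray(available, dtype=bool)
    if not available.any():
        raise ValueError("no alternative is available")
    # -inf marks unavailability; exp(-inf) is exactly 0 in any float dtype.
    masked = np.where(available, utilities, np.float32(-np.inf))
    # Subtract the max so the best available alternative maps to exp(0) == 1.
    expu = np.exp(masked - masked.max())
    return expu / expu.sum()

# The second alternative has a large negative utility but is still a valid
# choice; the third is unavailable no matter what its utility is.
probs = choice_probabilities([2.0, -50.0, 1.0], [True, True, False])
print(probs)
```

With this design a large negative utility stays distinguishable from "unavailable", and the all-unavailable case raises an error instead of silently producing nan.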

mnbina commented 2 years ago

From today's meeting: the main decision is whether or not it would be ok to switch to float32. Please add comments here in support or not, additional questions or comments, etc. that would support resolution on this issue. https://github.com/ActivitySim/activitysim/wiki/Project-Meeting-2022.03.31

guyrousseau commented 2 years ago

Thanks @mnbina for posting this, and for an additional reference, see https://stackoverflow.com/questions/43440821/the-real-difference-between-float32-and-float64