interaction weights that apply to combinations of choosers and alternatives
automatic joining of interaction terms onto the merged table
non-sampling (all the alternatives available for each chooser)
estimation/simulation support for all combinations
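To make the feature list concrete, here is a minimal sketch of what a merged table with sampling weights and interaction terms involves. It uses only core Python, and the function name `merge_choice_table`, its parameters, and the dict-based inputs are all hypothetical illustrations; none of this is the actual MergedChoiceTable API or implementation.

```python
import random


def merge_choice_table(choosers, alternatives, sample_size=None,
                       weights=None, interaction_terms=None, seed=None):
    """Build long-format rows, one per (chooser, alternative) pair.

    Hypothetical helper for illustration only.

    choosers, alternatives: dicts mapping id -> dict of attributes.
    weights: optional dict keyed by (chooser_id, alt_id) giving the sampling
        weight for that pair; missing pairs default to 1.0.
    interaction_terms: optional dict keyed by (chooser_id, alt_id) giving
        extra columns to join onto the matching row.
    sample_size: alternatives drawn per chooser (with replacement);
        None means no sampling, so every alternative is kept.
    """
    rng = random.Random(seed)
    alt_ids = list(alternatives)
    weights = weights or {}
    interaction_terms = interaction_terms or {}
    rows = []
    for c_id, c_attrs in choosers.items():
        if sample_size is None:
            drawn = alt_ids  # non-sampling: all alternatives for each chooser
        else:
            w = [weights.get((c_id, a_id), 1.0) for a_id in alt_ids]
            drawn = rng.choices(alt_ids, weights=w, k=sample_size)
        for a_id in drawn:
            row = {"chooser_id": c_id, "alt_id": a_id}
            row.update(c_attrs)                 # chooser attributes
            row.update(alternatives[a_id])      # alternative attributes
            row.update(interaction_terms.get((c_id, a_id), {}))
            rows.append(row)
    return rows


choosers = {1: {"income": 50}, 2: {"income": 80}}
alternatives = {"a": {"price": 10}, "b": {"price": 20}, "c": {"price": 30}}

# Interaction weights: chooser 1 can only draw "a", chooser 2 only "b" or "c"
weights = {(1, "a"): 1.0, (1, "b"): 0.0, (1, "c"): 0.0,
           (2, "a"): 0.0, (2, "b"): 1.0, (2, "c"): 1.0}
# Interaction terms: values defined per chooser-alternative pair
terms = {(1, "a"): {"distance": 2.5}, (2, "b"): {"distance": 0.7},
         (2, "c"): {"distance": 4.1}}

full = merge_choice_table(choosers, alternatives)
sampled = merge_choice_table(choosers, alternatives, sample_size=2,
                             weights=weights, interaction_terms=terms, seed=0)
```

In this sketch the weights and interaction terms share the same (chooser, alternative) key, which is what lets the interaction columns be joined onto the merged rows without any extra bookkeeping.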
All this should work automatically in MNL models. Note that with non-random sampling of alternatives and small sample sizes, estimated coefficients can be biased unless a correction term is added (see issue #38).
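For reference, the standard correction for sampled alternatives in MNL (McFadden, 1978) adds the log of the probability of drawing the sampled set to each utility. This is the general textbook result, stated here as background; it is not necessarily the form that issue #38 settles on.

```latex
P(i \mid D) = \frac{\exp\left(V_i + \ln \pi(D \mid i)\right)}
                   {\sum_{j \in D} \exp\left(V_j + \ln \pi(D \mid j)\right)}
```

Here $D$ is the sampled choice set, $V_j$ is the systematic utility of alternative $j$, and $\pi(D \mid j)$ is the probability of drawing $D$ given that $j$ is the chosen alternative. With uniform random sampling, $\pi(D \mid j)$ is the same for every $j \in D$ and cancels out, which is why the correction only matters for non-random sampling.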
The intention of this PR is to provide general-purpose functionality that can serve as a back end for more specialized tools that automate distance-based sampling, bands, buckets, etc.
I've also done groundwork for the following features that will come later:
availability of alternatives
accepting callables for on-the-fly calculation of weights, availability, and interaction terms
representation of random state, for replicability
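As a rough illustration of where the callable and random-state groundwork could lead, the sketch below accepts weights either as a precomputed lookup or as a function evaluated on the fly, and takes an explicit random generator so draws can be replicated. The function `sample_alternatives`, the `inverse_distance` weight function, and the toy coordinates are all assumptions made for this example, not part of the PR.

```python
import random


def sample_alternatives(chooser_id, alt_ids, weights, k, rng):
    """Draw k alternatives for one chooser, with replacement.

    weights is either a dict keyed by (chooser_id, alt_id) or a callable
    taking (chooser_id, alt_id); a callable is evaluated on the fly.
    rng is a random.Random instance, so the caller controls the seed.
    """
    if callable(weights):
        w = [weights(chooser_id, a) for a in alt_ids]
    else:
        w = [weights[(chooser_id, a)] for a in alt_ids]
    return rng.choices(alt_ids, weights=w, k=k)


# Toy one-dimensional locations for choosers and alternatives
chooser_x = {1: 0.0, 2: 5.0}
alt_x = {"a": 1.0, "b": 4.0, "c": 9.0}


def inverse_distance(chooser_id, alt_id):
    """Hypothetical weight: closer alternatives are more likely to be drawn."""
    return 1.0 / (1.0 + abs(alt_x[alt_id] - chooser_x[chooser_id]))


alt_ids = list(alt_x)

# Same seed, same draws: the random state makes the sample replicable
draw_1 = sample_alternatives(2, alt_ids, inverse_distance, 5, random.Random(42))
draw_2 = sample_alternatives(2, alt_ids, inverse_distance, 5, random.Random(42))

# A precomputed table of the same weights gives the same result
table = {(c, a): inverse_distance(c, a) for c in chooser_x for a in alt_ids}
draw_3 = sample_alternatives(2, alt_ids, table, 5, random.Random(42))
```

The callable form avoids materializing a weight for every chooser-alternative pair up front, which matters when the full cross product is too large to hold in memory.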
Implementation
This required deep enough surgery that the easiest approach was to start fresh rather than drawing on existing code in urbansim.urbanchoice (which did not support weights, availability, non-replacement, or non-sampling use cases).
I've done some basic optimization, such as choosing the most efficient underlying sampling library for each use case (mostly NumPy, but sometimes core Python) and drawing a single sample rather than repeated samples whenever possible.
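The "single rather than repeated samples" idea can be sketched as follows. This is an illustration in core Python only, with a hypothetical function name `sample_all`; the PR's actual code paths, and where it switches to NumPy, may differ.

```python
import random


def sample_all(chooser_ids, alt_ids, sample_size, replace=True, seed=None):
    """Return a dict mapping each chooser id to its sampled alternatives.

    Unweighted sampling only. With replacement, every draw is independent
    and identically distributed, so one large draw can be split across
    choosers. Without replacement, uniqueness is required within each
    chooser's sample, so this sketch falls back to one draw per chooser.
    """
    rng = random.Random(seed)
    if replace:
        # Single draw covering all choosers, then cut into equal chunks
        flat = rng.choices(alt_ids, k=len(chooser_ids) * sample_size)
        return {c: flat[i * sample_size:(i + 1) * sample_size]
                for i, c in enumerate(chooser_ids)}
    # Repeated draws: one sample without replacement per chooser
    return {c: rng.sample(alt_ids, sample_size) for c in chooser_ids}


chooser_ids = [101, 102, 103]
alt_ids = ["a", "b", "c", "d", "e"]

with_repl = sample_all(chooser_ids, alt_ids, sample_size=3, seed=1)
without_repl = sample_all(chooser_ids, alt_ids, sample_size=3,
                          replace=False, seed=1)
```

The single-draw shortcut applies only when the sampling distribution is the same for every chooser; chooser-specific weights would need separate draws.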
Issue #39 discusses the current performance of the code, and optimizations we might want to look into.
Other changes
LargeMultinomialLogit class now optionally accepts a MergedChoiceTable as input
PR includes unit tests for each table combination, but they could be improved
Coverage increased (+6.6%) to 59.194% when pulling 0b8a2b96a1aca675972eba1502e37dca4b19cbdb on sampling-weights into b3cb2b9496a5c3d11b9a875ec6d4c85246b2b5a8 on master.
This PR adds substantial functionality to the MergedChoiceTable utility.
It's related to Issues #4, #5, and #11, and to UDST/urbansim_templates MNL support.
Features and usage
MergedChoiceTable now supports:
Versioning