Closed toliwaga closed 7 years ago
The way things stand, you can write expressions in the csv spec files that access skims like this (from trip_mode_choice.csv):
Walk - Time up to 2 miles,"@skims['DISTWALK'].clip(upper=walkThresh)*2*60/walkSpeed"
The skims object is a wrapper for Skim objects, but the expression skims['DISTWALK']
isn't returning a Skim object. Instead, it is returning a series evaluated in the context of the choosers dataframe with the walk distance for the chooser row based on the orig key from the 'TAZ' column and the dest key from the 'destination' column. the Skims object knows to use those keys because mode_choice called skims.set_keys('TAZ','destination)
and the Skims.__getitem__(key)
adds a call to lookup after retrieving the skim:
def __getitem__(self, key):
return self.lookup(self.skims[key])
Otherwise, the spec file would ahve to do something like
Walk - Time up to 2 miles,"@skims.lookup('DISTWALK').clip(upper=walkThresh)*2*60/walkSpeed"
Apparently, it was deemed more natural to use the square bracket notation in this context, but this comes at the expense of python code in activitysim having to remember that the square bracket operator in python doesn't do what you might expect, when it appears on the RHS (right hand side) of an expression.
In python, to get the skim out of the dict that the Skims object whats, you have to say:
skim = skims.get_skim('DISTANCE')
even though the skim was assigned like this:
skims['DISTANCE'] = skim
It could be argued that the putative benefit of being able to say skims['DISTWALK']
in the csv flies is no great enough, compared to saying skims.lookup('DISTWALK')
and the confusion, as evidenced by the recent bug in skims.py where the distance_skim
injectable erroneously used the sqaure brackets attempting to access the wrapped 'DISTANCE' skim. Even worse, that erroneous usage didn't throw an exception because Skim and Series objects have some functions with identical calling signatures.
So it this mistake is made again, we might not discover it immediately.
made the setter explicit to remove ambiguity. SkimStackWrapper and SkimStackWrapper now implement getitem to permit skim dereferencing in expression file (e.g. skims['DISTWALK'])
We may want to replace implicit df column refs, but for now this ambiguity is eliminated and the way it works is (I think) a little more straightforward.
it is very misleading, confusing, and error prone that the Skims object
__getitem__
and__setitem__
methods are NOT symmetrical.__setitem__(self, key, value)
adds a Skim object to the hash of skimsdef __getitem__(self, key)
does NOT return the corresponding Skim object, but callslookup
using the implicit left_key and right_key df values and returns a SeriesThe fact that
__getitem__
is a synonym forlookup
at first view appears convenient for the readability spec expressions where skims['DISTANCE'] returns NOT the Skim object but the contextualized skim lookup, but this apparent gain in readability comes at the expense of concealing what is really going on behind the scenes.This is especially error prone as both Skim and pandas Series support two-arg
get()
methods, so that a confusion over whether one is dealing with a Skim or a Series is potentially difficult to detect. The dangers of this asymmetry is underlined (sic) by the fact that the skims.py injectablesdistance_skim
,sovam_skim
, etc made this mistake for a while without its being detected, and this code was apparently written by the authors of the Skims class.