UDST / sanfran_urbansim

An UrbanSim for San Francisco: an example implementation of the new framework

Error when running sim.run('rsh_simulate') a second time #3

Closed: akselx closed this issue 10 years ago

akselx commented 10 years ago

The run method fails when it is called a second time. The first run simulates fine, but subsequent runs fail with an assertion error about NAs. This happens on both Windows and Linux.

Running year 2010
Running model 'rsh_simulate'
Found 2904 nas or inf (out of 155507) in column residential_sales_price
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-3-c2cdfeaa5321> in <module>()
     15     "non_residential_developer", # build non-residential buildings
     16     "clear_cache"                # clear the cache each year
---> 17 ], years=[2010])

C:\Anaconda\lib\site-packages\urbansim\sim\simulation.pyc in run(models, years, data_out, out_interval)
   1321                 model = get_model(model_name)
   1322                 t2 = time.time()
-> 1323                 model()
   1324                 print("Time to execute model '{}': {:.2f}s".format(
   1325                       model_name, time.time()-t2))

C:\Anaconda\lib\site-packages\urbansim\sim\simulation.pyc in __call__(self)
    625         with log_start_finish('calling model {!r}'.format(self.name), logger):
    626             kwargs = _collect_injectables(self._arg_list)
--> 627             return self._func(**kwargs)
    628 
    629     def _tables_used(self):

C:\cygwin64\home\aolsen\projects\sanfran_urbansim\models.pyc in rsh_simulate(buildings, zones)
     15 def rsh_simulate(buildings, zones):
     16     return utils.hedonic_simulate("rsh.yaml", buildings, zones,
---> 17                                   "residential_sales_price")
     18 
     19 

C:\cygwin64\home\aolsen\projects\sanfran_urbansim\utils.pyc in hedonic_simulate(cfg, tbl, nodes, out_fname)
    116 def hedonic_simulate(cfg, tbl, nodes, out_fname):
    117     cfg = misc.config(cfg)
--> 118     df = to_frame([tbl, nodes], cfg)
    119     price_or_rent, _ = yaml_to_class(cfg).predict_from_cfg(df, cfg)
    120     tbl.update_col_from_series(out_fname, price_or_rent)

C:\cygwin64\home\aolsen\projects\sanfran_urbansim\utils.pyc in to_frame(tables, cfg, additional_columns)
     93     else:
     94         df = tables[0].to_frame(columns)
---> 95     df = deal_with_nas(df)
     96     return df
     97 

C:\cygwin64\home\aolsen\projects\sanfran_urbansim\utils.pyc in deal_with_nas(df)
     58                   (df_cnt-s_cnt, df_cnt, col)
     59 
---> 60     assert not fail, "NAs were found in dataframe, please fix"
     61     return df
     62 

AssertionError: NAs were found in dataframe, please fix
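
For anyone trying to narrow this down, the kind of count reported in the message above can be reproduced on a plain pandas DataFrame. This is only a minimal sketch, not the project's `deal_with_nas`: the helper name `count_bad_values` and the sample data are hypothetical, and it only looks at numeric columns.

```python
import numpy as np
import pandas as pd


def count_bad_values(df):
    """Return {column: count} for numeric columns that hold NaN or +/-inf.

    Hypothetical helper for illustration; not part of urbansim.
    """
    numeric = df.select_dtypes(include=[np.number])
    # Treat +/-inf like NaN so a single isna() pass counts both.
    counts = numeric.replace([np.inf, -np.inf], np.nan).isna().sum()
    return {col: int(n) for col, n in counts.items() if n > 0}


# Made-up sample data standing in for the buildings table.
buildings = pd.DataFrame({
    "residential_sales_price": [350.0, np.nan, np.inf, 410.0],
    "building_sqft": [1200, 900, 1500, 2000],
})

bad = count_bad_values(buildings)
for col, n in bad.items():
    print("Found {} NaN or inf (out of {}) in column {}".format(
        n, len(buildings), col))
```

Running a check like this after each model in the list should show which step first leaves NaN or inf values behind.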

jrayers commented 10 years ago

@akselx We noticed the same thing here. We temporarily worked around the problem by adding a small "fillnas" model to the end of the models list. The code is ugly in many ways, but it does the trick in the short term; it should definitely be replaced with something more robust:

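import numpy as np
import urbansim.sim.simulation as sim


@sim.model("fillnas")
def fillnas():
    # Stopgap: overwrite NaN and +/-inf with 0 in every registered table.
    for tblname in sim.list_tables():
        print('Filling NAs in {}'.format(tblname))
        # _frame is the wrapped DataFrame (a private attribute), edited in place.
        tbl = sim.get_table(tblname)._frame
        tbl.replace([np.inf, -np.inf, np.nan], 0, inplace=True)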

fscottfoti commented 10 years ago

Aksel, this is fixed now. I set the NaN prices to zero and filter for residential buildings when computing the "average" price. There are probably other ways to fix it, but this seems fine. @jrayers, with your model, make sure to average the price over residential buildings only; otherwise the zeros that used to be NaNs will be averaged in.
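
To illustrate the averaging caveat, here is a minimal sketch on made-up data. It assumes a `residential_units` column marks which buildings are residential; that column name, the `zone_id` grouping, and the values are assumptions for illustration, not the actual sanfran_urbansim schema or the code from the fix.

```python
import numpy as np
import pandas as pd

# Made-up buildings table: non-residential rows have no sales price.
buildings = pd.DataFrame({
    "zone_id": [1, 1, 1, 2, 2],
    "residential_units": [2, 0, 4, 0, 1],  # assumed residential indicator
    "residential_sales_price": [400.0, np.nan, 600.0, np.nan, 300.0],
})

# Zero-fill the missing prices, as a blanket NA fill would.
filled = buildings.fillna({"residential_sales_price": 0})

# Naive per-zone mean: the zeros drag the average down.
naive_avg = filled.groupby("zone_id")["residential_sales_price"].mean()

# Filter to residential buildings first, then average.
residential = filled[filled["residential_units"] > 0]
residential_avg = (
    residential.groupby("zone_id")["residential_sales_price"].mean()
)

print(naive_avg)
print(residential_avg)
```

In this toy data the filtered mean for zone 1 is 500, while the naive mean over all three buildings is pulled down by the zero-filled non-residential row.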