jpn-- / larch

Larch: a Python tool for choice modeling
http://larch.newman.me
GNU General Public License v3.0

missing period in "from .dataframes import _check_dataframe_of_dtype" in model.py? #7

Closed alexmitrani closed 2 years ago

alexmitrani commented 5 years ago

Hello again

I've been trying to work through the machine-learning example: https://larch.newman.me/machine_learning.html

The line m.fit(df, y=df.chose) produces the following error message:

File "C:\Users\amitrani\AppData\Local\conda\conda\envs\Python 3.7\lib\site-packages\larch\model\model.py", line 91, in fit
    from .dataframes import _check_dataframe_of_dtype

ModuleNotFoundError: No module named 'larch.model.dataframes'

The line in question is: from .dataframes import _check_dataframe_of_dtype

At the top of the same module there is the line: from ..dataframes import DataFrames

So I tried adding a second period in front of .dataframes on the line where it was crashing, so that it reads from ..dataframes import _check_dataframe_of_dtype, and that did seem to get it past that point.
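For reference, here is a minimal sketch of how the two import forms resolve. It assumes model.py belongs to the package larch.model, as the path in the traceback suggests, and uses only the standard library, so Larch does not need to be installed to run it:

```python
from importlib.util import resolve_name

# Assumption: model.py lives in larch/model/, so its package is "larch.model".
package = "larch.model"

# One leading period means "look inside the current package".
single_dot = resolve_name(".dataframes", package)

# Two leading periods mean "look inside the parent package".
double_dot = resolve_name("..dataframes", package)

print(single_dot)  # larch.model.dataframes
print(double_dot)  # larch.dataframes
```

The single-period form resolves to larch.model.dataframes, which is the module name reported in the ModuleNotFoundError, while the two-period form resolves to larch.dataframes, the same module that the import at the top of the file uses.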

Please could you check to see if this fix is correct?

Unfortunately, it then crashed again somewhere else:

File "C:\Users\amitrani\AppData\Local\conda\conda\envs\Python 3.7\lib\site-packages\pandas\core\indexing.py", line 1252, in _validate_read_indexer
    raise KeyError("{} not in index".format(not_found))

KeyError: "['(altnum==5)*hhinc', 'altnum==2', 'altnum==3', 'altnum==4', '(altnum==6)*hhinc', 'altnum==5', '(altnum==4)*hhinc', '(altnum==3)*hhinc', '(altnum==2)*hhinc', 'altnum==6'] not in index"
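The KeyError suggests that pandas is being asked for columns whose names are literally these expression strings. The sketch below reproduces that behaviour on a toy DataFrame (the values are made up) and shows one possible workaround: computing each expression by hand and storing it under the expression text as the column name. Whether this is how Larch intends the example to be used is an assumption, not something confirmed here.

```python
import pandas as pd

# Toy stand-in for the example data; the values are invented for illustration.
df = pd.DataFrame({
    "altnum": [1, 2, 3, 1, 2, 3],
    "hhinc": [30.0, 30.0, 30.0, 55.0, 55.0, 55.0],
})

wanted = ["altnum==2", "(altnum==2)*hhinc"]

# Selecting by label fails: no column carries these literal names.
try:
    df[wanted]
    raised = False
except KeyError:
    raised = True

# Possible workaround (an assumption, not confirmed Larch usage):
# compute each expression and store it under its own text as the column name.
is_alt2 = (df["altnum"] == 2).astype(float)
df["altnum==2"] = is_alt2
df["(altnum==2)*hhinc"] = is_alt2 * df["hhinc"]

# The same selection now succeeds.
selected = df[wanted]
print(selected)
```

The same pattern would need repeating for alternatives 3 to 6 to cover every name listed in the error message.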

I tried simplifying the model specification and the following ran all the way through:

m.utility_ca = (
    PX('tottime')
    + PX('totcost'))

m.fit(df, y=df.chose)

proba = m.predict_proba(df)
proba.head(10)

So this remaining issue might be a model specification problem, but it isn't clear what is needed to solve the problem.

I tried including the hhinc variable in the index:

df.set_index(['casenum','altnum','hhinc'], inplace=True, drop=False)
m.fit(df, y=df.chose)

... but this did not solve the problem.
Any advice you could offer on how to progress here would be much appreciated.

Thanks

Alex