Closed shuai-zhou closed 3 years ago
Hi @shuai-zhou, I've just updated the FE_Panel notebook to show how you can use data in either format, 'wide' or 'long'. The function accepts both. I hope this helps.
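For readers unsure what the two layouts look like, here is a minimal pandas sketch that converts between them. The column names `fips`, `year` and `hr` follow the NAT data discussed in this thread, but the values and the `hr_<year>` naming of the wide columns are invented for illustration only.

```python
import pandas as pd

# Toy long-format panel: one row per (county, year). Values are invented.
long_df = pd.DataFrame({
    "fips": [27077, 53019, 27077, 53019],
    "year": [1970, 1970, 1980, 1980],
    "hr":   [0.0, 1.5, 2.0, 3.5],
})

# Long -> wide: one row per county, one column per year.
wide_df = long_df.pivot(index="fips", columns="year", values="hr")
wide_df.columns = [f"hr_{year}" for year in wide_df.columns]  # hypothetical naming scheme

# Wide -> long again, sorted by year and then county so that
# each year forms one contiguous block of rows.
back = (
    wide_df.reset_index()
    .melt(id_vars="fips", var_name="year", value_name="hr")
    .assign(year=lambda d: d["year"].str.replace("hr_", "").astype(int))
    .sort_values(["year", "fips"])
    .reset_index(drop=True)
)
```

In the wide table each county is a single row, while in the long table each county appears once per year.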
Hi, @pedrovma,
Thank you so much for your prompt response. The "baltimore" data is itself a cross-sectional dataset, so I think the "NAT" data would be a better example for implementing the panel model with year dummy variables. The "long" data format looks like the following table. My question is: how can I fit a fixed effects spatial lag model with a year dummy variable, such as hr ~ rd + ps + year_dum? I compiled the data in "long" format for you HERE; feel free to compile the data yourself in whatever way you think will work. Thanks.
name | fips | fipsno | hr | rd | ps | geometry | year | year_dum |
---|---|---|---|---|---|---|---|---|
Lake of the Woods | 27077 | 27077 | 0.000000 | -0.196536 | -1.462559 | POLYGON ((...)) | 1970 | 1 |
Ferry | 53019 | 53019 | 0.000000 | -0.847856 | -1.697720 | POLYGON ((...)) | 1970 | 1 |
Stevens | 53065 | 53065 | 1.915158 | -0.225283 | -0.591883 | POLYGON ((...)) | 1970 | 1 |
Okanogan | 53047 | 53047 | 1.288643 | -0.391126 | -0.552016 | POLYGON ((...)) | 1970 | 1 |
Pend Oreille | 53051 | 53051 | 0.000000 | -0.451457 | -1.181754 | POLYGON ((...)) | 1970 | 1 |
... | ... | ... | ... | ... | ... | ... | ... | ... |
Lake of the Woods | ... | ... | ... | ... | ... | ... | 1980 | 2 |
Ferry | ... | ... | ... | ... | ... | ... | 1980 | 2 |
Stevens | ... | ... | ... | ... | ... | ... | 1980 | 2 |
Okanogan | ... | ... | ... | ... | ... | ... | 1980 | 2 |
Pend Oreille | ... | ... | ... | ... | ... | ... | 1980 | 2 |
... | ... | ... | ... | ... | ... | ... | ... | ... |
Lake of the Woods | ... | ... | ... | ... | ... | ... | 1990 | 3 |
Ferry | ... | ... | ... | ... | ... | ... | 1990 | 3 |
Stevens | ... | ... | ... | ... | ... | ... | 1990 | 3 |
Okanogan | ... | ... | ... | ... | ... | ... | 1990 | 3 |
Pend Oreille | ... | ... | ... | ... | ... | ... | 1990 | 3 |
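A note on the `year_dum` column in the table above: it is coded 1, 2, 3, so entering it as a single regressor would estimate one slope across the decades rather than a separate effect per year. A minimal sketch of expanding it into indicator columns with pandas is below; the toy rows are invented, and the choice of 1970 as the reference year is an assumption made for illustration.

```python
import pandas as pd

# Toy long-format rows mirroring the table's year and year_dum columns.
df = pd.DataFrame({
    "fips": [27077, 53019] * 3,
    "year": [1970, 1970, 1980, 1980, 1990, 1990],
    "year_dum": [1, 1, 2, 2, 3, 3],
})

# One 0/1 column per year; drop_first=True omits the first category (1970),
# which then acts as the reference category.
dummies = pd.get_dummies(df["year"], prefix="year", drop_first=True, dtype=float)
df = pd.concat([df, dummies], axis=1)
```

Rows from 1970 have zeros in both `year_1980` and `year_1990`, so the coefficients on those two columns are read relative to 1970.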
Hi @shuai-zhou,
You can just add the dummies as individual X variables, one for each year in your data except one, which serves as the reference category.
Example:
import libpysal
import spreg
import geopandas as gpd
import pandas as pd
data = gpd.read_file('nat_long.shp')
# Create one dummy column per year; dtype=float keeps the columns numeric (newer pandas versions default to bool)
data = pd.get_dummies(data, columns=['year'], dtype=float)
y = data[['hr']]
x = data[['rd', 'ps', 'year_1980', 'year_1990']]  # year_1970 is the reference category
# W must still be an N x N matrix, so build it from one cross-section only
# (the first 3085 rows, assuming the rows are stacked year by year)
w = libpysal.weights.KNN.from_dataframe(data.iloc[0:3085, :], k=10)
w.transform = 'r'
fe_lag = spreg.Panel_FE_Lag(y.to_numpy().reshape((data.shape[0], 1)), x.to_numpy(),
                            w, name_y=list(y.columns), name_x=list(x.columns), name_ds="nat_long.shp")
print(fe_lag.summary)
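The example above builds W from the first 3085 rows, which only works if the long-format rows are stacked year by year with the counties in the same order inside each year. A minimal sketch of sorting and checking that layout before fitting is below; the toy values are invented, and the names `n_units`, `n_periods` and `cross_section` are illustrative, not part of spreg.

```python
import pandas as pd

# Toy long-format panel, deliberately unsorted. Values are invented.
df = pd.DataFrame({
    "fips": [53019, 27077, 27077, 53019, 53019, 27077],
    "year": [1980, 1970, 1990, 1970, 1990, 1980],
    "hr":   [3.5, 0.0, 4.0, 1.5, 5.5, 2.0],
})

# Sort by year, then county, so each year is one contiguous block
# with the counties in the same order.
df = df.sort_values(["year", "fips"]).reset_index(drop=True)

n_units = df["fips"].nunique()
n_periods = df["year"].nunique()

# Balanced panel: every county appears exactly once per year.
assert len(df) == n_units * n_periods

# The county order must be identical in every yearly block.
first_block = df["fips"].iloc[:n_units].to_numpy()
for t in range(n_periods):
    block = df["fips"].iloc[t * n_units:(t + 1) * n_units].to_numpy()
    assert (block == first_block).all()

# A single year of rows, suitable for building the N x N weights matrix.
cross_section = df.iloc[:n_units]
```

With the real data, `n_units` would take the place of the hard-coded 3085 when slicing the rows used to build W.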
Hi, @pedrovma:
Awesome! This is a great example of implementing spatial panel models with dummy variables. Thanks.
If my understanding is right, the spatial panel model specifications (for example, the fixed effects spatial lag model) need the dataset to be in "wide" format. If so, how can I add a dummy variable, for instance a year dummy, to the model? Thanks.