How to fuse various features for prediction model?
Old method - numpy array concatenate
In my chicago-crime project, I manually maintain a numpy array for each community area (CA) to store their features under one view. Therefore, the numpy array concatenate function is widely used to combine features from different views.
The main assumption for CA is that 1) there is no missing features for any CA, and 2) the CA is indexed with continuous ID. Both assumptions do not hold at tract level.
New method - pandas DataFrame join
The tract ID, which is not continuous, will be used to index each row of the feature vector. The pandas.DataFrame.join function provides easy and robust performance feature fusions.
How to fuse various features for prediction model?
Old method - numpy array concatenate
In my chicago-crime project, I manually maintain a numpy array for each community area (CA) to store their features under one view. Therefore, the numpy array concatenate function is widely used to combine features from different views.
The main assumption for CA is that 1) there is no missing features for any CA, and 2) the CA is indexed with continuous ID. Both assumptions do not hold at tract level.
New method - pandas DataFrame join
The tract ID, which is not continuous, will be used to index each row of the feature vector. The pandas.DataFrame.join function provides easy and robust performance feature fusions.