I'm adding scripts to run the reranking pipeline using graph features (numerical, textual, and embedding features). Jupyter notebook alternatives are included as well. Note: for now, the scripts only train the model and do not rerank. The .ipynb files cover the full flow, so converting them and running from top to bottom performs training, reranking, and feature importance.
Remember to change the configurations before running the .ipynb files. Alternatively, you can train the models with the scripts, then load them into the .ipynb files and gather the reranking results.
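The "train with the script, then load in the notebook and rerank" path can be sketched roughly as below. This is only a minimal illustration under stated assumptions: the JSON file of linear weights, the helper names (`save_params`, `load_params`, `predict`, `rerank`), and the candidate layout are all hypothetical and are not taken from the scripts in this PR, which may persist and load models differently.

```python
import json
import tempfile
from pathlib import Path


def save_params(params, path):
    """Script side: persist trained linear weights (hypothetical JSON format)."""
    Path(path).write_text(json.dumps(params))


def load_params(path):
    """Notebook side: load the weights written by the training script."""
    return json.loads(Path(path).read_text())


def predict(params, rows):
    """Score each feature row with a plain linear model: bias + w . x."""
    return [
        params["bias"] + sum(w * x for w, x in zip(params["weights"], row))
        for row in rows
    ]


def rerank(params, candidates):
    """Order candidate answers by predicted score, best first."""
    scores = predict(params, [c["features"] for c in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return [candidate["answer"] for candidate, _ in ranked]


# Stand-in for the training script: the weights here are made up, not trained.
model_path = Path(tempfile.mkdtemp()) / "linear_params.json"
save_params({"weights": [0.5, -1.0], "bias": 0.1}, model_path)

# Stand-in for the notebook: load the saved model and rerank toy candidates.
loaded = load_params(model_path)
candidates = [
    {"answer": "A", "features": [1.0, 0.5]},
    {"answer": "B", "features": [2.0, 0.2]},
    {"answer": "C", "features": [0.0, 1.0]},
]
reranked = rerank(loaded, candidates)
print(reranked)
```

The same split applies to any model type: the script owns training and saving, while the notebook only needs the saved artifact plus the feature columns to produce reranking results.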
In addition to the reranking pipeline, I added graph_features_preparation.py, which prepares the dataframe with graph features and publishes it to HuggingFace (T5-large-ssm, T5-xl-ssm).
The added notebooks and scripts, and their functionality, are as follows:
graph_features_preparation.py: prepares the dataset with graph features and publishes it to HF
linear_regression.ipynb: notebook to train linear regression and rerank with the dataset above (can be run from top to bottom)
train_linear_regression.py: script to train linear regression using the dataset above
catboost_features.ipynb: notebook to train CatBoost and rerank with the dataset above (can be run from top to bottom)
train_catboost_regressor.py: script to train CatBoost using the dataset above