s-nlp / kbqa


added notebooks and scripts to reranking using graph features (catboo… #146

Closed highly0 closed 7 months ago

highly0 commented 8 months ago

I'm adding scripts to run the reranking pipeline using graph features (numerical, textual, and embedding features). Jupyter notebook alternatives are included as well. Note: for now, the scripts only train the model and do not rerank. To run training, reranking, and feature importance end to end, convert the .ipynb files to scripts and run them from top to bottom:

jupyter nbconvert --to script catboost_features.ipynb

Remember to change the configurations before running the .ipynb files. Alternatively, you can train the models with the scripts, then load them into the .ipynb file and gather the reranking results there.
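For readers unfamiliar with the reranking step, here is a minimal sketch of what "gathering reranking results" could look like once a trained model is available. It is not the code from this PR: the `rerank` helper, the `toy_score` function, and the candidate fields (`question`, `answer`, `features`) are hypothetical names chosen for illustration, and a real run would presumably score candidates with the loaded model instead of `toy_score`.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical candidate record: one answer candidate for one question,
# together with its (already prepared) graph features.
Candidate = Dict[str, object]


def rerank(
    candidates: List[Candidate],
    score_fn: Callable[[Candidate], float],
) -> Dict[str, List[str]]:
    """Group candidates by question and sort each group by descending score."""
    grouped = defaultdict(list)
    for cand in candidates:
        grouped[cand["question"]].append((cand["answer"], score_fn(cand)))
    return {
        question: [answer for answer, _ in sorted(pairs, key=lambda p: p[1], reverse=True)]
        for question, pairs in grouped.items()
    }


def toy_score(cand: Candidate) -> float:
    """Stand-in for a trained model: uses the first feature as the score."""
    return float(cand["features"][0])


# Illustrative data only; the answer IDs and feature values are made up.
candidates = [
    {"question": "q1", "answer": "A", "features": [0.2]},
    {"question": "q1", "answer": "B", "features": [0.9]},
    {"question": "q1", "answer": "C", "features": [0.5]},
    {"question": "q2", "answer": "D", "features": [0.1]},
    {"question": "q2", "answer": "E", "features": [0.7]},
]

reranked = rerank(candidates, toy_score)
top1 = {question: answers[0] for question, answers in reranked.items()}
```

Keeping the scoring function as a parameter means the same grouping and sorting logic can be reused whether the scores come from a notebook-trained model or one trained by the scripts.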

In addition to the reranking pipeline, I added graph_features_preparation.py, which prepares the dataframe with graph features and publishes it to Hugging Face (T5-large-ssm, T5-xl-ssm).

The added notebooks and scripts and their functionality are listed below: