wri-dssg-omdena / policy-data-analyzer

Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.

Weights & Biases #87

Closed by rsmath 3 years ago

rsmath commented 3 years ago

We have used the Weights & Biases tool to facilitate several aspects of the policy incentive classification modeling process.

All the data for our project can be found here: https://wandb.ai/ramanshsharma/WRI

Some of the things Weights & Biases has been used for are:

  1. Keeping track of every experiment run on fine-tuning SBERT for classification.
  2. Keeping track of the hyperparameters used in each run.
  3. Analyzing the automatically generated training/validation accuracy plots and the macro and weighted F1 score plots, based on the metrics we log from the code.
  4. Running automatic hyperparameter tuning using W&B sweeps, for example this one: https://wandb.ai/ramanshsharma/WRI/sweeps/fyc8q0fx. The parallel coordinates plot lets us visually see which combination of hyperparameters leads to the model's best F1 performance on the validation set.
  5. Saving the models on Weights & Biases (keeping in mind the 200 GB limit on stored information per project) and loading them in the notebook for inference.

Moving forward with more tasks, this will be incredibly helpful for organizing large-scale experiments with more data and more hyperparameters to tune.

thefirebanks commented 3 years ago

2 final things:

  • Don't forget to add any new packages you used to requirements.txt please!
  • If we don't use spaCy by default, let's only initialize it in the function evaluate_using_sbert (as opposed to at the top of the loops.py file). That way we don't have to install it in the Colab notebook every time :)

rsmath commented 3 years ago

I will add the requirements to the txt file. I think you're right that the spaCy module is only used in the evaluate_using_sbert function. I will move the initialization there so we don't have to install it in the notebook every time.