bigscience-workshop / t-zero

Reproduce results and replicate training of T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization)
Apache License 2.0

Support Prefix & prompt upgrades #36

Closed Muennighoff closed 2 years ago

Muennighoff commented 2 years ago

Running

/gpfsscratch/rech/six/commun/experiments/muennighoff/bloomckpt/6b3t0/tr13f-6b3-ml-t0-lmtoks168b-t0toks13b-prefix on super_glue,copa,None,"best_option" resulted in:

With prefix: Result: {'accuracy': 0.54}
Without prefix: Result: {'accuracy': 0.53}
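For context, T0-style zero-shot evaluation uses rank classification: each answer choice is scored by its log-likelihood under the model and the highest-scoring choice is taken as the prediction. Below is a minimal sketch of that idea for a decoder-only checkpoint, assuming it loads with the standard transformers API; the checkpoint path and the COPA-style prompt are placeholders, not the exact template used above.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical path to a converted checkpoint; substitute the real one.
model_name_or_path = "path/to/converted-bloom-checkpoint"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForCausalLM.from_pretrained(model_name_or_path).eval()

def choice_logprob(prompt: str, choice: str) -> float:
    """Sum of log-probabilities of the choice tokens, conditioned on the prompt."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    choice_ids = tokenizer(choice, return_tensors="pt", add_special_tokens=False).input_ids
    input_ids = torch.cat([prompt_ids, choice_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Logits at position i predict token i+1, so this slice covers the choice tokens.
    logprobs = torch.log_softmax(logits[0, prompt_ids.shape[1] - 1 : -1], dim=-1)
    return logprobs.gather(1, choice_ids[0].unsqueeze(-1)).sum().item()

# COPA-style rank classification: pick the option with the higher log-likelihood.
premise = "The man broke his toe. What was the cause of this?"
options = ["He got a hole in his sock.", "He dropped a hammer on his foot."]
prediction = max(options, key=lambda o: choice_logprob(premise, o))
print(prediction)
```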

Muennighoff commented 2 years ago

cc @haileyschoelkopf @lintangsutawika it would be great if you could take a look at the prefixlm config. It's using @haileyschoelkopf's transformers branch by directly feeding in the causal mask.
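For reference, the prefix-LM setup lets the prompt (input) tokens attend to each other bidirectionally while the target tokens stay causal. A minimal sketch of how such a mask can be constructed, assuming the modeling code accepts an explicit attention mask of this shape; how the mask is actually passed in depends on @haileyschoelkopf's branch.

```python
import torch

def build_prefix_lm_mask(seq_len: int, prefix_len: int) -> torch.Tensor:
    """Boolean [seq_len, seq_len] mask where True means attention is allowed.

    The first `prefix_len` positions (the prompt) are visible to every position,
    so the prompt attends to itself bidirectionally; positions after the prefix
    (the targets) remain causal.
    """
    allowed = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))  # causal base
    allowed[:, :prefix_len] = True  # the prefix is fully visible
    return allowed

# Example: 3 prompt tokens followed by 5 target tokens.
mask = build_prefix_lm_mask(seq_len=8, prefix_len=3)
# Feeding this mask to the model is branch-specific, e.g. something along the lines of
# model(input_ids, attention_mask=mask.unsqueeze(0)) with the patched modeling code.
```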

lintangsutawika commented 2 years ago

@Muennighoff What checkpoint are you using? From a quick look at this PR, my understanding is that this is a different model from BLOOM?

Muennighoff commented 2 years ago

This is using the BLOOM checkpoint after conversion to transformers. The slurm script is here: https://github.com/bigscience-workshop/bigscience/blob/d98e577e5740304e200aedb74939aff900684d83/evaluation/results/tr13/tzeroeval/evaluate_t0.slurm

stephenbach commented 2 years ago

This repo is for reproducing the T0 paper. If changes are needed for BLOOM evaluation, perhaps they would be better put somewhere else?

Edit: For example, we already decided against unpinning the promptsource version: https://github.com/bigscience-workshop/t-zero/commit/fd057a2fc3f3161491437c31aae91a0dfc93ebf0

Muennighoff commented 2 years ago

Agreed, good point. We could make it a fork of t-zero on the bigscience org, cc @thomasw21?

thomasw21 commented 2 years ago

I don't think we need to fork, especially since it'll never get merged. Essentially we can write our own evaluation script in a new repo.

VictorSanh commented 2 years ago

Gonna close this PR since I understand you are working on another repo.