McGill-NLP / polytropon

What does the example at the end of the code do? #8

Closed · puraminy closed this issue 2 years ago

puraminy commented 2 years ago

Hi, I read your paper and checked your code, but I don't know what the following code does. Could you please provide some explanation? What is the expected output? I decoded the output and it is the same for all skilled_variants.

if __name__ == "__main__":
    import logging

    import torch
    from transformers import T5Tokenizer, T5ForConditionalGeneration

    # SkilledMixin is defined earlier in this module.
    logger = logging.getLogger(__name__)

    tokenizer = T5Tokenizer.from_pretrained("/home/pouramini/pret/t5-base")
    model = T5ForConditionalGeneration.from_pretrained("/home/pouramini/pret/t5-base")
    inputs = ["Tell me, oh Muse, of that ingenious hero who travelled far and wide after he had sacked the famous town of Troy.",
        "Many cities did he visit, and many were the nations with whose manners and customs he was acquainted."]
    inputs = tokenizer(inputs, return_tensors="pt", padding=True)
    task_ids = torch.LongTensor([0, 1])  # one task index per input example

    # Instantiate every routing variant, then run one forward pass and one
    # generation step with the (still untrained) adapters.
    for skilled_variant in ["learned", "hyper", "sparse", "shared", "private"]:
        skilled_model = SkilledMixin(model, n_tasks=2, n_skills=2, skilled_variant=skilled_variant)
        logger.warning("forward %s: %s", skilled_variant, skilled_model.forward(task_ids, labels=inputs["input_ids"], add_prior=True, **inputs))
        hyps = skilled_model.generate(task_ids, **inputs)
        hyps = tokenizer.batch_decode(hyps, skip_special_tokens=True)
        logger.warning("generate %s: %s", skilled_variant, hyps)

Moreover, I expected input-output (label) pairs... How can I use the model for a multi-task supervised input-output problem?
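For concreteness, this is roughly how I would expect supervised pairs to be fed in. This is only a minimal sketch: the example strings are invented, and the -100 label masking follows the usual Hugging Face seq2seq convention, not anything from this repo.

# Hypothetical parallel data: one (input, target) string pair per task.
src = ["translate English to German: The house is small.",
       "summarize: The quick brown fox jumps over the lazy dog."]
tgt = ["Das Haus ist klein.",
       "A fox jumps over a dog."]

batch = tokenizer(src, return_tensors="pt", padding=True)
labels = tokenizer(tgt, return_tensors="pt", padding=True).input_ids
labels[labels == tokenizer.pad_token_id] = -100  # exclude padding from the loss

task_ids = torch.LongTensor([0, 1])  # task index for each example
outputs = skilled_model.forward(task_ids, labels=labels, **batch)

That is, the labels would come from the target side rather than being copied from input_ids as in the test above.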

ducdauge commented 2 years ago

This is just a test to check that all model variants are instantiated without errors. They are expected to give the same output before training, as the LoRAs are initialised in such a way that they do not alter the pre-trained model's behaviour.
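To illustrate the initialisation point with a self-contained toy example (generic LoRA, not this repo's implementation): the low-rank update is B @ A, and B starts at zero, so the layer computes exactly the base layer's output before any training.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A base linear layer plus a low-rank update."""
    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        # A is random, B is zero, so the update B @ A is zero at init.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x):
        return self.base(x) + x @ self.A.t() @ self.B.t()

base = nn.Linear(16, 16)
lora = LoRALinear(base)
x = torch.randn(2, 16)
assert torch.allclose(base(x), lora(x))  # identical before training

The skilled variants only differ in how such (initially zero) updates are combined across tasks, so they all reproduce the base T5 outputs before training.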

puraminy commented 2 years ago

Well, but do you have code from your experiments on supervised tasks? I wonder how the models are applied there. For example, I tried to train a model with your loss function, but the loss didn't decrease during training and the output was also erroneous.
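Concretely, the kind of training loop I have in mind is sketched below, reusing batch, labels, and task_ids from the sketch above. The optimiser, learning rate, and step count are placeholders, and I am assuming SkilledMixin behaves like a regular nn.Module and returns a Hugging Face-style output with a .loss field.

import torch

# Update only the parameters that require gradients (the adapters).
trainable = [p for p in skilled_model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=3e-4)

skilled_model.train()
for step in range(1000):  # number of steps is arbitrary here
    outputs = skilled_model.forward(task_ids, labels=labels, **batch)
    loss = outputs.loss  # seq2seq cross-entropy from the wrapped T5
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()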