GT4SD / multitask_text_and_chemistry_t5

Code for "Unifying Molecular and Textual Representations via Multi-task Language Modelling" @ ICML 2023
https://huggingface.co/spaces/GT4SD/multitask-text-and-chemistry-t5
MIT License

Problem of the weights of text-and-chemistry-t5-base-augm #3

Closed · drugilsberg closed this 7 months ago

drugilsberg commented 1 year ago

Issue originally opened here: https://github.com/GT4SD/gt4sd-core/issues/229

From Jenonone:

I cannot reproduce the results of the paper with the weights from the Hugging Face repo GT4SD/multitask-text-and-chemistry-t5-base-augm: the accuracy is supposed to be 0.322, but I measure only 0.196.

From medicine-wave:

Same question: I cannot reproduce the paragraph2action results of the paper with the weights from the Hugging Face repo https://huggingface.co/GT4SD/multitask-text-and-chemistry-t5-small-standard. The BLEU score is supposed to be 0.929, but I measure only 0.659. Does the model need to be further fine-tuned?

christofid commented 1 year ago

Hi, I checked the original results and verified that they are indeed correct. I am now investigating whether the issue is limited to the checkpoint uploaded to HF. I will let you know asap.

In the meantime some comments/questions:

  1. I would suggest, as a postprocessing step, replacing any `<unk>` token in the output with `\\` (something like `output = output.replace("<unk>", "\\\\").strip()` is enough; see the sketch after this list). We have noticed that the model sometimes generates `<unk>` tokens instead of `\\` in this specific scenario, and this simple postprocessing fixes it without creating problems in other cases. Note that this should only help the text2molecule case mentioned in GT4SD/gt4sd-core#229.

  2. Do you use beam search for the generation?
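
For concreteness, a minimal sketch of point 1 applied to a raw decoded output (`generated` and `tokenizer` are placeholder names for whatever your generation loop produces):

```python
# Decode, strip special tokens, then map <unk> back to the backslash escape
# the model should have produced (relevant to text2molecule/SMILES outputs only).
output = tokenizer.decode(generated[0].cpu())
output = output.split(tokenizer.eos_token)[0].replace(tokenizer.pad_token, "")
output = output.replace("<unk>", "\\\\").strip()
```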

zw-SIMM commented 1 year ago

> Hi, I checked the original results and verified that they are indeed correct. I am now investigating whether the issue is limited to the checkpoint uploaded to HF. I will let you know asap.
>
> In the meantime, some comments/questions:
>
> 1. I would suggest, as a postprocessing step, replacing any `<unk>` token in the output with `\\` (something like `output = output.replace("<unk>", "\\\\").strip()` is enough). We have noticed that the model sometimes generates `<unk>` tokens instead of `\\` in this specific scenario, and this simple postprocessing fixes it without creating problems in other cases. Note that this should only help the text2molecule case mentioned in GT4SD/gt4sd-core#229.
> 2. Do you use beam search for the generation?

Thanks for your reply!

  1. I tested the paragraph2action task, so the `<unk>` fix may not be related to my case (it targets text2molecule).
  2. I followed the demo code to generate the output with num_beams=10, but only one output action is generated per paragraph. By the way, is the result related to the number of beams? Aren't we comparing the performance of the top-1 output action?
```python
import pandas as pd
import torch
from tqdm import tqdm
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Assumed setup (not shown in the original post): checkpoint from the HF repo
# discussed above; max_length is illustrative, num_beams follows the demo.
model_name = "GT4SD/multitask-text-and-chemistry-t5-small-standard"
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name).to(device)
max_length = 512
num_beams = 10

test_df = pd.read_csv("hand_annotated_test.csv")
test_paragraphs = list(test_df["paragraphs"])
test_actions_preds = []

for instance in tqdm(test_paragraphs):
    # Prompt template for the paragraph2action task.
    input_text = f"Which actions are described in the following paragraph: {instance}"

    text = tokenizer(input_text, return_tensors="pt").to(device)
    output = model.generate(input_ids=text["input_ids"], max_length=max_length, num_beams=num_beams)
    output = tokenizer.decode(output[0].cpu())

    # Strip special tokens: cut at EOS, drop padding, trim whitespace.
    output = output.split(tokenizer.eos_token)[0]
    output = output.replace(tokenizer.pad_token, "")
    output = output.strip()

    test_actions_preds.append(output)
```
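
For reference, one way to score these predictions is corpus BLEU, e.g. via nltk; a minimal sketch, assuming the CSV also carries the ground-truth sequences in an `actions` column (the paper's exact tokenization and metric setup may differ):

```python
from nltk.translate.bleu_score import corpus_bleu

# Whitespace tokenization is an assumption; use the paper's tokenizer if known.
references = [[ref.split()] for ref in test_df["actions"]]
hypotheses = [pred.split() for pred in test_actions_preds]
print(f"BLEU: {corpus_bleu(references, hypotheses):.3f}")
```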

Looking forward to the corrected checkpoint, thanks!

christofid commented 12 months ago

I updated the demo code to include the needed postprocessing step for the description-to-SMILES case. Thanks for pointing this out! The number of beams does affect the generated results: the more beams, the more token combinations are examined during generation. We compare top-1, but the best output can vary with a different number of beams. I will let you know about the checkpoint asap.
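
To illustrate the point about beams, a small sketch (hypothetical, reusing the variables from the snippet above): the top-1 sequence returned by beam search can change as the beam widens, because a wider beam keeps candidate prefixes alive that a narrower search would prune.

```python
# The highest-scoring finished sequence may differ across beam widths.
for nb in (1, 5, 10):
    out = model.generate(input_ids=text["input_ids"], max_length=max_length, num_beams=nb)
    print(nb, tokenizer.decode(out[0], skip_special_tokens=True))
```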

jannisborn commented 7 months ago

Closing due to inactivity