jxmorris12 / vec2text

utilities for decoding deep representations (like sentence embeddings) back to text

Access to GTR model #7

Closed phubbard closed 9 months ago

phubbard commented 11 months ago

In the README and in the code, I see that the GTR inverter is available by request. I didn't find another way to contact the authors, so could you please let me know how I could get the second inverter? Thank you.

kyriemao commented 10 months ago

Hi Morris, thanks for your great work! I have the same request for the GTR inverter checkpoint. Could you please provide it?

jxmorris12 commented 10 months ago

Just an update on this: the GTR model we trained for the paper is in a more toy setting and has maximum length set to 32. It's trained on Natural Questions, which has mostly text that's over 32 tokens, so it predicts that almost everything is 32 tokens.

I'm currently trying to train a longer-sequence GTR model on MSMARCO to release. If you have compute for this, please contact me! Otherwise, I can train this system (the inversion model and then the corrector) on one of our university GPUs, but I expect it'll take at least a month before I have a model to share.

phubbard commented 10 months ago

Got it. Don’t bother if I’m the only one who’s asked.

jxmorris12 commented 10 months ago

Hi @phubbard -- I trained a single-step inversion model for 128-length GTR embeddings here: https://huggingface.co/jxm/gtr__msmarco__128

We now need compute to use this model to train the correction model. Then I can add the GTR inverter to vec2text so it's easy to use through the API.

As an aside, is anyone looking for the 32-length model we used in the paper? That one would need to be trained separately (although it converges much more quickly).

Finally, here's my relevant comment on the similar issue about SBERT models: https://github.com/jxmorris12/vec2text/issues/5#issuecomment-1787173530

kyriemao commented 10 months ago

Hi Morris, I need the 32-token GTR model. Could you please provide it?

jxmorris12 commented 10 months ago

@kyriemao yep -- that one's much less computationally expensive. I'll train a final version and upload it ASAP. The sequence-length-128 model will take more effort and many more FLOPs...

jxmorris12 commented 10 months ago

Hi @kyriemao -- I uploaded the sequence-length 32 GTR inverter; see "Evaluate the model from the paper" in the README.

kyriemao commented 10 months ago

Thank you very much!

jkurlandski01 commented 9 months ago

When I try to run the code in the "Evaluate the model from the paper" section in the README, I get an error:

AttributeError                            Traceback (most recent call last)
<ipython-input-18-3e478b6f2c51> in <cell line: 7>()
      5 
      6 # experiment, trainer = analyze_utils.load_experiment_and_trainer_from_pretrained(
----> 7 experiment, trainer = a_utils.load_experiment_and_trainer_from_pretrained(    
      8      "jxm/gtr__nq__32__correct"
      9 )

AttributeError: module 'vec2text.analyze_utils' has no attribute 'load_experiment_and_trainer_from_pretrained'

I'm not sure if this is relevant, but when I call dir(analyze_utils), the result does not include any of the functions in analyze_utils.py that occur after the definition of load_results_from_folder().

['DataArguments',
 'HfArgumentParser',
 'ModelArguments',
 'Optional',
 'TrainingArguments',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'device',
 'experiments',
 'get_last_checkpoint',
 'glob',
 'json',
 'load_experiment_and_trainer',
 'load_results_from_folder',
 'load_trainer',
 'os',
 'pd',
 'shlex',
 'torch',
 'transformers']
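A generic way to diagnose this kind of mismatch between an installed package and its current source (not vec2text-specific; the stdlib json module stands in for vec2text.analyze_utils here, and the attribute names are placeholders) is to check the imported module's attributes and file location directly:

```python
import importlib

# Import the module by name and check whether it exposes the attribute we expect.
# In a real session, substitute "vec2text.analyze_utils" for "json" and
# "load_experiment_and_trainer_from_pretrained" for "loads".
mod = importlib.import_module("json")
print(hasattr(mod, "loads"))          # True: the function is present
print("missing_helper" in dir(mod))   # False: a stale install looks like this

# mod.__file__ shows which copy of the module Python actually imported,
# which is useful when a pip install shadows a local git checkout.
print(mod.__file__)
```

If `__file__` points somewhere other than your up-to-date checkout, Python is importing an older installed copy.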

jxmorris12 commented 9 months ago

@jkurlandski01 thanks for reporting. I'm guessing you're behind main? Can you try doing a git pull (or a fresh pip install) and let me know if that code still doesn't work?
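For reference, the two update paths look like this (assuming a git checkout of the repo for the first, and the vec2text package name on PyPI for the second):

```shell
# Inside a git clone of vec2text: pull the latest main and reinstall in-place
git pull origin main
pip install -e .

# Or, for a plain pip install, upgrade to the latest released version
pip install --upgrade vec2text
```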

jkurlandski01 commented 9 months ago

Thanks! I no longer get that error.