Closed Mayukhdeb closed 2 years ago
Why not just add these functions into MultimodalLM?
This is a good idea, but the forward function in MultimodalLM is not really straightforward. So I thought of creating a separate wrapper that simplifies things for inference purposes (given that a large fraction of its users would purely do inference).
Of course, one thing that I missed is the internal generate() function :sweat_smile: -- will make the necessary changes to use that instead.
I don't think it's a good idea to call model.eval() at init; it might cause some problems when training.
You're right, will put that in as a default arg on `__init__` instead, as `eval = True`. If someone feels like training it, they can just set `eval = False` on init.
You can wrap this whole function in `torch.no_grad()` instead of just doing no-grad over the forward pass.
Yes, rookie mistake on my side :slightly_smiling_face:
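To illustrate the suggestion above, here is a minimal sketch (the class and method names are illustrative, not the actual MAGMA code): applying `@torch.no_grad()` as a decorator disables gradient tracking for the entire method body, not just the forward call inside it.

```python
import torch
import torch.nn as nn

class InferenceWrapper:
    """Illustrative wrapper only, not the real MAGMA implementation."""

    def __init__(self, model: nn.Module):
        self.model = model

    @torch.no_grad()  # no autograd graph is built anywhere inside run()
    def run(self, x: torch.Tensor) -> torch.Tensor:
        # preprocessing, the forward pass, and any decoding all run grad-free
        return self.model(x)

model = nn.Linear(4, 2)
wrapper = InferenceWrapper(model)
out = wrapper.run(torch.randn(1, 4))
print(out.requires_grad)  # False: no gradients were tracked
```

`torch.no_grad()` works both as a context manager and as a decorator, so wrapping the whole method this way is equivalent to enclosing its entire body in a `with torch.no_grad():` block.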
So the to-do is as follows:
- [x] remove `nn.Module` wrapping
- [x] use the internal `generate()` function
- [x] put an `eval` arg on `__init__`
- [x] wrap `run()` in `@torch.no_grad()`
- [x] use `if not exists(self.checkpoint_path):`
Feel free to let me know if you want any changes to the to-do list :eyes:
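Taken together, the to-do items above might look roughly like this. This is a sketch under assumptions: `exists` is a stand-in for the repo's helper, and `download_checkpoint` is a hypothetical placeholder, not the real MAGMA code.

```python
import os

def exists(path):
    # stand-in for the repo's own exists() helper (assumption)
    return path is not None and os.path.isfile(path)

class Magma:
    """Illustrative sketch of the wrapper, not the real implementation."""

    def __init__(self, checkpoint_path, config_path=None, eval=True):
        self.checkpoint_path = checkpoint_path
        self.config_path = config_path

        if not exists(self.checkpoint_path):
            # hypothetical: fetch the checkpoint to this path
            self.download_checkpoint()

        # eval defaults to True for inference users; pass eval=False to train
        self.training = not eval

    def download_checkpoint(self):
        # placeholder: the real code would download the weights here
        pass
```

In the actual module, setting `eval = True` would also put the underlying `nn.Module` into eval mode via `self.eval()`.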
Contains the following changes:

`model`, `tokenizer`, and `transforms` are now contained under a unified wrapper, `Magma()`, which can be used as shown below:

```python
magma = Magma(
    checkpoint_path = 'mp_rank_00_model_states.pt',  ## downloads automatically if not present in this path
    config_path = 'configs/MAGMA_v1.yml',
)
magma.to('cuda:0')
```
`Magma()` supports both low-level and high-level inference.

High-level inference:

```python
completion = magma.generate(inputs = inputs, num_tokens = 4, topk = 1)
```

completion: "A cabin on a lake"