Modalities/modalities
A framework for training multimodal foundation models.
MIT License
Mamba TODOs #119
Closed (rrutmann closed 1 month ago)
rrutmann commented 3 months ago
[x] Define generate function for gpt2 and add generate to Interface
[x] self.prediction_key / self.sample_key for logits/input_ids in MambaLLM
[x] Fix config
[x] Keep track of where we need default values and where to put them
[x] Pydantic class for ssm_cfg
[x] Make sure that we do not have any redundant variables in different configs
[x] Fix cuda_env error when calling generate_text endpoint
[x] Fix tests after changing configs
[ ] How do we save inference_params and reload them?
[ ] Fix casting of incorrect data types (e.g. hidden states in MixerModel.forward()) during generation; current solution might not be the nicest
[ ] Add inference_params to generate function
[ ] Evaluate model with eval-harness on downstream tasks
[ ] Prepare presentation
[ ] Run larger experiment on TUD
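
On the open question about saving and reloading inference_params, one possible direction is a plain JSON round trip of the scalar bookkeeping fields. This is only a minimal sketch: `InferenceState`, `save_state`, `load_state` and the field names are hypothetical stand-ins, not the actual class or API used in this repo. Any cached tensors held by the real object are not covered here and would need separate serialization (for example with `torch.save`).

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class InferenceState:
    """Hypothetical stand-in for inference_params (scalar fields only)."""

    max_seqlen: int
    max_batch_size: int
    seqlen_offset: int = 0


def save_state(state: InferenceState, path: str) -> None:
    # Write the dataclass fields as a flat JSON object.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(asdict(state), f)


def load_state(path: str) -> InferenceState:
    # Rebuild the dataclass from the stored JSON object.
    with open(path, encoding="utf-8") as f:
        return InferenceState(**json.load(f))
```

Usage would be `save_state(state, "state.json")` after a generation step and `state = load_state("state.json")` before resuming. Whether resuming from a stored offset is valid without the matching cached states is still the open part of the question.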