kuleshov-group / caduceus

Bi-Directional Equivariant Long-Range DNA Sequence Modeling
Apache License 2.0

Customize dimension and layers #35

Closed zyj1729 closed 1 month ago

zyj1729 commented 1 month ago

Hi there, thanks for creating this great model! I'm trying to fine-tune the Caduceus architecture on my own task. I noticed that the results with pre-trained and randomly initialized models are not significantly different, so I want to try fine-tuning a Caduceus model with a larger dimension or more layers. I tried doubling the parameters but got some internal errors. Before tweaking the parameters further, I want to ask whether there is an easy way to initialize a larger Caduceus model, or simply to initialize a bi-directional Mamba architecture with customized parameters. It would be much appreciated if you could provide an example. Thanks in advance!

yair-schiff commented 1 month ago

You could perhaps override this method in Caduceus?

To initialize from a pre-trained model, the best thing to do is to pass a checkpoint / weights file that contains the state_dict with the model parameters you want to load.

zyj1729 commented 1 month ago

I'm talking more about changing the number of bidirectional Mamba layers (currently the largest is 16) or increasing the dimensions. The weights can just be randomly initialized, since I will train the model further. I tried increasing the dimensions of some of the Mamba layers. Here's the original model:

[Screenshot 2024-05-14 at 11.10.05 AM: image not included]

Here's what I changed it to:

[Screenshot 2024-05-14 at 11.11.06 AM: image not included]

But I got the following errors:

[Screenshot 2024-05-14 at 11.12.16 AM: image not included]

So I'm just wondering whether there is a way to increase the size of the Caduceus model (number of parameters and number of layers) without causing errors.

yair-schiff commented 1 month ago

You can change these in the model config: https://github.com/kuleshov-group/caduceus/blob/main/configs/model/caduceus.yaml.

It sounds like the parameters you want to change are n_layer and d_model in that file.
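
For anyone who prefers to apply these two overrides in code rather than by editing the YAML, one possible approach is to merge them into a dictionary of base settings before building the config. This is only a sketch: `BASE_MODEL_KWARGS` and `scaled_model_kwargs` are hypothetical names, and the base values below are illustrative placeholders, not the repository's defaults.

```python
from copy import deepcopy

# Illustrative placeholder values only (an assumption); copy the real
# settings from configs/model/caduceus.yaml in the repository.
BASE_MODEL_KWARGS = {"d_model": 128, "n_layer": 4}


def scaled_model_kwargs(base, d_model=None, n_layer=None):
    """Return a copy of `base` with d_model and/or n_layer overridden."""
    kwargs = deepcopy(base)  # leave the caller's dictionary untouched
    for name, value in (("d_model", d_model), ("n_layer", n_layer)):
        if value is None:
            continue  # no override requested: keep the base value
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
        kwargs[name] = value
    return kwargs


larger = scaled_model_kwargs(BASE_MODEL_KWARGS, d_model=256, n_layer=8)
print(larger)
```

The resulting dictionary could then be unpacked into the config constructor, assuming it accepts `d_model` and `n_layer` as keyword arguments. Validating the values up front will not catch every shape mismatch inside the model, but it rules out obviously invalid sizes before any weights are allocated.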

zyj1729 commented 1 month ago

I think it's exactly what I want. It would be really helpful if you could please provide me with the minimal code to initialize a Caduceus model from scratch with specified parameters. Thanks!

yair-schiff commented 1 month ago

from caduceus.configuration_caduceus import CaduceusConfig
from caduceus.modeling_caduceus import CaduceusForMaskedLM

config = CaduceusConfig(
    d_model=<TODO: Set your desired model dim>,
    n_layer=<TODO: Set your desired num_layers>,
    ...  # TODO: Set the remaining config params, see for example: https://github.com/kuleshov-group/caduceus/blob/main/configs/model/caduceus.yaml
)

model = CaduceusForMaskedLM(config)
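
After building a model this way, it may be worth confirming that the larger `d_model` and `n_layer` actually increased its size. The helper below is a minimal sketch, not part of the Caduceus codebase: `count_trainable_parameters` is a hypothetical name, and it assumes the model follows the standard `torch.nn.Module` interface.

```python
def count_trainable_parameters(model):
    """Sum the element counts of all parameters that require gradients.

    Assumes `model` behaves like a torch.nn.Module: `parameters()` yields
    tensors that expose `numel()` and a `requires_grad` attribute.
    """
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


# Example usage with the model constructed above:
# print(f"{count_trainable_parameters(model):,} trainable parameters")
```

Comparing this number between the original and the enlarged configuration gives a quick check that the overrides took effect before starting a long training run.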