Hi everyone,

I am currently working with the RepViT-M2.3 model and am trying to pin down the correct configuration for its depth. Specifically, I want to verify that my implementation of the Multi_Level_Extract class aligns with the model specification. According to my understanding, the depth of RepViT-M2.3 should be 34 layers. Here is the configuration part of my SelfAttention class:
import torch.nn as nn


class SelfAttention(nn.Module):
    def __init__(self, model_type="m2_3", pretrained=True):
        super(SelfAttention, self).__init__()
        # Per-variant settings; "m2_3" corresponds to RepViT-M2.3.
        model_config = {
            "m2_3": {
                "d_model": 640,
                "depth": 34,
                "heads": 16,
                "mlp_dim": 2560,
                "model_path": "./model/repvit_m2_3_distill_450e.pth",
                "out_channels": [64, 128, 256, 640]
            },
            # Other configurations...
        }
        # Rest of the class implementation...
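For reference, here is a minimal sketch of what I mean by Multi_Level_Extract. The stage boundary indices (2, 6, 20, 33) are placeholders I chose myself, and the assumption that the backbone exposes its blocks as an iterable features module is exactly the kind of thing I want to validate:

import torch.nn as nn


class Multi_Level_Extract(nn.Module):
    """Sketch: collect intermediate feature maps at assumed stage boundaries."""

    def __init__(self, backbone, stage_ids=(2, 6, 20, 33)):
        super().__init__()
        self.backbone = backbone         # e.g. a RepViT-M2.3 feature extractor
        self.stage_ids = set(stage_ids)  # placeholder indices, not verified

    def forward(self, x):
        features = []
        # Assumes the backbone registers its blocks as an iterable module
        # named `features`, as the official RepViT code appears to do.
        for idx, block in enumerate(self.backbone.features):
            x = block(x)
            if idx in self.stage_ids:
                features.append(x)
        # Intended to line up with out_channels = [64, 128, 256, 640] above.
        return features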
My questions are:
1. Is a depth of 34 layers correct for the RepViT-M2.3 model? I have seen different sources quote different depths, and I want to make sure my configuration is accurate. (A sanity check I tried is sketched right after these questions.)
2. Does my implementation of the Multi_Level_Extract class (sketched above) align with the RepViT-M2.3 model's architecture? Is there anything I need to change to fit it better?
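Regarding the first question, one sanity check I tried is to load the checkpoint and count the distinct block indices in its state dict. This is only a sketch under assumptions: it presumes the checkpoint at ./model/repvit_m2_3_distill_450e.pth follows the official RepViT naming, where backbone blocks sit under a features.<i>. prefix, and that a distilled checkpoint may wrap its weights under a "model" key; adjust the regex if your layout differs:

import re
import torch

# Path from my config above; adjust if your checkpoint lives elsewhere.
CKPT_PATH = "./model/repvit_m2_3_distill_450e.pth"

state = torch.load(CKPT_PATH, map_location="cpu")
# Some distilled checkpoints wrap the weights under a "model" key.
if isinstance(state, dict) and "model" in state:
    state = state["model"]

# Count distinct top-level block indices, assuming blocks are registered
# under a "features.<i>." prefix as in the official RepViT repository.
block_ids = set()
for key in state:
    m = re.match(r"features\.(\d+)\.", key)
    if m:
        block_ids.add(int(m.group(1)))

print(f"Distinct blocks under 'features.': {len(block_ids)}")

If the printed count matches the depth value in my config, I would take that as confirmation; if not, I am clearly misreading the architecture.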