Open URRealHero opened 4 hours ago
@URRealHero
Hi, thank you for your interest in our work!
For your first question, could you please remind me where hidden_dim = 4096 is set? Sorry, I don't recall it; I thought the dimension was 3072. (I just double-checked: the output dimension is 3072.)
Regarding your second question: yes, the default padding side for phi-3.5-v is left. However, since this is an embedding model and won't be generating new tokens, the padding side should not affect the output.
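To illustrate the padding point: the padding side stops mattering once the pooling step uses the attention mask to find the real tokens. Below is a minimal pure-Python sketch, assuming last-token pooling (the thread does not say which pooling the model actually uses) and assuming the encoder produces the same states for the real tokens under either padding side. The helper names `last_token_index` and `pool_last_token` are hypothetical and not taken from the repository.

```python
def last_token_index(attention_mask):
    """Index of the last real token (mask value 1) in a single sequence."""
    for i in range(len(attention_mask) - 1, -1, -1):
        if attention_mask[i] == 1:
            return i
    raise ValueError("attention_mask marks no real tokens")


def pool_last_token(hidden_states, attention_mask):
    """Mask-aware last-token pooling: never returns a padding position."""
    return hidden_states[last_token_index(attention_mask)]


# Toy per-token "hidden states", labelled so the selected one is easy to see.
real = ["h_a", "h_b", "h_c"]

right_states, right_mask = real + ["pad", "pad"], [1, 1, 1, 0, 0]
left_states, left_mask = ["pad", "pad"] + real, [0, 0, 1, 1, 1]

pooled_right = pool_last_token(right_states, right_mask)
pooled_left = pool_last_token(left_states, left_mask)
naive_right = right_states[-1]  # ignores the mask and lands on padding

print(pooled_right, pooled_left, naive_right)
```

Both mask-aware calls select `h_c`, while the naive `[-1]` index picks a padding position under right padding. That is the case where the padding side would matter, so it is worth checking that the pooling code is mask-aware.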
Thanks a lot! I reviewed your code and found that demo.py and train.py call the classmethods load and build directly, so the output dimension is the same as Phi3V's default config value, 3072. Sorry about that; I asked because I had previously seen:
class MMEBModel(nn.Module):
    TRANSFORMER_CLS = AutoModelForCausalLM

    def __init__(self,
                 encoder: PreTrainedModel,
                 pooling: str = 'cls',
                 normalize: bool = False,
                 temperature: float = 1.0,
                 ):
        super().__init__()
        self.config = encoder.config
        self.config.hidden_size = 4096
        self.hidden_size = 4096
        self.encoder = encoder
        self.pooling = pooling
        self.normalize = normalize
        self.temperature = temperature
        self.cross_entropy = nn.CrossEntropyLoss(reduction='mean')
        self.is_ddp = dist.is_initialized()
        if self.is_ddp:
            self.process_rank = dist.get_rank()
            self.world_size = dist.get_world_size()
which is not used during inference or training.
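As a side note on the `hidden_size = 4096` lines quoted above: overwriting a config attribute on an encoder that is already built would not, by itself, resize its weights, so the embedding size would still follow the weights. The toy sketch below shows that general behaviour only. `ToyConfig` and `ToyEncoder` are hypothetical stand-ins, not classes from the repository or from transformers, and the sketch makes no claim about how the repository's code paths behave.

```python
class ToyConfig:
    """Hypothetical stand-in for a model config."""

    def __init__(self, hidden_size):
        self.hidden_size = hidden_size


class ToyEncoder:
    """Hypothetical stand-in for a pretrained encoder.

    Weights are sized once, at construction, from the config.
    """

    def __init__(self, config, in_features=4):
        self.config = config
        # One weight row per output feature.
        self.weight = [[0.0] * in_features for _ in range(config.hidden_size)]

    def embed(self, features):
        # Output length equals the number of weight rows, not config.hidden_size.
        return [sum(w * x for w, x in zip(row, features)) for row in self.weight]


encoder = ToyEncoder(ToyConfig(hidden_size=3072))
encoder.config.hidden_size = 4096  # mutate the config after the weights exist

embedding = encoder.embed([1.0, 2.0, 3.0, 4.0])
print(len(embedding), encoder.config.hidden_size)
```

The embedding keeps length 3072 even though the config now reports 4096. In practice, the most reliable way to confirm the dimension is to inspect the shape of an actual output, as was done earlier in the thread.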
For the second question, thanks a lot; I think the result would be similar with either padding side.
Thanks for your reply.
Q1: I'm quite new to this field. I see you set hidden_dim to 4096, which is different from Phi3.5V's original 3072, without training again. Won't this modification degrade performance?
Q2: Also, in your model's build method, padding_side is set to 'right', but in the modeling_Phi3V code I found that it uses left padding:
What is the meaning of right padding, and why does your setting work for generation?
Looking forward to your reply.