NVIDIA-Merlin / HugeCTR

HugeCTR is a high-efficiency GPU framework designed for Click-Through-Rate (CTR) estimation training
Apache License 2.0

[Question] Confused about the additional element of the output of InteractionLayer #408

Closed heroes999 closed 11 months ago

heroes999 commented 11 months ago

According to the code and the user guide, the output of InteractionLayer has shape batch_size * {num_elems + (num_feas + 1) * (num_feas + 2) / 2 - (num_feas + 1) + 1}. Here num_elems is the dim of fc, and (num_feas + 1) * (num_feas + 2) / 2 - (num_feas + 1) is the number of cross products among all embs and fc. So what is the additional element? It doesn't seem to get assigned or calculated in interaction_layer.cu or interaction_layer_test.cpp: https://github.com/NVIDIA-Merlin/HugeCTR/blob/v4.2/test/utest/layers/interaction_layer_test.cpp#L175
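To make the arithmetic in the formula concrete, here is a minimal Python sketch that evaluates it term by term. The function name `interaction_output_width` and the example values (num_elems = 128, num_feas = 26) are illustrative assumptions, not taken from the HugeCTR code.

```python
def interaction_output_width(num_elems: int, num_feas: int, padding: int = 1) -> int:
    """Evaluate the output-width formula quoted in this issue.

    num_elems: width of the fc (dense) input
    num_feas:  number of embedding features
    padding:   the extra trailing element being asked about (1 per the user guide)
    """
    n = num_feas + 1                 # embs plus the fc vector
    pairwise = n * (n + 1) // 2 - n  # cross products, self-products excluded
    return num_elems + pairwise + padding


# Hypothetical example values, chosen only for illustration.
print(interaction_output_width(128, 26))             # 480 with the extra element
print(interaction_output_width(128, 26, padding=0))  # 479 without it
```

With these example values the width goes from 479 to 480, which is a multiple of 8; that is the kind of alignment an extra padding element can provide.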

Any hint will be appreciated, thanks.

minseokl commented 11 months ago

Hi @heroes999 are you mentioning this internal padding here? https://github.com/NVIDIA-Merlin/HugeCTR/blob/main/HugeCTR/src/layers/interaction_layer.cu#L637

heroes999 commented 11 months ago

I think it really does influence the output dim. Here is the link to the documentation, which mentions 1 additional float element: https://nvidia-merlin.github.io/HugeCTR/v4.2/api/hugectr_layer_book.html#interaction-layer

> Hi @heroes999 are you mentioning this internal padding here? https://github.com/NVIDIA-Merlin/HugeCTR/blob/main/HugeCTR/src/layers/interaction_layer.cu#L637

It looks like internal padding for __half, but float32 has one additional float output as well

minseokl commented 11 months ago

The padding was introduced to make the output easily vectorized and TensorCore-friendly for some common use cases back then. Yes, it was mainly for __half, but float has it as well so that the shape stays consistent whether or not you use mixed precision training.

heroes999 commented 11 months ago

Got it. So this additional padding is always 0, right? @minseokl

minseokl commented 11 months ago

Yes, it must be zero so that it doesn't influence the result.
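For readers who want to check the layout described in this thread, below is a rough NumPy reference sketch, not the HugeCTR kernel. It assumes the output is the fc vector, followed by the pairwise dot products among fc and all embs, followed by one zero padding element. The function name `interaction_reference` and the ordering of the pairwise terms are assumptions and may differ from interaction_layer.cu.

```python
import numpy as np


def interaction_reference(fc: np.ndarray, embs: np.ndarray) -> np.ndarray:
    """Reference sketch: [fc | pairwise dot products | one zero pad].

    fc:   (batch_size, num_elems)
    embs: (batch_size, num_feas, num_elems)
    """
    batch_size = fc.shape[0]
    # Stack fc with the embeddings: (batch_size, num_feas + 1, num_elems)
    feats = np.concatenate([fc[:, None, :], embs], axis=1)
    # Batched Gram matrix of all dot products: (batch_size, n, n)
    gram = feats @ feats.transpose(0, 2, 1)
    # Keep the strictly lower triangle, i.e. n * (n - 1) / 2 cross products
    n = feats.shape[1]
    rows, cols = np.tril_indices(n, k=-1)
    pairwise = gram[:, rows, cols]
    # The additional element: a single zero so it cannot influence the result
    pad = np.zeros((batch_size, 1), dtype=fc.dtype)
    return np.concatenate([fc, pairwise, pad], axis=1)


rng = np.random.default_rng(0)
fc = rng.standard_normal((4, 8)).astype(np.float32)
embs = rng.standard_normal((4, 3, 8)).astype(np.float32)
out = interaction_reference(fc, embs)
print(out.shape)      # (4, 15): 8 + 6 cross products + 1 pad
print(out[:, -1])     # the padding column, all zeros
```

Because the last column is zero, any weights in the next layer that multiply it contribute nothing to the forward result.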

heroes999 commented 11 months ago

Thanks. I'd like to close the issue