hanfu opened this issue 4 years ago (status: Open)
The embeddings are initialized from a normal distribution with mean 0 and standard deviation stdv. Why is stdv set to the square root of the inverse of the number of features in https://github.com/motefly/DeepGBM/blob/8a38af4d90e680c841edeb7be487a3c110e23d3b/models/deepfm.py#L77?
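One possible reading, offered as an assumption and not confirmed by the repository authors: if each of the n feature embeddings is drawn independently with variance 1/n, then their element-wise sum has variance n * (1/n) = 1, so the scale of the summed signal does not grow with the number of features. The sketch below checks this numerically using only the Python standard library. The names `init_embeddings` and `summed_variance` are hypothetical and do not come from DeepGBM.

```python
import math
import random
import statistics


def init_embeddings(num_features, dim, rng):
    """Draw one embedding vector per feature from N(0, stdv).

    stdv = sqrt(1 / num_features), mirroring the scheme described in the
    question (an assumption about what the linked line does).
    """
    stdv = math.sqrt(1.0 / num_features)
    return [[rng.gauss(0.0, stdv) for _ in range(dim)]
            for _ in range(num_features)]


def summed_variance(num_features, dim=2000, seed=0):
    """Empirical variance of the element-wise sum over all feature embeddings."""
    rng = random.Random(seed)
    emb = init_embeddings(num_features, dim, rng)
    # Sum across features for each embedding dimension.
    summed = [sum(col) for col in zip(*emb)]
    return statistics.pvariance(summed)


# The variance of the sum should stay near 1 regardless of the feature count.
for n in (4, 32, 256):
    print(n, round(summed_variance(n), 2))
```

With a fixed stdv instead, the variance of the sum would scale linearly with the number of features, which is one plausible motivation for the 1/n scaling.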