linjc16 closed this issue 2 years ago
I found the previous answer in [Issue 18](https://github.com/microsoft/Graphormer/issues/18), but it still confuses me. Could you please explain it in detail?
If you have k categorical features, each with up to 512 classes, a natural way to embed them is to use k embedding tables, each with a vocabulary size of 512. Equivalently, you can use a single embedding table with a vocabulary size of k*512. To convert the k embeddings into a single embedding, you need to add an index offset to each feature value based on which feature it originally belongs to, so that feature i occupies its own index range starting at i*512.
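To make the equivalence concrete, here is a minimal sketch in plain Python. It is not the actual `convert_to_single_emb` code from the repository; the helper name `to_single_index`, the tiny sizes `K` and `VOCAB`, and the toy one-number "embeddings" are all assumptions chosen for illustration.

```python
K, VOCAB = 3, 4  # hypothetical small sizes; the thread above uses up to 512 classes

# K separate tables: tables[i][c] is the (toy, one-number) embedding
# of class c for feature i.
tables = [[float(10 * i + c) for c in range(VOCAB)] for i in range(K)]

# One merged table with K * VOCAB rows: the K tables stacked end to end.
single_table = [row for table in tables for row in table]


def to_single_index(features, offset):
    """Shift feature i by i * offset so each feature gets its own index range."""
    return [value + i * offset for i, value in enumerate(features)]


# A hypothetical node with K categorical features, each value in [0, VOCAB).
node = [1, 1, 3]

separate = [tables[i][c] for i, c in enumerate(node)]              # K lookups in K tables
merged = [single_table[j] for j in to_single_index(node, VOCAB)]   # K lookups in one table

assert separate == merged
print(to_single_index([5, 5, 7], 512))  # [5, 517, 1031]
```

Without the offset, the first two features of `node` (both equal to 1) would hit the same row of the merged table, even though they are different features. The offset is what keeps their embeddings distinct.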
I got it. Thank you!
Hi. What does the function convert_to_single_emb do? And why should an offset be added to the original features? Thank you.