smadha / MlTrio — CSCI-567 course project (Apache License 2.0)
V2 #10 · Open
kushaank opened this issue 7 years ago
Use an autoencoder
Hard part: representing words, characters, and tags (variable in number)
Find a way to encode these into a fixed-length vector and you're done. Fixed-length representation -> then use an autoencoder
Start with a super-sparse vector covering words/chars and tags
Huge-length vector for user tags, user words, user characters, question tags, question words, question characters
u1 and q1 will each have an ID, word IDs, char IDs, tags, etc. (super-sparse vector)
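A minimal sketch of building such a super-sparse vector, assuming multi-hot encoding over concatenated word/char/tag segments; the vocabulary sizes and the `multi_hot` helper are made up for illustration, not taken from the project:

```python
import numpy as np

# Hypothetical vocabulary sizes -- not the project's actual numbers.
N_WORDS, N_CHARS, N_TAGS = 1000, 50, 20

def multi_hot(word_ids, char_ids, tag_ids):
    """Concatenate word/char/tag multi-hot segments into one sparse vector."""
    v = np.zeros(N_WORDS + N_CHARS + N_TAGS)
    v[list(word_ids)] = 1.0                            # word segment
    v[[N_WORDS + c for c in char_ids]] = 1.0           # char segment
    v[[N_WORDS + N_CHARS + t for t in tag_ids]] = 1.0  # tag segment
    return v

q1 = multi_hot(word_ids=[3, 17, 42], char_ids=[0, 5], tag_ids=[2])
print(q1.shape, int(q1.sum()))  # (1070,) 6
```

In practice a vector this sparse would be stored as a `scipy.sparse` matrix rather than a dense array.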
Put u1 and q1 into autoencoders
Turn each vector into a smaller-dimensional representation and then reconstruct it
Use a user autoencoder and a question autoencoder
Then train the encoded user/question vectors through a third neural net
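The pipeline above can be sketched as follows. This is a toy illustration, not the project's model: input sizes, code sizes, and the single-layer "third net" are all assumptions, and the autoencoders are untrained:

```python
import numpy as np

rng = np.random.default_rng(0)

class AutoEncoder:
    """One-hidden-layer autoencoder: x -> code -> reconstruction."""
    def __init__(self, n_in, n_code):
        self.W1 = rng.normal(0, 0.1, (n_in, n_code))  # encoder weights
        self.W2 = rng.normal(0, 0.1, (n_code, n_in))  # decoder weights

    def encode(self, x):
        return np.tanh(x @ self.W1)

    def reconstruct(self, x):
        return self.encode(x) @ self.W2

user_ae = AutoEncoder(n_in=500, n_code=32)      # user autoencoder
question_ae = AutoEncoder(n_in=800, n_code=32)  # question autoencoder

u1 = rng.integers(0, 2, 500).astype(float)  # toy sparse user vector
q1 = rng.integers(0, 2, 800).astype(float)  # toy sparse question vector

# The encoded pair feeds a third network (here just one sigmoid layer).
pair = np.concatenate([user_ae.encode(u1), question_ae.encode(q1)])
W3 = rng.normal(0, 0.1, (64, 1))
score = 1 / (1 + np.exp(-(pair @ W3)))  # match probability for (u1, q1)
print(pair.shape, score.shape)  # (64,) (1,)
```

The key design point is that both autoencoders output fixed-length codes, so the third net sees a fixed-size input no matter how many words/chars/tags a user or question has.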
Don't use hand-tailored features for now
Tune the neural network well: momentum, restarts, etc.
Minimize the difference between input and output, training the encoder and decoder simultaneously
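A minimal sketch of that training objective, assuming a linear autoencoder trained on reconstruction MSE with momentum SGD; data sizes, learning rate, and momentum value are placeholders, not the project's settings:

```python
import numpy as np

# Toy stand-in for the sparse input vectors.
rng = np.random.default_rng(1)
X = rng.integers(0, 2, (100, 50)).astype(float)

n_code, lr, mu = 8, 0.01, 0.9
W1 = rng.normal(0, 0.1, (50, n_code))   # encoder weights
W2 = rng.normal(0, 0.1, (n_code, 50))   # decoder weights
v1, v2 = np.zeros_like(W1), np.zeros_like(W2)

losses = []
for _ in range(300):
    H = X @ W1                     # encode
    R = H @ W2                     # decode / reconstruct
    E = R - X                      # reconstruction error
    losses.append((E ** 2).mean())
    g2 = H.T @ E / len(X)          # gradients of MSE (up to a constant)
    g1 = X.T @ (E @ W2.T) / len(X)
    v1 = mu * v1 - lr * g1         # momentum updates: encoder and
    v2 = mu * v2 - lr * g2         # decoder step together each iteration
    W1 += v1
    W2 += v2
print(round(losses[0], 4), round(losses[-1], 4))
```

Note both weight matrices are updated in the same step, i.e. encoder and decoder are trained simultaneously against the shared reconstruction loss.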
TODO: exclude the ID field from the autoencoder input; at the end, append it to the smaller vector we get
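That TODO could look like the sketch below: encode only the non-ID features, then tack the raw ID onto the code. The `encode_without_id` helper is hypothetical, and a fixed random projection stands in for the trained encoder:

```python
import numpy as np

def encode_without_id(features, id_field, encode):
    """Compress the non-ID features, then append the raw ID to the code."""
    code = encode(features)
    return np.concatenate([code, [id_field]])

# Stand-in encoder: a random projection in place of the trained
# autoencoder's encode step; sizes are made up.
rng = np.random.default_rng(2)
W = rng.normal(size=(30, 5))
features = rng.integers(0, 2, 30).astype(float)  # sparse features, ID excluded

small = encode_without_id(features, id_field=12345.0,
                          encode=lambda x: np.tanh(x @ W))
print(small.shape)  # (6,)
```

This keeps the ID out of the reconstruction loss (there is nothing semantic to compress in an ID) while still carrying it alongside the learned code.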