IntuitionEngineeringTeam / chars2vec

Character-based word embeddings model based on RNN for handling real world texts
Apache License 2.0

Triplet Loss #1

Open halloTheCoder opened 5 years ago

halloTheCoder commented 5 years ago

Just a query: have you tried triplet loss or lossless triplet loss? I think it would produce better embeddings, since we would be providing fewer examples, and the clusters formed would look better when visualized. I would like to see this feature if it is the right approach.
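For readers unfamiliar with the term, here is a minimal sketch of the standard triplet loss in plain Python. It is not taken from chars2vec; the function names, the use of Euclidean distance, and the default margin of 1.0 are assumptions chosen for illustration (some formulations use squared distance instead).

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss for one (anchor, positive, negative) triple.

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings, for illustration only.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # e.g. a misspelling of the anchor word
far_negative = [3.0, 4.0]
near_negative = [0.0, 1.5]

easy = triplet_loss(anchor, positive, far_negative)
hard = triplet_loss(anchor, positive, near_negative)
```

With these made-up vectors, the far negative already satisfies the margin, so its loss is zero, while the near negative still contributes a positive loss and would push the embeddings apart during training.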

Also, this will only handle non-word errors and won't work for real-word errors, if error correction is your aim.

That said, I like the approach. I guess it is similar to fastText from Facebook, except that you use an RNN whereas they use character n-grams. As I am a beginner in all this, can you advise which method is preferable, and why?
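To make the n-gram side of that comparison concrete, here is a rough sketch of how a word can be split into character n-grams in the fastText style, with `<` and `>` added as word-boundary markers. The function name `char_ngrams` is hypothetical and this is not fastText's actual code; the defaults of 3 and 6 mirror fastText's documented `minn`/`maxn` defaults.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return all character n-grams of the boundary-wrapped word.

    The word is wrapped in '<' and '>' so that prefixes and suffixes
    produce n-grams distinct from those in the middle of a word.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n over the wrapped word.
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


trigrams = char_ngrams("where", min_n=3, max_n=3)
# trigrams == ['<wh', 'whe', 'her', 'ere', 're>']
```

An n-gram model represents a word by combining the vectors of these pieces, whereas an RNN-based model reads the characters in sequence and emits a single vector for the whole word.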

fjsj commented 3 years ago

Check https://github.com/xinyandai/string-embed