This PR follows #94 and #95. With the generative model fixed, the next step is to fix an issue in the word vectors and sentence embeddings. Long story short, I hit an indexing error when loading sentences for the discriminator model.
e.g.
Sentence: Over-expression in disease
0 represents the word "over-expression", 1 represents the word "disease", and 2 represents the word "in".
This is problematic because 0 and 1 are reserved for the null character and the unknown-token character. As a result, two important words were being read as nulls and unknowns, which led to poor discriminator model performance.
The issue is fixed and the discriminator model now works. Results will appear in the next PR.
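
To illustrate the collision described above, here is a minimal sketch of one way to keep word indices clear of the reserved values. It is not the code from this PR: the names `NULL_IDX`, `UNK_IDX`, `RESERVED`, `build_vocab`, and `encode` are hypothetical, and it assumes words are indexed in order of first appearance, starting after the two reserved slots.

```python
# Hypothetical sketch; names and indexing order are assumptions, not the PR's code.
NULL_IDX = 0   # reserved for the null character
UNK_IDX = 1    # reserved for the unknown token
RESERVED = 2   # number of reserved indices

def build_vocab(sentences):
    """Assign each distinct lowercased word an index starting at RESERVED."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                # Offset by RESERVED so no real word lands on 0 or 1.
                vocab[word] = len(vocab) + RESERVED
    return vocab

def encode(sentence, vocab):
    """Convert a sentence to indices; out-of-vocabulary words map to UNK_IDX."""
    return [vocab.get(word, UNK_IDX) for word in sentence.lower().split()]

vocab = build_vocab(["Over-expression in disease"])
encoded = encode("Over-expression in disease", vocab)
print(encoded)  # [2, 3, 4]
```

With the offset in place, only words that are actually missing from the vocabulary map to 1, and 0 stays free for the null character.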