hassonlab / 247-encoding

Contains python scripts for performing encoding on 247 data.

Why do glove embeddings get dropped differently #61

Open zkokaja opened 1 year ago

zkokaja commented 1 year ago

Why is there a special case for glove? If operating on base dfs, it should generalize to any type of model.

https://github.com/hassonlab/247-encoding/blob/03e73d281600e34e8a584025f0895b0e2aa93d69/scripts/tfsenc_read_datum.py#L184-L188

zkokaja commented 1 year ago

Related to #57. We should test whether the else case works for glove; if it does, we can remove the special case.

zkokaja commented 1 year ago

Glove puts None for words with no embeddings, so it should work https://github.com/hassonlab/247-pickling/blob/4588392872b7491a8a7a52cee553968ac025722e/scripts/tfsemb_main.py#L603-L607

zkokaja commented 1 year ago

Next step: once Ken figures out the token thing, we just need to test whether glove works with drop_nan_embeddings. If it does, we can remove this if condition.

VeritasJoker commented 1 year ago

I tried drop_nan_embeddings with glove. Got this error:

*** TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Will look into it

zkokaja commented 1 year ago

If x is None, this could cause the error, since np.isnan raises a TypeError when given None:

https://github.com/hassonlab/247-encoding/blob/03e73d281600e34e8a584025f0895b0e2aa93d69/scripts/tfsenc_read_datum.py#L22
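
A minimal sketch of the failure and one possible None-safe check. This is not the repository's code: the names `is_missing_embedding` and `drop_missing_embeddings`, the `embeddings` column name, and the toy DataFrame are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# np.isnan does not accept None, which matches the TypeError reported above.
try:
    np.isnan(None)
except TypeError:
    print("np.isnan(None) raised TypeError")


def is_missing_embedding(x):
    """Return True if x is None or contains only NaN values (hypothetical helper)."""
    if x is None:
        return True
    return bool(np.isnan(np.asarray(x, dtype=float)).all())


def drop_missing_embeddings(df, column="embeddings"):
    """Hypothetical None-safe variant of a drop-NaN step; column name is assumed."""
    mask = df[column].apply(is_missing_embedding)
    return df[~mask]


# Toy data: one valid embedding, one None (as glove reportedly produces), one all-NaN.
df = pd.DataFrame(
    {
        "word": ["the", "zzxq", "cat"],
        "embeddings": [[0.1, 0.2], None, [np.nan, np.nan]],
    }
)

kept = drop_missing_embeddings(df)
print(kept["word"].tolist())
```

If something along these lines works for glove, the same path would cover both None and all-NaN embeddings, and the glove special case might no longer be needed.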