Closed hvgazula closed 1 year ago
Is the snippet in this stack link any better to use?
We are already filtering on glove if glove is included in the `align_with` argument. For any LLM encoding, we also want the `model_token_is_root` condition when we are aligning with glove, which is why I added the condition there. Also, I think the current column is `in_glove` and not `in_glove50`.
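For reference, here is a minimal sketch of the filtering described above, assuming the data is a pandas DataFrame with boolean `in_glove` and `model_token_is_root` columns. The function name `filter_aligned_tokens` and the example frame are hypothetical and are not taken from the repo; the actual code may structure this differently.

```python
from typing import Sequence

import pandas as pd


def filter_aligned_tokens(df: pd.DataFrame, align_with: Sequence[str]) -> pd.DataFrame:
    """Hypothetical helper: keep rows that pass the filters implied by `align_with`."""
    # Start with every row selected.
    mask = pd.Series(True, index=df.index)
    if "glove" in align_with:
        # Keep only tokens that have a glove embedding.
        mask &= df["in_glove"].astype(bool)
        # For LLM encodings, also keep only root tokens (assumed to apply
        # only when the column is present in the frame).
        if "model_token_is_root" in df.columns:
            mask &= df["model_token_is_root"].astype(bool)
    return df[mask]


# Made-up example data, for illustration only.
example = pd.DataFrame(
    {
        "word": ["hello", "world", "unk", "ing"],
        "in_glove": [True, True, False, True],
        "model_token_is_root": [True, True, True, False],
    }
)
filtered = filter_aligned_tokens(example, ["glove"])
```

If the column ends up being named `in_glove50`, only the column lookup and the `align_with` value would need to change.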
But yeah, you can take care of it, since it relates to whatever you are doing on the pickling side. Just note that I made some changes in the parser / config file to change the `glove50` arguments to `glove` to fit it here, so we will need to undo all those changes if we want `glove50` here.
If using HF for static embedding, then we'll go with that. Otherwise do `glove50`.
Change this line to glove50 as well: https://github.com/hassonlab/247-pickling/blob/main/scripts/tfsemb_LMBase.py#L57
Changed, but needs testing with newest pickles
Replace this by filtering on the `in_glove50` column. @VeritasJoker agree? I'll take care of this cleanup, but I'm just seeking your thoughts.