Open zkokaja opened 1 year ago
Replace https://github.com/hassonlab/247-pickling/blob/4588392872b7491a8a7a52cee553968ac025722e/scripts/tfsemb_main.py#L589 with `df = pd.DataFrame(index=df.index)`
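A minimal sketch of what the proposed replacement does, assuming the goal is an output frame that keeps the original row labels. The column names and values are made up for illustration and are not taken from `tfsemb_main.py`:

```python
import pandas as pd

# Hypothetical input frame with a non-default index.
df = pd.DataFrame({"word": ["the", "cat", "sat"]}, index=[3, 4, 7])

# Empty frame (no columns) that carries over the original index.
out = pd.DataFrame(index=df.index)

# Columns added afterwards stay aligned with the original row labels.
out["n_chars"] = df["word"].str.len()

print(list(out.index))    # [3, 4, 7]
print(list(out.columns))  # ['n_chars']
```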
So I don't think this is resolved. I am regenerating embeddings now for 798 and found that when we save emb_df into pickles, we use `to_dict`
with "records" here: https://github.com/hassonlab/247-pickling/blob/b7a6fcb060ecb8276b5dcb090b97e6f5b2983558/scripts/tfsemb_main.py#L32
This does not seem to save the index, which we need when we concatenate later.
But merging on index works in encoding? Should we just use `pd.to_pickle` instead?
Yes, merging on index works in encoding. I think `pd.to_pickle` should work?
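For reference, a small sketch comparing the two round trips. The frame below is a toy stand-in for emb_df with made-up index values, and the file name is hypothetical; it is only meant to show the pandas behavior, not the actual saving code in the repo:

```python
import os
import tempfile

import pandas as pd

# Toy stand-in for emb_df; index values are invented for illustration.
emb_df = pd.DataFrame({"token": ["a", "b", "c"]}, index=[10, 11, 12])

# to_dict("records") yields one dict per row; the index is not included.
records = emb_df.to_dict("records")
from_records = pd.DataFrame(records)

# pd.to_pickle serializes the whole frame, index included.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "emb_df.pkl")  # hypothetical file name
    pd.to_pickle(emb_df, path)
    from_pickle = pd.read_pickle(path)

print(list(from_records.index))  # [0, 1, 2]
print(list(from_pickle.index))   # [10, 11, 12]
```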
Related to #153
see https://github.com/hassonlab/247-encoding/issues/50
In tfsemb_concat, we currently call `pd.concat(all_df, ignore_index=True)`, but we don't want to ignore the index.
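A quick sketch of the difference, using two made-up partial frames in place of the real `all_df` list:

```python
import pandas as pd

# Two hypothetical partial frames whose index labels matter downstream.
part_a = pd.DataFrame({"token": ["a", "b"]}, index=[0, 1])
part_b = pd.DataFrame({"token": ["c", "d"]}, index=[5, 6])
all_df = [part_a, part_b]

# ignore_index=True discards the labels and renumbers from 0.
renumbered = pd.concat(all_df, ignore_index=True)

# The default (ignore_index=False) keeps each frame's original labels.
preserved = pd.concat(all_df)

print(list(renumbered.index))  # [0, 1, 2, 3]
print(list(preserved.index))   # [0, 1, 5, 6]
```

Dropping `ignore_index=True` only helps if the pickled frames still carry their index when they are loaded, which is what the `to_dict("records")` discussion above is about.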