Watts-Lab / team_comm_tools

An open-source Python library that turns multiparty conversational data into social-science backed features.
https://teamcommtools.seas.upenn.edu/
MIT License

Inputting Dataset with Existing Columns (that conflict with features being generated) #256

Closed rowbotham-evan closed 3 months ago

rowbotham-evan commented 3 months ago

When the featurizer tries to build features from an input dataset that already contains columns with the same names as those features (whether filled in by hand or previously calculated), it breaks the FeatureBuilder (FB).

This is a bug because it forces users to manually delete all of those columns before inputting the data, and it doesn't let users retain existing feature calculations for some columns (e.g. keeping feature columns 10-15) while recalculating the rest.

Possible Solutions:

  1. Overwrite the existing columns with newly calculated feature values.
  2. Read in the columns, enumerate the feature columns that have already been calculated, and compute only the remaining "void" columns.
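
To make the two options concrete, here is a minimal pandas sketch. It does not use the library's actual API: `FEATURES`, `add_features`, the `overwrite` flag, and the feature names `num_words` and `num_chars` are all hypothetical and only illustrate the intended behavior.

```python
import pandas as pd

# Hypothetical feature registry: column name -> function that computes it.
# These names are illustrative assumptions, not team_comm_tools identifiers.
FEATURES = {
    "num_words": lambda df: df["message"].str.split().str.len(),
    "num_chars": lambda df: df["message"].str.len(),
}


def add_features(df: pd.DataFrame, overwrite: bool = False) -> pd.DataFrame:
    """Return a copy of df with every feature column in FEATURES present.

    overwrite=True  -> option 1: recompute and replace existing columns.
    overwrite=False -> option 2: keep existing columns, compute only missing ones.
    """
    out = df.copy()
    for name, compute in FEATURES.items():
        if name in out.columns and not overwrite:
            continue  # column already supplied by the user; leave it untouched
        out[name] = compute(out)
    return out


# Input that already contains a (stale) "num_words" column.
chats = pd.DataFrame({"message": ["hello there", "ok"], "num_words": [99, 99]})

kept = add_features(chats)                    # option 2: keeps the 99s
redone = add_features(chats, overwrite=True)  # option 1: recalculates
```

With a flag like this, users would not need to drop conflicting columns by hand: the default keeps whatever they supplied, and `overwrite=True` forces a full recalculation.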