Closed. robinsonkwame closed this issue 1 year ago.
Yes, the dataset can have any extra columns as long as their names don't conflict with the columns that hover uses: `feature_key`, `label_key`, and `"SUBSET"`. Most of the time you simply won't hit a conflict, so just pass your full CSV to `SupervisableTextDataset.from_pandas()`.
That said, we should make it more obvious which column names will conflict and suggest that the user rename them.
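Until such a check exists in the library, something like the following could be run before loading the data. This is only a sketch: `conflicting_columns` is a hypothetical helper, not part of hover, and the default key names `"text"` and `"label"` are assumptions standing in for whatever `feature_key` and `label_key` you actually use.

```python
def conflicting_columns(metadata_columns, feature_key="text",
                        label_key="label", reserved=("SUBSET",)):
    """Return metadata column names that collide with names the dataset uses.

    Hypothetical helper: the defaults for feature_key and label_key are
    assumptions, and "SUBSET" is the reserved name mentioned in this thread.
    """
    taken = {feature_key, label_key, *reserved}
    return [col for col in metadata_columns if col in taken]


# Columns you consider metadata, e.g. everything in df.columns except the
# feature and label columns themselves.
metadata_columns = ["doc_id", "location", "SUBSET"]

conflicts = conflicting_columns(metadata_columns)

# One possible fix: prefix the offending names, then apply the mapping with
# df.rename(columns=renames) before calling from_pandas().
renames = {col: f"meta_{col}" for col in conflicts}

print(conflicts)
print(renames)
```

Prefixing is just one convention; any rename that avoids the reserved names should work equally well.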
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.
There is often metadata associated with the feature data; for example, text comes from certain documents. After labeling the raw data, it's often useful to merge the labels with the metadata for other data science tasks. For example, some sets of documents or locations might not contain labels that you would otherwise expect them to, or you might want to aggregate counts by document or location.
Is there a way for `SupervisableTextDataset` to include `non_feature` data? `non_feature` data could store this kind of metadata. The subset row order differs from the raw data frame, so you can't just match indices.
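Since row order can't be relied on, one workaround is to give every raw row a stable ID before labeling and join on that ID afterwards. The sketch below uses only pandas; the `row_id` column name and the toy data are made up for illustration, and the `labeled` frame is a stand-in for whatever you export after annotation, not output produced by hover.

```python
import pandas as pd

# Raw data with metadata (doc_id) alongside the feature column (text).
raw = pd.DataFrame({
    "doc_id": ["a", "a", "b", "c"],
    "text": ["t1", "t2", "t3", "t4"],
})
# Stable key that survives any reordering of rows.
raw["row_id"] = range(len(raw))

# Stand-in for the labeled rows, deliberately in a different order
# than the raw frame.
labeled = pd.DataFrame({
    "row_id": [2, 0, 3, 1],
    "label": ["pos", "neg", "pos", "neg"],
})

# Join on the key rather than the index; validate guards against
# duplicated IDs on either side.
merged = raw.merge(labeled, on="row_id", how="left", validate="one_to_one")

# Aggregate label counts per document.
counts = merged.groupby(["doc_id", "label"]).size()
print(counts)
```

With `how="left"`, raw rows that never received a label stay in the result with a missing `label`, which makes it easy to spot documents or locations that lack labels you expected.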