delftdata / hci-auto-feat

Human-in-the-loop Feature Discovery with AutoFeat

Explain augment dataset #30

Open zegermouw opened 5 months ago

zegermouw commented 5 months ago

Augment dataset has an `explain` flag. When it is set, the output should explain the following:

  1. AutoFeat computes the relationships between n tables in the {dataset} repository using {coma, jaccard} similarity with a threshold of x.
  2. AutoFeat computes n join trees.

AndraIonescu commented 5 months ago

The explain text should be the following:

  1. AutoFeat computes the relationships between N tables from the {dataset} repository, using {coma, jaccard} similarity score with a threshold of X (i.e., all the relationships with a similarity < threshold will be discarded).
  2. AutoFeat creates M join trees: the best performing join tree is {ID_number}
    • < print paths with the data quality score >
    • < print the selected features with the relevance and redundancy score >
    • < print ML result: accuracy, model, feature importance >
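
A minimal sketch of how this text could be assembled, to make the template concrete. Everything here is an assumption for illustration and not the existing AutoFeat API: the `JoinTreeSummary` container, its field names, and the `explain_text` function are hypothetical, and only the wording of the two numbered sentences comes from the template above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class JoinTreeSummary:
    """Hypothetical container for what the explain output reports per join tree."""

    tree_id: int
    paths: dict[str, float]                    # join path -> data quality score
    features: dict[str, tuple[float, float]]   # feature -> (relevance, redundancy)
    accuracy: float
    model: str
    feature_importance: dict[str, float] = field(default_factory=dict)


def explain_text(dataset, n_tables, matcher, threshold, trees, best_id):
    """Build the explain text for one run (hypothetical helper, not the library API)."""
    if matcher not in ("coma", "jaccard"):
        raise ValueError(f"unknown matcher: {matcher!r}")
    best = next((t for t in trees if t.tree_id == best_id), None)
    if best is None:
        raise ValueError(f"no join tree with id {best_id}")

    lines = [
        f"1. AutoFeat computes the relationships between {n_tables} tables from the "
        f"{dataset} repository, using {matcher} similarity score with a threshold of "
        f"{threshold} (i.e., all the relationships with a similarity < threshold "
        f"will be discarded).",
        f"2. AutoFeat creates {len(trees)} join trees: "
        f"the best performing join tree is {best_id}",
    ]
    # Paths of the best tree with their data quality score.
    for path, quality in best.paths.items():
        lines.append(f"    • path {path} (data quality: {quality:.2f})")
    # Selected features with relevance and redundancy scores.
    for name, (relevance, redundancy) in best.features.items():
        lines.append(
            f"    • feature {name} "
            f"(relevance: {relevance:.2f}, redundancy: {redundancy:.2f})"
        )
    # ML result: model, accuracy, feature importance.
    lines.append(f"    • model: {best.model}, accuracy: {best.accuracy:.2f}")
    for name, importance in best.feature_importance.items():
        lines.append(f"    • importance of {name}: {importance:.2f}")
    return "\n".join(lines)
```

Returning a string instead of printing directly would let the same text be shown in a notebook, written to a log, or checked in a test. Whether `explain` should print or return is left open here.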