ArneBinder / pytorch-ie

PyTorch-IE: State-of-the-art Information Extraction in PyTorch
MIT License

infer labels in `LabelCountCollector` #351

Closed ArneBinder closed 1 year ago

ArneBinder commented 1 year ago

With this PR, we allow `labels = "INFERRED"`, in which case the labels are inferred from the data. This also adds the parameter `label_attribute`.

IMPORTANT NOTE: Inferring labels produces wrong results for certain `aggregation_functions`, such as `min`, `mean`, and `std`, because documents with zero entries for a label are no longer counted for that label. We remove these functions from `aggregation_functions` if `labels == "INFERRED"`, but we cannot guard against user-defined functions that rely on correct zero values.
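
To illustrate the caveat, here is a minimal standalone sketch. It does not use the `LabelCountCollector` API: the toy data and the helpers `counts_fixed` and `counts_inferred` are hypothetical and only mimic the two counting strategies. It shows that `sum` and `max` are unaffected by dropped zero counts, while `min` and `mean` change.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy data: each inner list holds the labels found in one document.
docs = [["PER", "PER", "ORG"], ["PER"], ["PER", "ORG", "ORG"]]


def counts_fixed(docs, labels):
    """Per-document counts for a fixed label set; absent labels count as 0."""
    result = {label: [] for label in labels}
    for doc in docs:
        doc_counts = Counter(doc)
        for label in labels:
            result[label].append(doc_counts[label])
    return result


def counts_inferred(docs):
    """Per-document counts with inferred labels; zero counts are never recorded."""
    result = {}
    for doc in docs:
        for label, n in Counter(doc).items():
            result.setdefault(label, []).append(n)
    return result


fixed = counts_fixed(docs, ["PER", "ORG"])
inferred = counts_inferred(docs)

# The second document has no "ORG" entry, so the inferred list is one value short.
print("fixed ORG counts:   ", fixed["ORG"])
print("inferred ORG counts:", inferred["ORG"])

# sum and max agree; min and mean differ because the zero is missing.
for name, fn in [("sum", sum), ("max", max), ("min", min), ("mean", mean)]:
    print(name, fn(fixed["ORG"]), fn(inferred["ORG"]))
```

A user-defined aggregation function that, like `mean` here, depends on the number of documents would be affected in the same way, which is why it cannot be handled automatically.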