Closed tribeband closed 11 months ago
Good Question!
As you mentioned, THOC and other TSAD baselines usually follow the unsupervised setting, i.e., no labels are included, and they treat the entire training set as normal samples.
The labels I included are all zeros (https://github.com/carrtesy/THOC-Pytorch/blob/master/data/NeurIPS-TS/nts_mul_normal.csv). They are there only for convenience in implementation: I wanted to write more generalizable code for semi-supervised settings, where a small number of labeled anomalies may be available.
In the dataloader, only train X is used for THOC training (train y is not used). See: https://github.com/carrtesy/THOC-Pytorch/blob/b5863d9a4fc86c2cf1152771472e31f61fd30f9d/exp.py#L66C3-L66C3
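To illustrate the point, here is a minimal sketch (not the actual repository code) of a dataset that carries placeholder labels while the training step ignores them. The names `make_dataset` and `fit_unsupervised` are hypothetical, and the "model" is just the mean of X, standing in for THOC:

```python
# Hypothetical sketch: these names are illustrative and are NOT taken
# from the THOC-Pytorch repository. The "model" is a toy stand-in.
from statistics import mean


def make_dataset(values, labels=None):
    """Pair each observation with a label; default to all-zero placeholders."""
    if labels is None:
        labels = [0] * len(values)
    return list(zip(values, labels))


def fit_unsupervised(dataset):
    """Fit a toy model (the mean of X). The label y is unpacked but never used."""
    xs = [x for x, _y in dataset]
    return mean(xs)


train_values = [1.0, 2.0, 3.0, 4.0]

# Same X, different labels: all-zero placeholders vs. a few labeled anomalies.
placeholder = make_dataset(train_values)
relabelled = make_dataset(train_values, labels=[0, 1, 0, 1])

model_a = fit_unsupervised(placeholder)
model_b = fit_unsupervised(relabelled)
```

Because the fitting step never reads y, both datasets give the same fitted model. Keeping the label column in the data format only means a semi-supervised variant could start using it later without changing the loader.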
Best, Carrtesy
Aha, got it. Beautiful idea.
THOC is said not to require labels during training, but the anomaly labels are still included in the dataset. Refer to the following code for details:
def load_NeurIPS_TS_MUL(home_dir="."):
    base_dir = "data/NeurIPS-TS"
    normal = pd.read_csv(os.path.join(home_dir, base_dir, "nts_mul_normal.csv"))
    abnormal = pd.read_csv(os.path.join(home_dir, base_dir, "nts_mul_abnormal.csv"))