timeseriesAI / tsai

Time series Timeseries Deep Learning Machine Learning Python Pytorch fastai | State-of-the-art Deep Learning library for Time Series and Sequences in Pytorch / fastai
https://timeseriesai.github.io/tsai/
Apache License 2.0
5.1k stars, 639 forks

Add weights to classification #613

Closed: R470R closed this issue 1 year ago

R470R commented 1 year ago

Hello, I am having a few problems adding weights to a classification problem with an imbalanced dataset. I have already tried passing, as weights, either "[0.0182, 0.9818]" or "dls.train.cws".

It doesn't run: it fails with an AssertionError or some other kind of error ...

Can you help me?

oguiza commented 1 year ago

Hi @R470R, could you provide a code snippet and the full traceback? FYI, PyTorch requires weights to be passed as a tensor, on the same device as the model.

R470R commented 1 year ago

Yes @oguiza !

dls.train.cws
TensorCategory([0.0182, 0.9818], device='cuda:0')
from tsai.all import *
tfms  = [None, [Categorize()]]
dsets = TSDatasets(three_d_trended_X, three_d_trended_y, tfms=tfms, splits=splits, inplace=True)
dls   = TSDataLoaders.from_dsets(dsets.train, dsets.valid, bs=[64, 128], batch_tfms=[TSStandardize()], num_workers=0)

batch_tfms = [TSStandardize(by_sample=True)]
learn = TSClassifier(three_d_trended_X, three_d_trended_y, splits=splits, 
                     weights = dls.train.cws, batch_tfms=batch_tfms, metrics=accuracy, 
                     arch=InceptionTimePlus, arch_config=dict(fc_dropout=.5), train_metrics=True)
learn.fit_one_cycle(10)

AssertionError                            Traceback (most recent call last)
Input In [9], in <cell line: 7>()
      4 dls   = TSDataLoaders.from_dsets(dsets.train, dsets.valid, bs=[64, 128], batch_tfms=[TSStandardize()], num_workers=0)
      6 batch_tfms = [TSStandardize(by_sample=True)]
----> 7 learn = TSClassifier(three_d_trended_X, three_d_trended_y, splits=splits,
      8                      weights = dls.train.cws, batch_tfms=batch_tfms, metrics=accuracy,
      9                      arch=InceptionTimePlus, arch_config=dict(fc_dropout=.5), train_metrics=True)
     10 learn.fit_one_cycle(10)

File ~/anaconda3/envs/rapids-22.02/lib/python3.9/site-packages/tsai/tslearner.py:38, in TSClassifier.__init__(self, X, y, splits, tfms, inplace, sel_vars, sel_steps, weights, partial_n, train_metrics, bs, batch_size, batch_tfms, shuffle_train, drop_last, num_workers, do_setup, device, arch, arch_config, pretrained, weights_path, exclude_head, cut, init, loss_func, opt_func, lr, metrics, cbs, wd, wd_bn_bias, train_bn, moms, path, model_dir, splitter, verbose)
     35     bs = batch_size
     37 # DataLoaders
---> 38 dls = get_ts_dls(X, y=y, splits=splits, sel_vars=sel_vars, sel_steps=sel_steps, tfms=tfms, inplace=inplace,
     39                  path=path, bs=bs, batch_tfms=batch_tfms, num_workers=num_workers, weights=weights, partial_n=partial_n,
     40                  device=device, shuffle_train=shuffle_train, drop_last=drop_last)
     42 if loss_func is None:
     43     if hasattr(dls, 'loss_func'): loss_func = dls.loss_func

File ~/anaconda3/envs/rapids-22.02/lib/python3.9/site-packages/tsai/data/core.py:986, in get_ts_dls(X, y, splits, sel_vars, sel_steps, tfms, inplace, path, bs, batch_tfms, num_workers, device, shuffle_train, drop_last, weights, partial_n, sampler, sort, **kwargs)
    984 dsets = TSDatasets(X, y, splits=splits, sel_vars=sel_vars, sel_steps=sel_steps, tfms=tfms, inplace=inplace)
    985 if weights is not None:
--> 986     assert len(X) == len(weights)
    987     if splits is not None: weights = [weights[split] if i == 0 else None for i,split in enumerate(splits)] # weights only applied to train set
    988 dls = TSDataLoaders.from_dsets(dsets.train, dsets.valid, path=path, bs=bs, batch_tfms=batch_tfms, num_workers=num_workers,
    989                                device=device, shuffle_train=shuffle_train, drop_last=drop_last, weights=weights,
    990                                partial_n=partial_n, sampler=sampler, sort=sort, **kwargs)

AssertionError:

oguiza commented 1 year ago

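@R470R, there's a misunderstanding here. There are 2 types of weights you can use with tsai: sample weights or class weights.

In your case you are passing the class weights (dls.cws) as sample weights. The weights argument expects one weight per sample, so with only one weight per class the check assert len(X) == len(weights) in get_ts_dls fails. This is causing the error.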
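
To make the difference concrete, here is a minimal pure-Python sketch of the two shapes involved. It is not tsai's implementation: the helper names (inverse_frequency_class_weights, per_sample_weights) and the toy labels are hypothetical, and the inverse-frequency formula is just one common way to build class weights, not necessarily how dls.train.cws is computed. The only point is that class weights have one entry per class, while sample weights need one entry per sample, which is what the len(X) == len(weights) check requires.

```python
from collections import Counter


def inverse_frequency_class_weights(labels):
    """Hypothetical helper: one weight per class, normalised to sum to 1.

    Rarer classes get larger weights. This is an illustrative formula,
    not necessarily the one tsai uses for dls.train.cws.
    """
    counts = Counter(labels)
    inverse = {cls: 1.0 / n for cls, n in counts.items()}
    total = sum(inverse.values())
    return {cls: w / total for cls, w in inverse.items()}


def per_sample_weights(labels, class_weights):
    """Hypothetical helper: expand class weights to one weight per sample."""
    return [class_weights[label] for label in labels]


# Toy imbalanced labels (made up for illustration): 54 of class 0, 1 of class 1.
y = [0] * 54 + [1]

class_w = inverse_frequency_class_weights(y)   # length == number of classes
sample_w = per_sample_weights(y, class_w)      # length == number of samples

print(len(class_w), len(sample_w), len(y))
```

Only sample_w has the same length as the data, so only something shaped like it would pass the assertion above. Class weights are normally handed to the loss function instead (in plain PyTorch, for example, through the weight argument of torch.nn.CrossEntropyLoss, as a tensor on the same device as the model); how best to wire that into TSClassifier is left as an assumption here.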