timeseriesAI / tsai

State-of-the-art Deep Learning library for Time Series and Sequences in Pytorch / fastai
https://timeseriesai.github.io/tsai/
Apache License 2.0

cannot use Learner.get_X_preds #753

Closed zhaosiyuan1098 closed 1 year ago

zhaosiyuan1098 commented 1 year ago

Hello, I'm a beginner with tsai.

I have successfully trained a time series classification model using tsai, and here is my code:

    from tsai.all import *

    # split the data, build the datasets/dataloaders, then train and save
    splits = get_splits(y, valid_size=0.2, test_size=0.1, stratify=True,
                        random_state=23, shuffle=True)
    tfms = [None, [Categorize()]]
    x_dsets = TSDatasets(x_3d, y, tfms=tfms, splits=splits, inplace=True)
    batch_tfms = [TSStandardize(), TSNormalize()]
    bs = 64
    x_dls = TSDataLoaders.from_dsets(x_dsets.train, x_dsets.valid,
                                     bs=[bs, bs * 2], batch_tfms=batch_tfms)
    x_model = build_ts_model(XceptionTime, dls=x_dls)
    learn = Learner(x_dls, x_model, metrics=[accuracy, RocAuc()])
    learn.fit_one_cycle(100, 1e-3)
    learn.save_all(path='models', dls_fname='x_dls', model_fname='x_model',
                   learner_fname='x_learner')

However, when I use `Learner.get_X_preds` to predict the classification results, I encounter some issues:

    # reload everything and check predictions on the validation set
    x_learn = load_learner_all(path='models', dls_fname='x_dls',
                               model_fname='x_model', learner_fname='x_learner')
    dls = x_learn.dls
    valid_dl = dls.valid
    valid_probas, valid_targets, valid_preds = x_learn.get_preds(
        dl=valid_dl, with_decoded=True)
    print("x model accuracy = " + str((valid_targets == valid_preds).float().mean()))

I get satisfactory results: most valid_targets equal valid_preds, and valid_probas looks reasonable.

However, there are 12 kinds of true labels for `X_test`, but the predicted label from the trained model is always the same. Besides, `x_loss` is very big.

I've consulted the tsai documentation, and the official [Learner.get_X_preds](https://timeseriesai.github.io/tsai/inference.html) seems to have this issue:

The labels of `x` have four types: 0, 1, 2, and 3, but the predicted result is always type 1, and the probabilities for each class are almost equal:

    (tensor([[0.2632, 0.2575, 0.2431, 0.2362],
             [0.2632, 0.2575, 0.2431, 0.2363],
             [0.2631, 0.2575, 0.2431, 0.2363],
             [0.2631, 0.2575, 0.2431, 0.2363],
             [0.2632, 0.2575, 0.2431, 0.2363],
             [0.2632, 0.2575, 0.2431, 0.2362],
             [0.2632, 0.2575, 0.2431, 0.2362],
             [0.2631, 0.2575, 0.2432, 0.2362],
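
For reference, here is a minimal sketch (not the exact code from this issue) of how inference on new data is typically run with `Learner.get_X_preds`, following the tsai inference docs; `X_new` and `y_new` are hypothetical placeholders for the new samples and their labels:

    # minimal sketch, not the thread's actual code; X_new / y_new are
    # hypothetical placeholders: a float array of shape
    # (n_samples, n_variables, n_timesteps) and its labels
    from tsai.all import *

    learn = load_learner_all(path='models', dls_fname='x_dls',
                             model_fname='x_model', learner_fname='x_learner')
    probas, targets, preds = learn.get_X_preds(X_new, y_new, with_decoded=True)
    print(probas[:5])  # class probabilities
    print(preds[:5])   # decoded class labels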


May I ask why this looks like a classification failure, and how I can modify my code to fix it?

My computer environment is:

    os              : Windows-10-10.0.22000-SP0
    python          : 3.9.16
    tsai            : 0.3.5
    fastai          : 2.7.11
    fastcore        : 1.5.29
    torch           : 1.13.0
    device          : 1 gpu (['NVIDIA GeForce GTX 1650'])
    cpu cores       : 6
    threads per cpu : 2
    RAM             : 15.85 GB
    GPU memory      : [4.0] GB


This is my first time opening a GitHub issue and I'm not a native English speaker, so if my description is unclear or doesn't follow the community guidelines, please let me know and I will fix it soon. Thank you.
epdavid1 commented 1 year ago

Probably because your new data was not normalized before being passed to the model? This is a typical result of non-normalized/unscaled input data.
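
One way to check this, sketched below under the assumption that the training statistics were not applied to the new data: scale the new samples with the training set's per-channel mean and standard deviation before calling `get_X_preds`. `x_3d`, `splits`, and `x_learn` come from the code above; `X_new` is a hypothetical placeholder for the new data.

    # minimal sketch (assumption, not code from this thread): scale the new
    # data with the training set's per-channel statistics before inference;
    # X_new is a hypothetical array of shape (n_samples, n_variables, n_timesteps)
    import numpy as np

    X_train = x_3d[splits[0]]                         # training samples only
    mean = X_train.mean(axis=(0, 2), keepdims=True)   # per-variable mean
    std = X_train.std(axis=(0, 2), keepdims=True) + 1e-8
    X_new_scaled = (X_new - mean) / std

    probas, _, preds = x_learn.get_X_preds(X_new_scaled, with_decoded=True)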

zhaosiyuan1098 commented 1 year ago

Thanks for your help! I finally found out that this problem was caused by my new dataset and I have solved it successfully~