Hi @AnthonyFang623, There's a way to do it, although it is not as direct as I'd like it to be. This is a code snippet you can use to try this approach. Let's say you have already trained a model:
from tsai.all import *

X, y, splits = get_UCR_data('LSST', split_data=False)  # load a UCR dataset, keeping the split indices
tfms = [None, TSClassification()]
batch_tfms = TSStandardize(by_sample=True)
dls = get_ts_dls(X, y, splits=splits, tfms=tfms, batch_tfms=batch_tfms)
learn = ts_learner(dls, InceptionTimePlus, metrics=accuracy, cbs=[ShowGraph()])
learn.fit_one_cycle(10, 1e-2)
You can use this code to extract the top loss indices:
interp = Interpretation.from_learner(learn)  # build a fastai Interpretation object from the trained learner
valid_top_losses, valid_idxs = interp.top_losses(9)  # 9 highest losses and their positions within the validation set
valid_top_losses, valid_idxs
Bear in mind that valid_idxs are relative to the validation split, so they need to be mapped back to indices in the original arrays:
highest_loss_input_idxs = splits[1][valid_idxs]  # map validation-relative indices to positions in the full arrays
sel_X, sel_y = X[highest_loss_input_idxs], y[highest_loss_input_idxs]
new_dl = learn.dls.new_dl(sel_X, sel_y)  # build a new dataloader with just the highest-loss samples
new_dl.show_batch()
Hi @oguiza, This is really helpful, thank you very much! I have another question. I'm using the new version of the ROCKET method that you developed; here is my code:
from tsai.all import *
X = np.load('mydata.npy')
y = np.load('mylabel.npy')
X2d = X[:]
X3d = to3d(X2d)  # convert the 2d array to the 3d shape tsai expects: (samples, variables, steps)
splits = get_splits(y, valid_size=.2, stratify=True, random_state=23, shuffle=True)
tfms = [None, [Categorize()]]
batch_tfms = [TSStandardize(by_sample=True)]
dls = get_ts_dls(X3d, y, splits=splits, tfms=tfms, drop_last=False, shuffle_train=False, batch_tfms=batch_tfms, bs=10_000)
model = build_ts_model(ROCKET, dls=dls)
X_train, y_train = create_rocket_features(dls.train, model)  # apply the ROCKET kernels to generate features
X_valid, y_valid = create_rocket_features(dls.valid, model)
and my data works perfectly with RidgeClassifierCV:
from sklearn.linear_model import RidgeClassifierCV
ridge = RidgeClassifierCV(alphas=np.logspace(-8, 8, 17), normalize=True)  # note: the normalize argument was removed in sklearn 1.2; standardize features beforehand in newer versions
ridge.fit(X_train, y_train)
print(f'alpha: {ridge.alpha_:.2E} train: {ridge.score(X_train, y_train):.5f} valid: {ridge.score(X_valid, y_valid):.5f}')
but I want to output some result figures such as losses, accuracy, and a confusion matrix, like with the InceptionTime model, as well as recall_score, precision_score, and f1_score, like with SVM or any other linear classifier. But I didn't find a solution for RidgeClassifier in the tutorials. Is this possible?
Hi @AnthonyFang623, No, it's not possible. When you use an sklearn classifier, there's no fastai learner, so you'll need to use sklearn's own functionality for that. The sklearn website contains lots of examples.
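For reference, a minimal sketch of what that could look like with sklearn's metrics module, reusing the fitted ridge model and the X_valid/y_valid ROCKET features from the snippets above; average='macro' is just an assumed choice for multiclass labels:

from sklearn.metrics import accuracy_score, confusion_matrix, f1_score, precision_score, recall_score

y_pred = ridge.predict(X_valid)  # predictions from the fitted RidgeClassifierCV

# average='macro' is an assumption; choose the averaging that suits your labels
print('accuracy :', accuracy_score(y_valid, y_pred))
print('precision:', precision_score(y_valid, y_pred, average='macro'))
print('recall   :', recall_score(y_valid, y_pred, average='macro'))
print('f1       :', f1_score(y_valid, y_pred, average='macro'))
print(confusion_matrix(y_valid, y_pred))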
Hi @oguiza, I see where I went wrong. Thank you! But I found that in tutorial 02, when you choose the fastai classifier head as the classifier, you can output figures like with InceptionTime. However, I don't understand this code:
def lin_zero_init(layer):
    # zero-initialize the weights (and bias, if present) of every nn.Linear layer
    if isinstance(layer, nn.Linear):
        nn.init.constant_(layer.weight.data, 0.)
        if layer.bias is not None: nn.init.constant_(layer.bias.data, 0.)

model = create_mlp_head(dls.vars, dls.c, dls.len)
model.apply(lin_zero_init)  # apply the zero initialization to the new head
learn = Learner(dls, model, metrics=accuracy, cbs=ShowGraph())  # find lr curves and suggest the best
I assume lin_zero_init(layer) is meant to initialize a layer's weights and bias to zero? And about model = create_mlp_head, what does mlp_head mean? Is it the name of a classifier or a method of generating features? In the Python script it calls, I found I could change it to create_fc_head, create_conv_head, or some other options, but I didn't find the related documentation in the project.
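For context, a minimal sketch of swapping one head for another, assuming the other builders (create_fc_head, create_conv_head) share the (n_features, n_classes, seq_len) signature that create_mlp_head uses in tutorial 02:

# assumption: create_fc_head takes the same positional arguments as create_mlp_head
model = create_fc_head(dls.vars, dls.c, dls.len)
model.apply(lin_zero_init)  # reuse the zero initialization from the tutorial snippet
learn = Learner(dls, model, metrics=accuracy, cbs=ShowGraph())
learn.lr_find()  # find a suitable learning rate before training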
There's a way to do it, although it is not as direct as I'd like it to be
Maybe it would be helpful to wrap that code snippet in a plot_top_losses function, kind of like what fastai does?
Hi @vrodriguezf, I agree it may be useful. I'll add it to the list of ideas.
I believe this should be implemented as plot_best_losses and plot_worst_losses, with the option to select one or multiple classes.
IIRC, fastai has just one plot_top_losses, with an argument to choose whether you want the highest or the lowest losses.
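A minimal sketch of that fastai pattern, using the Interpretation object from earlier in this thread (largest=True is the default in fastai):

interp = Interpretation.from_learner(learn)
interp.plot_top_losses(9, largest=True)   # 9 highest losses
interp.plot_top_losses(9, largest=False)  # 9 lowest losses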
Hi, I've just added 2 new learner methods: top_losses and plot_top_losses. You just need to pass X and y, and select k (the number of losses) and largest (True for the highest or False for the lowest). I think these address the enhancement requested above. I've tested them and they seem to work well. cc: @AnthonyFang623 , @vrodriguezf
learn.top_losses(X[splits[1]], y[splits[1]], k=9, largest=True)  # get the 9 highest losses on the validation samples
learn.plot_top_losses(X[splits[1]], y[splits[1]], k=9, largest=True)  # plot those samples together with their losses
Amazing Ignacio! I like it even more than the fastai version, since it's patched directly into the learner and doesn't need a separate Interpretation object :)
I'm wondering if the network could record the wrong predictions on the dataset? Then maybe I could find a pattern in the misclassified files and adjust my data preprocessing.
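In case it helps, a minimal sketch of one way to collect the misclassified validation samples with plain fastai calls, assuming the learn, splits, X, and y objects from the snippets above:

import numpy as np
import torch

# predictions and targets for the validation set (ds_idx=1)
preds, targs = learn.get_preds(ds_idx=1)
pred_labels = preds.argmax(dim=1)

# indices (within the validation split) of the wrongly predicted samples
wrong_valid_idxs = torch.where(pred_labels != targs)[0].numpy()
# map them back to positions in the original arrays, as done earlier in this thread
wrong_input_idxs = np.array(splits[1])[wrong_valid_idxs]
print(wrong_input_idxs, y[wrong_input_idxs])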