muslehal / xLSTMTime

xLSTMTime for time series forecasting
MIT License
82 stars 11 forks

Run Configs #5

Open mauricekraus opened 1 month ago

mauricekraus commented 1 month ago

Would you mind sharing your run configs?

By randomly guessing the lr, whether RevIN is used, the scheduler, the number of epochs, etc., I could hardly reproduce your reported results.

Thanks

muslehal commented 1 month ago

Yes, I used RevIN, and the learning rate is not fixed. The number of epochs is 100, and the loss function is MAE. During training you can start with small datasets like ETTm1 or ETTm2.
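For illustration, a minimal sketch of a run with those settings (RevIN on, 100 epochs, MAE loss). The helper names follow the snippets shared later in this thread; the exact signatures in the repository may differ.

import torch

# Sketch only: train with the settings described above.
# get_dls, get_model, RevInCB, and Learner come from the repository code.
dls = get_dls(args)                              # dataloaders for the chosen dataset
model = get_model(dls.vars, args)                # xLSTMTime model
loss_func = torch.nn.L1Loss(reduction='mean')    # MAE loss
cbs = [RevInCB(dls.vars)] if args.revin else []  # RevIN callback if enabled
learn = Learner(dls, model, loss_func, cbs=cbs)
learn.fit(n_epochs=100)                          # 100 epochs, no early stopping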

mauricekraus commented 1 month ago

Didn't you record which hyperparameters you used for each dataset? A random lr is really "broad". 100 epochs is a lot, especially for larger learning rates; did you employ early stopping? What kind of xLSTM block (s or m) did you use for which dataset?

Thanks

muslehal commented 1 month ago

The main file contains all configurations such as learning rate (lr), epochs, and other parameters.

We trained our code on 12 datasets, setting the number of epochs to 100 without using early stopping. We used sLSTM for small datasets like ETTm and ETTh, and mLSTM for large datasets like electricity and traffic.
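As a purely illustrative sketch (the helper name and mapping below are assumptions, not the repository's actual code), the choice described above amounts to something like:

# Hypothetical helper: pick the xLSTM block type from the dataset name,
# following the rule described above.
SMALL_DATASETS = {'ettm1', 'ettm2', 'etth1', 'etth2'}

def choose_block_type(dset: str) -> str:
    # sLSTM ('s') for the small ETT datasets, mLSTM ('m') for large ones
    # such as electricity and traffic
    return 's' if dset.lower() in SMALL_DATASETS else 'm'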

You can find more details in our paper: https://arxiv.org/pdf/2407.10240.

mauricekraus commented 1 month ago

Thank you for your answer.

The main file contains all configurations such as learning rate (lr), epochs, and other parameters.

Does this mean you were using the defaults:

parser.add_argument('--batch_size', type=int, default=64, help='batch size')
parser.add_argument('--n_epochs', type=int, default=100, help='number of training epochs')
parser.add_argument('--lr', type=float, default=1e-3, help='learning rate')

for all your different runs?

I'm not asking which hyperparameters can be tuned; I'm asking for something like, e.g.:

ds: Weather
batch_size: 128
lr: 2e-4

Unfortunately, your manuscript does not include any more information on this.

Anyway, thank you for your response!

muslehal commented 1 month ago

Yes, I used the defaults for all datasets.

Regarding the lr, we use a function to find the learning rate:

def find_lr():
    # get dataloader
    dls = get_dls(args)
    model = get_model(dls.vars, args)

    # get loss (MAE / L1; MSE and a combined loss are kept as alternatives)
    # loss_func = torch.nn.MSELoss(reduction='mean')
    loss_func = torch.nn.L1Loss(reduction='mean')
    # loss_func = combined_loss

    # get callbacks (RevIN normalization if enabled)
    cbs = [RevInCB(dls.vars)] if args.revin else []
    # cbs += [PatchCB(patch_len=args.patch_len, stride=args.stride)]

    # define learner
    learn = Learner(dls, model, loss_func, cbs=cbs)

    # run the learning-rate finder and return the suggested lr
    return learn.lr_finder()

def lr_finder(self, start_lr=1e-7, end_lr=10, num_iter=100, step_mode='exp', show_plot=True, suggestion='valley'):
    """
    Find the learning rate.
    """
    n_epochs = num_iter // len(self.dls.train) + 1
    # flag that the lr finder is running
    self.run_finder = True
    # add LRFinderCB to the callback list (removed again after fitting)
    cb = LRFinderCB(start_lr, end_lr, num_iter, step_mode, suggestion=suggestion)
    # fit
    self.fit(n_epochs=n_epochs, cbs=cb, do_valid=False)
    # remove the LRFinderCB callback after fitting
    self.remove_callback(cb)
    self.run_finder = False
    if show_plot: cb.plot_lr_find()
    if suggestion: return cb.suggested_lr
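
The second definition above appears to be the Learner.lr_finder method that find_lr invokes via learn.lr_finder(). A short usage sketch (feeding the suggestion back through args.lr is an assumed pattern, not necessarily what the repository does):

# Sketch: run the finder and reuse the suggested value for the real training run
suggested_lr = find_lr()
args.lr = suggested_lr
print(f"Using suggested lr: {suggested_lr}")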
chuong-dang commented 1 month ago

Where exactly do you specify the use of either mLSTM or sLSTM in the code? E.g., when setting dset = traffic, the code still loads and probably uses sLSTM for me.