aimagelab / novelty-detection

Latent space autoregression for novelty detection.
MIT License

Loss increases during the training #10

Open AmmarKamoona opened 5 years ago

AmmarKamoona commented 5 years ago

I am trying to train the model, but the loss increases rather than decreases.

I have attached the training part of my code below:

```python
from torch.utils.data import DataLoader
from tqdm import tqdm

for cl_idx, video_id in enumerate(dataset.train_videos):

    # Run over the current training video
    dataset.train(video_id)
    loader = DataLoader(dataset, collate_fn=dataset.collate_fn)

    # Score containers (commented out, left over from the test script)
    # sample_llk = np.zeros(shape=(len(loader) + t - 1,))
    # sample_rec = np.zeros(shape=(len(loader) + t - 1,))
    # sample_y = dataset.load_test_sequence_gt(video_id)

    for i, (x, y) in tqdm(enumerate(loader), desc=f'Computing scores for {dataset}'):
        optimizer.zero_grad()
        x = x.to('cuda')

        # Forward pass: reconstruction, latent code and its estimated density
        x_r, z, z_dist = model(x)

        # Joint reconstruction + autoregression loss
        loss = criterion(x, x_r, z, z_dist)

        # Backward pass and parameter update
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        print(running_loss)
```
DavideA commented 5 years ago

Hi,

First, what optimizer are you using? Second, I see you are printing the running loss (the cumulative sum of the batch losses so far). Are you sure the loss is actually increasing?
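To separate the two effects, here is a minimal sketch of what that check could look like, assuming the same `loader`, `model`, `criterion` and `optimizer` as in the snippet above. It prints the per-batch loss and the running mean, rather than the cumulative sum:

```python
running_loss = 0.0

for i, (x, y) in enumerate(loader):
    optimizer.zero_grad()
    x = x.to('cuda')

    x_r, z, z_dist = model(x)
    loss = criterion(x, x_r, z, z_dist)

    loss.backward()
    optimizer.step()

    running_loss += loss.item()
    # The mean is what should trend downwards;
    # the cumulative sum always grows by construction.
    print(f'batch {i}: loss = {loss.item():.4f}, mean so far = {running_loss / (i + 1):.4f}')
```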

AmmarKamoona commented 5 years ago

Hi there, I am using the Adam optimizer with the learning rate set to 0.001 and the number of epochs set to 30. The loss decreases at first and then increases.

```
the loss is= tensor(14281.7119, device='cuda:0', grad_fn=)
Computing scores for ShanghaiTech (video id = 01_019): 94it [06:52, 4.37s/it]
the loss is= tensor(14346.3477, device='cuda:0', grad_fn=)
Computing scores for ShanghaiTech (video id = 01_019): 95it [06:56, 4.38s/it]
the loss is= tensor(15669.5996, device='cuda:0', grad_fn=)
Computing scores for ShanghaiTech (video id = 01_019): 96it [07:00, 4.38s/it]
the loss is= tensor(15653.5596, device='cuda:0', grad_fn=)
Computing scores for ShanghaiTech (video id = 01_019): 97it [07:05, 4.38s/it]
the loss is= tensor(15492.5547, device='cuda:0', grad_fn=)
Computing scores for ShanghaiTech (video id = 01_019): 98it [07:09, 4.36s/it]
the loss is= tensor(16204.8232, device='cuda:0', grad_fn=)
Computing scores for ShanghaiTech (video id = 01_019): 99it [07:13, 4.37s/it]
the loss is= tensor(15876.1279, device='cuda:0', grad_fn=)
Computing scores for ShanghaiTech (video id = 01_019): 100it [07:18, 4.37s/it]
the loss is= tensor(16464.8418, device='cuda:0', grad_fn=)
```
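For reference, a minimal sketch of the setup described above (Adam, learning rate 0.001, 30 epochs), assuming `model`, `criterion`, `dataset` and `DataLoader` are defined as in the snippet from the first post. The full script is not shown here, so this is only an illustration of how the per-epoch mean could be tracked:

```python
import torch.optim as optim

# Optimizer as described: Adam with learning rate 0.001
optimizer = optim.Adam(model.parameters(), lr=0.001)

num_epochs = 30
for epoch in range(num_epochs):
    running_loss, n_batches = 0.0, 0

    for cl_idx, video_id in enumerate(dataset.train_videos):
        dataset.train(video_id)
        loader = DataLoader(dataset, collate_fn=dataset.collate_fn)

        # Inner loop identical to the snippet in the first post
        for x, y in loader:
            optimizer.zero_grad()
            x = x.to('cuda')
            x_r, z, z_dist = model(x)
            loss = criterion(x, x_r, z, z_dist)
            loss.backward()
            optimizer.step()

            running_loss += loss.item()
            n_batches += 1

    # The value worth monitoring across the 30 epochs is the per-epoch mean,
    # not the ever-growing cumulative sum printed in the log above.
    print(f'epoch {epoch}: mean loss = {running_loss / max(n_batches, 1):.4f}')
```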

weibienao commented 5 years ago

@AmmarKamoona Hi, can you share the complete training code for the ShanghaiTech dataset? Best wishes.

MStumpp commented 4 years ago

@AmmarKamoona Have you been successful with the training? We just started experimenting with the code.