raulkviana / MSE-CNN-Implementations

Code database with the implementation of MSE-CNN, from the paper "DeepQTMT: A Deep Learning Approach for Fast QTMT-based CU Partition of Intra-mode VVC".
MIT License

Encountering IndexError in train_stg6.py #3

Closed: SarvaniChalasani closed this issue 8 months ago

SarvaniChalasani commented 11 months ago

File "C:\Users...\project_MSE_CNN\src\msecnn_raulkviana\useful_scripts..\custom_dataset.py", line 492, in getitem
sample = self.get_sample(entry) File "C:\Users...\project_MSE_CNN\src\msecnn_raulkviana\useful_scripts..\custom_dataset.py", line 695, in get_sample color_ch = entry[17] IndexError: list index out of range

Can you kindly help us with this error?

raulkevinviana commented 11 months ago

Hello. Looking at the error message, the first thing that comes to mind is: which dataset are you using? That class expects each sample to contain a certain number of elements.
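
For context, this is the failure mode in miniature (a sketch, using a record with fewer fields than "get_sample" expects; the exact field count depends on the stage the structures were built for):

```python
# get_sample() reads entry[17] (the colour channel), so any record shorter
# than 18 elements triggers the reported IndexError.
entry = [3148684.61, 3682847.71, 0.0, 0.0, 0.0, 0.0]  # a short record
try:
    color_ch = entry[17]
except IndexError as err:
    print(err)  # list index out of range
```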

SarvaniChalasani commented 10 months ago

Hi sir, we have used the same dataset you provided in the GitHub repository. We took the RAISE_TRAIN files from the CPIV-Images dataset.

raulkevinviana commented 10 months ago

> Hello sir, we are contacting you as we are facing issues running the code. I hope you can help us.
> As given in the documentation, we have followed the series of steps below:
>
> - We downloaded RAISE_TRAIN_768x512.yuv from the dataset (Images).
> - We followed the steps given in processsing_code.md:
>   - Encode_dataset.py
>   - create_labels.py
>   - add_ctu_v3.py
>   - mod_struct.py (with only dataset_utils.change_struct_no_dupl_stg2_v4(path_dir_l))
>   - balance.py
> - train_stg6.py
>
> and we are getting an error:
>
> ```
> File "C:\Users...\project_MSE_CNN\src\msecnn_raulkviana\useful_scripts\..\custom_dataset.py", line 492, in __getitem__
>     sample = self.get_sample(entry)
> File "C:\Users...\project_MSE_CNN\src\msecnn_raulkviana\useful_scripts\..\custom_dataset.py", line 695, in get_sample
>     color_ch = entry[17]
> IndexError: list index out of range
> ```
>
> When we tried printing the elements of `entry`, we obtained:
>
> ```
> [3148684.614267015, 3682847.7082909313, 0.0, 0.0, 0.0, 0.0]
> [64, 0]
> [64, 64]
> 0
> tensor([[[245., 244., 244.,  ..., 251., 251., 251.],
>          [243., 243., 243.,  ..., 251., 251., 251.],
>          [245., 243., 245.,  ..., 251., 251., 251.],
>          ...,
>          [248., 249., 248.,  ..., 251., 251., 251.],
>          [248., 248., 249.,  ..., 251., 251., 251.],
>          [248., 248., 249.,  ..., 251., 251., 251.]]])
> 0
> ```

Based on the email you sent me, it's not clear which stage you're aiming to train. Since you've prepared the data for stage 2, you can't expect to use the "train_stg6.py" script without modifications. If stage 6 is your target, then instead of the "change_struct_no_dupl_stg2_v4" method you should use the "change_struct_no_dupl_stg6_v4" method in the "Retrieve essential data" step; it creates structures suitable for training stage 6. However, if you intend to train stage 2, you'll need to adjust the "train" method within "train_stg6.py" to fit the labels you generated and the specific stage you're training, and you'll also need to change the classes that load the data for training, because the data is different for each stage. If you need help with this, let me know.
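
For illustration, the stage-6 variant of the "Retrieve essential data" step would be a one-line swap in mod_struct.py (a sketch; `path_dir_l` is the path variable already used in that script):

```python
# mod_struct.py: build structures for the stage you actually want to train.
# dataset_utils.change_struct_no_dupl_stg2_v4(path_dir_l)  # stage 2 (what was run)
dataset_utils.change_struct_no_dupl_stg6_v4(path_dir_l)    # stage 6 (what train_stg6.py expects)
```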

SarvaniChalasani commented 10 months ago

Thank you sir for your explanation.

raulkevinviana commented 10 months ago

So if your goal is to train stage 2, you can make the following changes in train_stg6.py:


- train()
```python
def train(dataloader, model, loss_fn, optimizer, device):
    """!
    If the batch size is equal to 1, this is Stochastic Gradient Descent (SGD); otherwise it's mini-batch gradient descent.
    If the batch size equals the size of the dataset, it is Batch Gradient Descent.
    """

    # Initialize variable
    size = len(dataloader.dataset)
    size = int(size/batch_size)
    global cnt_train

    # History variables for loss
    loss_RD_lst = []
    loss_CE_lst = []
    loss_lst = []

    # Select the model you want to train, in this case it is the stage 1 and 2 <------------
    # Select model mode
    model[0].train()

    # Loop
    for batch_num, sample_batch in enumerate(dataloader):
        # Obtain data for training
        # Adapt the data that comes from the CustomDataset for the stage you want to train (check the class's function get_sample()) <---------------
        CTU = torch.reshape(sample_batch[0], shape=(-1, 1, 128, 128))
        cu_pos_stg2 = torch.reshape(sample_batch[1], shape=(-1, 2))
        cu_size_stg2 = torch.reshape(sample_batch[2], shape=(-1, 2))
        split = torch.reshape(sample_batch[3], shape=(-1, 1))
        RDs = torch.reshape(sample_batch[4], shape=(-1, 6)) 

        # Zero your gradients for every batch!
        optimizer.zero_grad()

        # Convert type
        CUs = CTU.to(device)
        Y = split.to(device)
        RDs = RDs.to(dtype=torch.float64).to(device)
        Y = train_model_utils.one_hot_enc(torch.tensor(Y.tolist())).to(device)

        # Compute prediction
        # Since you just want to train the first and second stage, remove the not needed stages <--------------------
        # Stage 1 and 2
        pred_stg2, CUs, ap = model[0](CUs, cu_size_stg2, cu_pos_stg2)  # Pass CU through network

        # Compute the loss and its gradients
        loss, loss_CE, loss_RD = loss_fn(pred_stg2, Y, RDs)
        loss.backward()

        # Register the losses
        loss_lst.append(loss.item())
        loss_CE_lst.append(loss_CE.item())
        loss_RD_lst.append(loss_RD.item())
        cnt_train += 1

        # Adjust learning weights
        optimizer.step()

        # Print information about training
        if (batch_num+1) % 1 == 0:

            #Acc_history.append(acc)
            utils.echo("Complete: {percentage:.0%}".format(percentage=batch_num/size))

        # Save model
        if (batch_num+1) % 5 == 0:

            f_name = "last_stg_"  # File name
            for k in range(len(model)):
                # Change the path that you want save the model, in this case stage 1 and 2 <-----------------------
                train_model_utils.save_model_parameters("stg1_2_best_"+files_mod_name_stats, f_name + str(k), model[k])

    # Get mean of losses
    mean_L_RD = np.array(loss_RD_lst).mean()
    mean_L_CE = np.array(loss_CE_lst).mean()
    mean_L = np.array(loss_lst).mean()

    # Register losses per epoch
    writer.add_scalars("Losses/trainPerEpoch", {"Loss": mean_L, "Loss_CE": mean_L_CE, 
                        "Loss_RD": mean_L_RD}, t)

These are the main changes you need to make. Feel free to tell me if you still have problems. I'll also update the documentation to make this clearer.
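
As an aside, the label preparation in the loop above can be pictured like this (a minimal sketch assuming "one_hot_enc" behaves like torch.nn.functional.one_hot over the six split modes, which matches the (-1, 6) shape used for the RD costs; this is an assumption, not the repository's exact implementation):

```python
import torch
import torch.nn.functional as F

# Assumption: split labels are integers in [0, 5], one per QTMT split mode
# (non-split, QT, BT-H, BT-V, TT-H, TT-V).
split = torch.reshape(torch.tensor([0, 1, 5]), (-1, 1))  # same shape as in train()
Y = F.one_hot(split.squeeze(1), num_classes=6).float()
print(Y)  # rows of one-hot vectors, shape (3, 6)
```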

SarvaniChalasani commented 10 months ago

Sure sir, thank you for your clear explanation.

SarvaniChalasani commented 8 months ago

Sir, can you help us with the VTM source code needed to integrate the developed model and check its efficiency?

raulkevinviana commented 8 months ago

Hello. Regarding this I have extremely limited knowledge, so I cannot help. This part of the work was done by a PhD student who was assisting me. Please keep an eye on this ticket: once I know how he modified the VTM source code, I will create a tutorial. However, I doubt he will provide me with that information any time soon, since he is an extremely busy person.

Concerning what is being discussed in this ticket, is everything fine? Can I close it?

SarvaniChalasani commented 8 months ago

Yes, sir. Everything is fine. Thank you for your support and guidance.