raulkviana / MSE-CNN-Implementations

Code database with the implementation of MSE-CNN, from the paper "DeepQTMT: A Deep Learning Approach for Fast QTMT-based CU Partition of Intra-mode VVC".
MIT License

Encountering IndexError in train_stg6.py #3

Closed: SarvaniChalasani closed this issue 8 months ago

SarvaniChalasani commented 11 months ago

File "C:\Users...\project_MSE_CNN\src\msecnn_raulkviana\useful_scripts..\custom_dataset.py", line 492, in getitem
sample = self.get_sample(entry) File "C:\Users...\project_MSE_CNN\src\msecnn_raulkviana\useful_scripts..\custom_dataset.py", line 695, in get_sample color_ch = entry[17] IndexError: list index out of range

Can you kindly help us with this error?

raulkevinviana commented 11 months ago

Hello. Looking at the error message, the first thing that comes to mind is: which dataset are you using? That class expects each sample to contain a certain number of elements.
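
For context, this is the failure mode in miniature (a sketch, using a record with fewer fields than "get_sample" expects; the exact field count depends on the stage the structures were built for):

```python
# get_sample() reads entry[17] (the colour channel), so any record shorter
# than 18 elements triggers the reported IndexError.
entry = [3148684.61, 3682847.71, 0.0, 0.0, 0.0, 0.0]  # a short record
try:
    color_ch = entry[17]
except IndexError as err:
    print(err)  # list index out of range
```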

SarvaniChalasani commented 10 months ago

Hi sir, we have used the same dataset you provided in the GitHub repository. We took the RAISE_TRAIN files from the CPIV-Images dataset.

raulkevinviana commented 10 months ago

> Hello sir, we are contacting you as we are facing issues running the code. I hope you can help us.
> As given in the documentation, we have followed the series of steps below:
>
> - We downloaded RAISE_TRAIN_768x512.yuv from the dataset (Images).
> - We followed the steps given in processsing_code.md:
>   - Encode_dataset.py
>   - create_labels.py
>   - add_ctu_v3.py
>   - mod_struct.py (with only dataset_utils.change_struct_no_dupl_stg2_v4(path_dir_l))
>   - balance.py
> - train_stg6.py
>
> and we are getting an error:
>
> ```
> File "C:\Users...\project_MSE_CNN\src\msecnn_raulkviana\useful_scripts\..\custom_dataset.py", line 492, in __getitem__
>     sample = self.get_sample(entry)
> File "C:\Users...\project_MSE_CNN\src\msecnn_raulkviana\useful_scripts\..\custom_dataset.py", line 695, in get_sample
>     color_ch = entry[17]
> IndexError: list index out of range
> ```
>
> When we tried printing the elements of `entry`, we obtained:
>
> ```
> [3148684.614267015, 3682847.7082909313, 0.0, 0.0, 0.0, 0.0]
> [64, 0]
> [64, 64]
> 0
> tensor([[[245., 244., 244.,  ..., 251., 251., 251.],
>          [243., 243., 243.,  ..., 251., 251., 251.],
>          [245., 243., 245.,  ..., 251., 251., 251.],
>          ...,
>          [248., 249., 248.,  ..., 251., 251., 251.],
>          [248., 248., 249.,  ..., 251., 251., 251.],
>          [248., 248., 249.,  ..., 251., 251., 251.]]])
> 0
> ```

Based on the email you sent me, it's not clear which stage you're aiming to train. Since you've prepared the data for stage 2, you can't expect to use the "train_stg6.py" script without modifications. If stage 6 is your target, then instead of the "change_struct_no_dupl_stg2_v4" method you should use the "change_struct_no_dupl_stg6_v4" method in the "Retrieve essential data" step; it creates structures suitable for training stage 6. However, if you intend to train stage 2, you'll need to adjust the "train" method within "train_stg6.py" to fit the labels you generated and the specific stage you're training, and you'll also need to change the classes that load the data for training, because the data is different for each stage. If you need help with this, let me know.
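
For illustration, the stage-6 variant of the "Retrieve essential data" step would be a one-line swap in mod_struct.py (a sketch; `path_dir_l` is the path variable already used in that script):

```python
# mod_struct.py: build structures for the stage you actually want to train.
# dataset_utils.change_struct_no_dupl_stg2_v4(path_dir_l)  # stage 2 (what was run)
dataset_utils.change_struct_no_dupl_stg6_v4(path_dir_l)    # stage 6 (what train_stg6.py expects)
```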

SarvaniChalasani commented 10 months ago

Thank you sir for your explanation.

raulkevinviana commented 10 months ago

So if your goal is to train stage 2, you can make the following changes in train_stg6.py:


- train()
```python
def train(dataloader, model, loss_fn, optimizer, device):
    """!
    If the batch size is equal to 1, this is Stochastic Gradient Descent (SGD); otherwise it's mini-batch gradient descent.
    If the batch size equals the size of the dataset, it is Batch Gradient Descent.
    """

    # Initialize variable
    size = len(dataloader.dataset)
    size = int(size/batch_size)
    global cnt_train

    # History variables for loss
    loss_RD_lst = []
    loss_CE_lst = []
    loss_lst = []

    # Select the model you want to train, in this case it is the stage 1 and 2 <------------
    # Select model mode
    model[0].train()

    # Loop
    for batch_num, sample_batch in enumerate(dataloader):
        # Obtain data for training
        # Adapt the data that comes from the CustomDataset for the stage you want to train (check the class's function get_sample()) <---------------
        CTU = torch.reshape(sample_batch[0], shape=(-1, 1, 128, 128))
        cu_pos_stg2 = torch.reshape(sample_batch[1], shape=(-1, 2))
        cu_size_stg2 = torch.reshape(sample_batch[2], shape=(-1, 2))
        split = torch.reshape(sample_batch[3], shape=(-1, 1))
        RDs = torch.reshape(sample_batch[4], shape=(-1, 6)) 

        # Zero your gradients for every batch!
        optimizer.zero_grad()

        # Convert type
        CUs = CTU.to(device)
        Y = split.to(device)
        RDs = RDs.to(dtype=torch.float64).to(device)
        Y = train_model_utils.one_hot_enc(torch.tensor(Y.tolist())).to(device)

        # Compute prediction
        # Since you just want to train the first and second stage, remove the not needed stages <--------------------
        # Stage 1 and 2
        pred_stg2, CUs, ap = model[0](CUs, cu_size_stg2, cu_pos_stg2)  # Pass CU through network

        # Compute the loss and its gradients
        loss, loss_CE, loss_RD = loss_fn(pred_stg2, Y, RDs)
        loss.backward()

        # Register the losses
        loss_lst.append(loss.item())
        loss_CE_lst.append(loss_CE.item())
        loss_RD_lst.append(loss_RD.item())
        cnt_train += 1

        # Adjust learning weights
        optimizer.step()

        # Print information about training
        if (batch_num+1) % 1 == 0:

            #Acc_history.append(acc)
            utils.echo("Complete: {percentage:.0%}".format(percentage=batch_num/size))

        # Save model
        if (batch_num+1) % 5 == 0:

            f_name = "last_stg_"  # File name
            for k in range(len(model)):
                # Change the path that you want save the model, in this case stage 1 and 2 <-----------------------
                train_model_utils.save_model_parameters("stg1_2_best_"+files_mod_name_stats, f_name + str(k), model[k])

    # Get mean of losses
    mean_L_RD = np.array(loss_RD_lst).mean()
    mean_L_CE = np.array(loss_CE_lst).mean()
    mean_L = np.array(loss_lst).mean()

    # Register losses per epoch
    writer.add_scalars("Losses/trainPerEpoch", {"Loss": mean_L, "Loss_CE": mean_L_CE, 
                        "Loss_RD": mean_L_RD}, t)

These are the main changes you need to make. Feel free to tell me if you still have problems. I'll also update the documentation to make this clearer.
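
As an aside, the label preparation in the loop above can be pictured like this (a minimal sketch assuming "one_hot_enc" behaves like torch.nn.functional.one_hot over the six split modes, which matches the (-1, 6) shape used for the RD costs; this is an assumption, not the repository's exact implementation):

```python
import torch
import torch.nn.functional as F

# Assumption: split labels are integers in [0, 5], one per QTMT split mode
# (non-split, QT, BT-H, BT-V, TT-H, TT-V).
split = torch.reshape(torch.tensor([0, 1, 5]), (-1, 1))  # same shape as in train()
Y = F.one_hot(split.squeeze(1), num_classes=6).float()
print(Y)  # rows of one-hot vectors, shape (3, 6)
```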

SarvaniChalasani commented 10 months ago

Sure sir, thank you for your clear explanation.

SarvaniChalasani commented 8 months ago

Sir, can you help us with the VTM source code needed to integrate the developed model and check its efficiency?

raulkevinviana commented 8 months ago

Hello. Regarding this I have extremely limited knowledge, so I cannot help. This part of the work was done by a PhD student who was assisting me. Please keep an eye on this ticket: once I know how he modified the VTM source code, I will create a tutorial. However, I doubt he will provide me with that information any time soon, since he is an extremely busy person.

Concerning what is being discussed in this ticket, is everything fine? Can I close it?

SarvaniChalasani commented 8 months ago

Yes, sir. Everything is fine. Thank you for your support and guidance.