JRC1995 / Abstractive-Summarization

Implementation of abstractive summarization using LSTM in the encoder-decoder architecture with local attention.
MIT License

ZeroDivisionError #16

Open JamesTV1996 opened 3 years ago

JamesTV1996 commented 3 years ago

I'm trying your method on a different dataset and I am getting a ZeroDivisionError in the training and validation section. I assume something is not loading properly, because there should be no zero values. Here is the code:

```python
import pickle
import random

with tf.Session() as sess:  # Start Tensorflow Session

    display_step = 100
    patience = 5

    load = input("\nLoad checkpoint? y/n: ")
    print("")
    saver = tf.train.Saver()

    if load.lower() == 'y':

        print('Loading pre-trained weights for the model...')

        saver.restore(sess, r'C:\Users\james\Desktop\Title Generation - SENG 6245\Dataset250K.csv')
        sess.run(tf.global_variables())
        sess.run(tf.tables_initializer())

        with open(r'C:\Users\james\Desktop\Title Generation - SENG 6245\Dataset250K.csv', 'rb') as fp:
            train_data = pickle.load(fp)

        covered_epochs = train_data['covered_epochs']
        best_loss = train_data['best_loss']
        impatience = 0

        print('\nRESTORATION COMPLETE\n')

    else:
        best_loss = 2**30
        impatience = 0
        covered_epochs = 0

        init = tf.global_variables_initializer()
        sess.run(init)
        sess.run(tf.tables_initializer())

    epoch = 0
    while (epoch + covered_epochs) < epochs:

        print("\n\nSTARTING TRAINING\n\n")

        batches_indices = [i for i in range(0, len(train_batches_text))]
        random.shuffle(batches_indices)

        total_train_acc = 0
        total_train_loss = 0

        for i in range(0, len(train_batches_text)):

            j = int(batches_indices[i])

            cost, prediction, \
                acc, _ = sess.run([cross_entropy,
                                   outputs,
                                   accuracy,
                                   train_op],
                                  feed_dict={tf_text: train_batches_text[j],
                                             tf_embd: embd,
                                             tf_summary: train_batches_summary[j],
                                             tf_true_summary_len: train_batches_true_summary_len[j],
                                             tf_train: True})

            total_train_acc += acc
            total_train_loss += cost

            if i % display_step == 0:
                print("Iter "+str(i)+", Cost= " +
                      "{:.3f}".format(cost)+", Acc = " +
                      "{:.2f}%".format(acc*100))

            if i % 500 == 0:

                idx = random.randint(0, len(train_batches_text[j])-1)

                text = " ".join([idx2vocab.get(vec, "<UNK>") for vec in train_batches_text[j][idx]])
                predicted_summary = [idx2vocab.get(vec, "<UNK>") for vec in prediction[idx]]
                actual_summary = [idx2vocab.get(vec, "<UNK>") for vec in train_batches_summary[j][idx]]

                print("\nSample Text\n")
                print(text)
                print("\nSample Predicted Summary\n")
                for word in predicted_summary:
                    if word == '<EOS>':
                        break
                    else:
                        print(word, end=" ")
                print("\n\nSample Actual Summary\n")
                for word in actual_summary:
                    if word == '<EOS>':
                        break
                    else:
                        print(word, end=" ")
                print("\n\n")

        print("\n\nSTARTING VALIDATION\n\n")

        total_val_loss = 0
        total_val_acc = 0

        for i in range(0, len(val_batches_text)):

            if i % 100 == 0:
                print("Validating data # {}".format(i))

            cost, prediction, \
                acc = sess.run([cross_entropy,
                                outputs,
                                accuracy],
                               feed_dict={tf_text: val_batches_text[i],
                                          tf_embd: embd,
                                          tf_summary: val_batches_summary[i],
                                          tf_true_summary_len: val_batches_true_summary_len[i],
                                          tf_train: False})

            total_val_loss += cost
            total_val_acc += acc

        # Issue starts here
        try:
            avg_val_loss = total_val_loss / len(val_batches_text)
        except ZeroDivisionError:
            avg_val_loss = 0

        print("\n\nEpoch: {}\n\n".format(epoch+covered_epochs))
        print("Average Training Loss: {:.3f}".format(total_train_loss/len(train_batches_text)))
        print("Average Training Accuracy: {:.2f}".format(100*total_train_acc/len(train_batches_text)))
        print("Average Validation Loss: {:.3f}".format(avg_val_loss))
        print("Average Validation Accuracy: {:.2f}".format(100*total_val_acc/len(val_batches_text)))

        if (avg_val_loss < best_loss):
            best_loss = avg_val_loss
            save_data = {'best_loss': best_loss, 'covered_epochs': covered_epochs+epoch+1}
            impatience = 0
            with open('Model_Backup/Seq2seq_summarization.pkl', 'wb') as fp:
                pickle.dump(save_data, fp)
            saver.save(sess, 'Model_Backup/Seq2seq_summarization.ckpt')
            print("\nModel saved\n")

        else:
            impatience += 1

        if impatience > patience:
            break

        epoch += 1
```

I can get rid of the error with exception handling, but I was wondering if you had any idea why it's not working in the first place.

JRC1995 commented 3 years ago

I haven't touched Tensorflow in a while, but wouldn't this line be for loading a checkpoint?

`saver.restore(sess, r'C:\Users\james\Desktop\Title Generation - SENG 6245\Dataset250K.csv')`

That doesn't look like a normal Tensorflow checkpoint.
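For reference, `saver.restore` expects the checkpoint that `saver.save()` writes further down in your snippet (the `Model_Backup/Seq2seq_summarization.ckpt` files), and the pickle is a separate file that only stores `best_loss` and `covered_epochs`. Roughly something like this (the paths are placeholders, adjust them to wherever your save block actually wrote):

```python
# Sketch of the restore branch, assuming the checkpoint and pickle were
# produced by the save block further down in the same snippet.
saver.restore(sess, 'Model_Backup/Seq2seq_summarization.ckpt')  # a TF checkpoint, not a .csv
sess.run(tf.tables_initializer())

with open('Model_Backup/Seq2seq_summarization.pkl', 'rb') as fp:  # pickled training state
    train_data = pickle.load(fp)

covered_epochs = train_data['covered_epochs']
best_loss = train_data['best_loss']
```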

`for i in range(0, len(val_batches_text)):`

This is the first mention of val_batches_text that I found in the code. The issue seems to be that val_batches_text is completely empty, i.e. your validation dataset is empty. That would mean the bug is probably somewhere outside this code snippet, wherever you are preparing val_batches_text.
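You could confirm that with a quick sanity check right before the session block (variable names taken from your snippet):

```python
# If either of these is 0, the problem is in the batching/preprocessing
# step, not in the training loop itself.
print("train batches:", len(train_batches_text))
print("val batches:  ", len(val_batches_text))
assert len(val_batches_text) > 0, "validation set is empty - check the preprocessing filters"
```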

Where are you loading the datasets btw?

JamesTV1996 commented 3 years ago

I loaded a dataset that I cleaned on my own PC. I first ran it through the preprocessing program you posted as well. For my project, I was trying to adapt your implementation so that I can generate a title based on the description of a given problem; I assume it would work for this application. The dataset I used was originally downloaded from the StackOverflow dataset found on Google Cloud.

JRC1995 commented 3 years ago

It should work, but the main error seems to be that len(val_batches_text) is 0. That means the source of the bug is in the code where val_batches_text is being created; for some reason no data is being loaded. If you are using my pre-processing, it has some heavy filters. Is it possible that no data is passing through the filters when you preprocess your dataset?
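One way to check is to count how many of your samples actually survive the length filters after tokenization. A rough sketch, assuming `texts` and `summaries` are your tokenized lists (the exact filter conditions in the notebook may differ slightly, the thresholds here mirror its defaults):

```python
# Count how many (text, summary) pairs pass the length filters.
text_max_len = 500
text_min_len = 25
summary_max_len = 30

kept = sum(1 for t, s in zip(texts, summaries)
           if text_min_len <= len(t) <= text_max_len and len(s) <= summary_max_len)

print("{} of {} pairs survive the filters".format(kept, len(texts)))
```

If `kept` comes out as 0, that would explain the empty val_batches_text.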

vasanthaganesh commented 2 years ago

I've tried what you said in the last comment, but the dataset is not loading and it returns the ZeroDivisionError. We are using your pre-processing, so please help me relax the heavy filters.

JRC1995 commented 2 years ago

You probably have to figure out what max/min lengths are suitable for your dataset: https://github.com/JRC1995/Abstractive-Summarization/blob/master/Data_Pre-Processing.ipynb. You will probably have to change `text_max_len = 500`, `text_min_len = 25`, and `summary_max_len = 30`.
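To pick sensible values, you could look at the length distribution of your own data first. A minimal sketch, again assuming `texts` and `summaries` are your tokenized lists:

```python
import numpy as np

# Inspect the token-length distribution to choose suitable cutoffs.
text_lens = [len(t) for t in texts]
summary_lens = [len(s) for s in summaries]

print("text lengths    5th/50th/95th percentile:", np.percentile(text_lens, [5, 50, 95]))
print("summary lengths 5th/50th/95th percentile:", np.percentile(summary_lens, [5, 50, 95]))
```

Setting the min/max cutoffs around those percentiles should keep most of your data while still filtering outliers.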

vasanthaganesh commented 2 years ago

[screenshot attached showing the error]

I have changed the max length/min length values as you said, but it still gives the error. How can I rectify this? I have used only the dataset reviews.csv, with 50,000 lines.