atulkum / pointer_summarizer

pytorch implementation of "Get To The Point: Summarization with Pointer-Generator Networks"
Apache License 2.0

Decoder output #6

Closed ankitnit closed 4 years ago

ankitnit commented 6 years ago

I am getting the same output for all the batches.

atulkum commented 6 years ago

You mean you are getting similar output for examples within a single batch? During decode the batch size is the same as the beam size, so similar output within a single batch is expected. You can compare with the output I got here.

ankitnit commented 6 years ago

No, I have trained the model for 410000 iterations with is_coverage=true, and I am getting the same summary for every batch when I run decode.py.

atulkum commented 6 years ago

You might not want to train with is_coverage=true from the start; that makes training unstable. You can try training with is_coverage=false first and then compare the results.

Can you send me some example output? I want to see if it is a random string.

ankitnit commented 6 years ago

I tried with is_coverage=false and ran 175000 iterations, and I am comparing the results: it still generates the same summary for every batch.

previous result with is_coverage=True https://drive.google.com/open?id=1vjMsFpoQxdSQCukLxnRNK_ghjJD3ENnG

Yifan-Song793 commented 4 years ago

Just check your input. The batcher expects string input, not binary. If your inputs are bytes, decode them to strings first.
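To illustrate the symptom, here is a standalone Python 3 snippet (the example byte string is made up for illustration, not taken from the dataset):

```python
# Records read from binary data files arrive as bytes under Python 3.
abstract = b"<s> germany beat argentina 2-0 . </s>"

# Tokenizing bytes silently misbehaves: splitting a bytes object yields
# bytes tokens, which never match the (str) entries of the vocabulary,
# so every token maps to UNK and the decoder emits the same summary.
bad_tokens = abstract.split()
assert isinstance(bad_tokens[0], bytes)

# Decoding to str first restores normal tokenization.
text = abstract.decode('utf-8')
good_tokens = text.split()
assert isinstance(good_tokens[0], str)
```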

rainsher commented 4 years ago

have you solved the problem? @ankitnit

rainsher commented 4 years ago

As @DominicSong said, I solved my problem by converting the article and abstract from bytes to strings at around line 217 of batcher.py (I run Python 3):

```python
article = str(article, encoding='utf-8')
abstract = str(abstract, encoding='utf-8')
abstract_sentences = [sent.strip() for sent in data.abstract2sents(abstract)]
```

EmmittXu commented 4 years ago

> You mean you are getting similar output for example in a single batch. During decode the batch size is same as beam size and all similar output in s single batch is as expected. You can compare with the output which I got here

Hi, I don't understand why the beam size and batch size are equal in the current decode. When I play with it and set them equal the code works fine; otherwise it throws a dimension mismatch error. I believe these two are independent, so might there be a better decode implementation?

atulkum commented 4 years ago

This is done to take advantage of GPU batch processing. If the beam size is B, you need to run B RNNs in parallel. Using a batch for the beam makes the decoding code cleaner and computationally efficient on the GPU. If you want to increase the beam size, increase the batch size.
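To make the point concrete, here is a minimal numpy sketch of one step of beam search run as a batch (my own illustration, not the repo's code): the B live hypotheses are stacked along the batch dimension, scored in one forward pass, and the top B continuations are kept.

```python
import numpy as np

def beam_step(log_probs, beam_scores, beam_size):
    """One step of batched beam search.

    log_probs: (B, V) next-token log-probabilities for each of the B
        current hypotheses, computed in a single batched forward pass
        (this is why batch size == beam size during decode).
    beam_scores: (B,) cumulative log-probabilities of the hypotheses.
    Returns the new (B,) scores, and for each kept continuation the
    index of its parent hypothesis and the chosen token id.
    """
    B, V = log_probs.shape
    # Total score of every (hypothesis, next-token) pair.
    total = beam_scores[:, None] + log_probs          # shape (B, V)
    flat = total.ravel()
    top = np.argsort(flat)[::-1][:beam_size]          # best B continuations
    parents, tokens = np.unravel_index(top, (B, V))
    return flat[top], parents, tokens
```

With two hypotheses over a three-word vocabulary, `beam_step` keeps the two highest-scoring (parent, token) pairs out of all six candidates; increasing the beam just means feeding a larger batch through the same step.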