Hi!
Decoding seems to have some issues at beam > 1: when a summary has around 20 sentences, the reranking has to do too many calculations and just gets stuck.
My solution was to take only the first 20 sentences for the summary and skip the rest when decoding. I'm on holiday right now, but I can share my code later if you want.
Kind regards,
Nick
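For reference, a minimal sketch of this truncation idea. The function name and the assumption that the extractor returns a list of sentence indices are illustrative, not code from Nick's branch:

```python
def truncate_extraction(ext_indices, max_sents=20):
    """Keep only the first max_sents extracted sentence indices.

    Per the diagnosis above, the beam reranker's work grows quickly
    with the number of sentences (roughly beam_size ** n if it scores
    all combinations of per-sentence beams), so capping n keeps
    decoding from getting stuck on long articles.
    """
    return ext_indices[:max_sents]

# An article where the extractor picked 30 sentences gets capped at 20.
picked = list(range(30))
print(len(truncate_extraction(picked)))  # 20
```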
Thank you for your response!
I checked the config I was using and can confirm that my beam was set to 1. However, the documents to summarize can be quite long, which may be the issue. Could you suggest where I can modify the code so that only the first 20 sentences are used for the summary? I will give it a try then.
Thanks again!
You can check out my branch here:
I made the changes somewhere in the decoding functions. Cheers!
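Since the branch link did not survive, here is a sketch of where such a cap could sit in a decode loop. The structure only loosely mirrors decode_full_model.py, and extractor/abstractor here are stand-in stubs, not the project's actual classes:

```python
MAX_SENTS = 20  # cap discussed in this thread

def extractor(sents):
    # Stub: pretend the extractor selects every sentence, in order.
    return list(range(len(sents)))

def abstractor(sent_list):
    # Stub: pretend abstraction returns each sentence unchanged.
    return sent_list

def decode(articles):
    summaries = []
    for raw_art_sents in articles:
        # Cap here, before the (real pipeline's) beam-search reranking.
        ext = extractor(raw_art_sents)[:MAX_SENTS]
        ext_arts = [raw_art_sents[i] for i in ext]
        summaries.append(abstractor(ext_arts))
    return summaries

article = [f"sentence {i}." for i in range(30)]
print(len(decode([article])[0]))  # 20 sentences survive the cap
```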
Great! Thanks a lot!
Thanks for the solution.
Hi! I am trying to summarize some text files (not from the CNN/Daily Mail dataset) using your decode_full_model.py. When the number of files is not very big, i.e. around 1,000, it works perfectly fine. However, I have around 1 million files to summarize in total, and the decoding process got stuck at the 1320th file. I restarted the decoding multiple times, and each time it got stuck at the same file. You can see the output in the screenshot below.
I am wondering what could cause this. I did not modify the code; I only preprocessed my files so that they have the same structure as the one suggested on the website.