I checked the code and found this line is responsible for the OOM problem: https://github.com/marcotcr/checklist/blob/64a810a3586f276521dda73d4223c1e1e5ae7151/checklist/text_generation.py#L168
In my view, the forward pass should be done in batches: split the `to_pred` tensor, feed the chunks to the model one at a time, and merge the results at the end. The batch size should also be configurable through the API, which might add some workload. Maybe I'll make a PR later...
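For concreteness, a minimal sketch of what I mean (`predict_in_batches` and its arguments are stand-ins for the objects around that line, not a drop-in patch):

```python
import torch

def predict_in_batches(model, to_pred, batch_size=128):
    """Run the model on `to_pred` in chunks of `batch_size` along dim 0,
    then concatenate the outputs, so peak GPU memory stays bounded."""
    outputs = []
    with torch.no_grad():
        for chunk in torch.split(to_pred, batch_size):
            # [0] assumes the model returns logits first in its output
            # tuple -- an assumption, adjust to the actual return value.
            outputs.append(model(chunk)[0])
    return torch.cat(outputs, dim=0)
```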
Beam size does not matter if there is only one mask, but I forgot that the same word may appear multiple times in a sentence, in which case beam size does matter (I'm guessing that happened in your example). I just set `beam_size=100` in 832b5ba, but I should probably revisit this at some point if it still causes trouble. Please let me know if that doesn't fix it for you.
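To illustrate the point, here is a toy beam-fill sketch (my own stand-in, not the library's code): with one masked slot the hypothesis set never grows past that slot's candidates, but with several slots each step multiplies hypotheses, so the beam cap is what keeps memory bounded.

```python
import heapq

def beam_fill(candidates_per_slot, beam_size):
    """Toy beam search over masked slots. Each slot is a list of
    (token, log_prob) candidates; returns the top hypotheses."""
    beams = [((), 0.0)]  # (tokens_so_far, cumulative_log_prob)
    for slot in candidates_per_slot:
        # Expansion multiplies: len(beams) * len(slot) partial hypotheses.
        expanded = [(toks + (tok,), score + s)
                    for toks, score in beams
                    for tok, s in slot]
        # The cap: without it, m slots of k candidates grow to k**m.
        beams = heapq.nlargest(beam_size, expanded, key=lambda b: b[1])
    return beams
```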
Thanks! This is the best kind of issue, where someone identifies a problem and tells me the solution right away : )
Thanks for your reply! In my case, the problem seems a bit more complicated: even with `beam_size=10000`, the search often returned nothing.
So it might be hard to find a proper threshold, or maybe the search process could be improved? Looking forward to further discussion :)

Are you using `synonyms`? Often the search will not return anything because there are no synonyms in wordnet, or because the synonyms that are there do not fit the sentence.
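A quick way to see whether wordnet has any candidates at all (an nltk illustration of that failure mode, not the exact lookup checklist uses):

```python
# Requires a one-time: nltk.download('wordnet')
from nltk.corpus import wordnet as wn

def wordnet_synonyms(word):
    # Collect all lemma names across the word's synsets, minus the word itself.
    lemmas = {l.name().replace('_', ' ')
              for syn in wn.synsets(word) for l in syn.lemmas()}
    lemmas.discard(word)
    return sorted(lemmas)

print(wordnet_synonyms('movie'))        # non-empty: film, pic, flick, ...
print(wordnet_synonyms('serendipity'))  # likely empty -> search returns nothing
```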
Yep, I was referring to the context-based search part. 😂 The wordnet part works fine in my case.
When using the perturb functions to replace words with synonyms, this line causes a CUDA OOM error: https://github.com/marcotcr/checklist/blob/64a810a3586f276521dda73d4223c1e1e5ae7151/checklist/text_generation.py#L270
The beam size is unbounded. Would it be possible to make it configurable when calling the `antonyms`/`synonyms` APIs, so that the memory cost is more controllable? BTW, thank you for such great work!
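For concreteness, something like this hypothetical call is what I have in mind (the `beam_size`/`batch_size` keyword arguments do not exist in the current API; they are just the shape of the request):

```python
from checklist.editor import Editor

editor = Editor()
ret = editor.synonyms('This is a great movie.', 'great',
                      beam_size=100,   # hypothetical: cap on the search beam
                      batch_size=32)   # hypothetical: cap on tensors per forward pass
```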