Hi, in your paper you mention that task-specific pre-training also uses masked language modelling, similar to task-agnostic pre-training. However, I cannot find any mask probability in the task-specific pre-training .json files. Why is no probability specified, and did you use the same 15% probability as for task-agnostic pre-training?
Sorry, perhaps I'm missing something here -- thanks for the help!
Sorry, I did not make it clear in the paper. I still use the 15% probability and hard-code it in the code (since I did not plan to tune this probability).
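For reference, here is a minimal sketch of what a hard-coded 15% masked-language-modelling corruption looks like, following the standard BERT recipe (80% `[MASK]`, 10% random token, 10% unchanged). This is an illustration only, not the actual visualbert code; the function and variable names are hypothetical.

```python
import random

MASK_PROB = 0.15  # hard-coded, as described above; not exposed in the .json configs

def mask_tokens(tokens, mask_token="[MASK]", vocab=None):
    """Return (corrupted_tokens, labels) for masked-LM training.

    Each token is selected with probability MASK_PROB; of the selected
    tokens, 80% are replaced by [MASK], 10% by a random vocabulary token,
    and 10% are left unchanged (the standard BERT recipe).
    """
    corrupted, labels = [], []
    for tok in tokens:
        if random.random() < MASK_PROB:
            labels.append(tok)  # the model must predict the original token here
            r = random.random()
            if r < 0.8:
                corrupted.append(mask_token)
            elif r < 0.9 and vocab:
                corrupted.append(random.choice(vocab))
            else:
                corrupted.append(tok)
        else:
            labels.append(None)  # position excluded from the MLM loss
            corrupted.append(tok)
    return corrupted, labels
```

Since the probability is a module-level constant rather than a config entry, changing it would mean editing the code directly, which is consistent with it not appearing in the task-specific .json files.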
Great, thanks! Closing the issue.