aleximmer / Laplace

Laplace approximations for Deep Learning.
https://aleximmer.github.io/Laplace
MIT License

Feature Request - Implementation for BERT #114

Closed Nikola12344 closed 2 months ago

Nikola12344 commented 1 year ago

To start with, great work so far! You made a very useful library.

I would like to apply a last-layer Laplace approximation to BERT. To be used with the Laplace library, a network must be deterministic and supported by the library. BERT uses a mask, which I think is randomly assigned, and this messes things up. I also couldn't find a list of models supported by the Laplace library.

Is it possible to use some version of BERT with the Laplace library, and if so, how? I am using the multilingual BERT PyTorch model from HuggingFace.

Here is my Stack Overflow question: https://stackoverflow.com/questions/73599356/how-to-implement-laplace-posteriori-approximation-on-bert-in-pytorch/73636234#73636234

Best, Nikola Greb

runame commented 1 year ago

Hi Nikola,

Thanks for your interest in our library. At first glance the reply you got on Stack Overflow looks correct to me and the issue is not related to any randomness in the forward pass. Can you post your updated code which includes your implementation of the suggested solution?
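
In case it helps, here is a minimal sketch of one way to make a HuggingFace BERT usable with the library (illustrative only, not necessarily identical to the Stack Overflow solution; the wrapper class name, checkpoint, pad token id, and dummy data are assumptions): wrap the model so that `forward` takes a single tensor and returns plain logits, and put it in eval mode so the forward pass is deterministic.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from transformers import BertForSequenceClassification
from laplace import Laplace


class BertWrapper(nn.Module):
    """Wrap a HuggingFace BERT so that forward(input_ids) -> logits tensor.

    Laplace expects a model whose forward takes a single batch tensor and
    returns a tensor, so the attention mask is rebuilt inside the wrapper
    from the padding token instead of being passed in separately.
    """

    def __init__(self, hf_model, pad_token_id):
        super().__init__()
        self.hf_model = hf_model
        self.pad_token_id = pad_token_id

    def forward(self, input_ids):
        attention_mask = (input_ids != self.pad_token_id).long()
        out = self.hf_model(input_ids=input_ids, attention_mask=attention_mask)
        return out.logits


hf_model = BertForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2
)
# eval() disables dropout, which is what makes BERT's forward pass random.
model = BertWrapper(hf_model, pad_token_id=0).eval()

# Dummy data just to keep the sketch self-contained; replace with real batches
# of (input_ids, labels).
input_ids = torch.randint(1000, 2000, (8, 16))
labels = torch.randint(0, 2, (8,))
train_loader = DataLoader(TensorDataset(input_ids, labels), batch_size=4)

la = Laplace(model, "classification",
             subset_of_weights="last_layer",
             hessian_structure="kron")
la.fit(train_loader)
la.optimize_prior_precision()
```

With this setup the library should detect the classification head as the last layer during the first forward pass; the key point is only that the wrapper hides the mask handling from Laplace.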

Nikola12344 commented 1 year ago

Hi Runa,

Thank you for your quick response.

I attached the Jupyter Notebook with my debugging efforts. Please let me know what I can do better.

I'm working in a team with Gerald and Danko. Danko is our manager, and Gerald is a senior data scientist.

Gerald made some progress by changing the fit function inside the class ParametricLaplace(BaseLaplace) in the source code (lines 348-385) so that it accepts the mask, but he still gets random results for the last-layer Laplace approximation. https://github.com/AlexImmer/Laplace/blob/main/laplace/baselaplace.py
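
One thing worth ruling out (a guess, not a diagnosis of the patched fit function): run-to-run differences in BERT usually come from dropout rather than from the attention mask. A quick standalone check, with an illustrative checkpoint and random token ids:

```python
import torch
from transformers import BertForSequenceClassification

hf_model = BertForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2
)
input_ids = torch.randint(1000, 2000, (2, 16))
attention_mask = torch.ones_like(input_ids)

# In train mode, dropout is active, so two identical forward passes differ.
hf_model.train()
with torch.no_grad():
    a = hf_model(input_ids=input_ids, attention_mask=attention_mask).logits
    b = hf_model(input_ids=input_ids, attention_mask=attention_mask).logits
print(torch.allclose(a, b))  # typically False

# In eval mode, dropout is off and the forward pass is deterministic.
hf_model.eval()
with torch.no_grad():
    a = hf_model(input_ids=input_ids, attention_mask=attention_mask).logits
    b = hf_model(input_ids=input_ids, attention_mask=attention_mask).logits
print(torch.allclose(a, b))  # True
```

If the second check passes but the Laplace results still vary between runs, the randomness is more likely coming from the modified fit loop than from the model itself.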

We would be happy to contribute the fix back to the project if we manage to solve this issue before you do.

Stack Overflow (once more if someone in CC wants to take a look): https://stackoverflow.com/questions/73599356/how-to-implement-laplace-posteriori-approximation-on-bert-in-pytorch/73636234#73636234

Best, Nikola


wiseodd commented 4 months ago

This should be fixed by #144. That PR has been successfully used for https://github.com/wiseodd/lapeft-bayesopt, i.e. running Laplace on HuggingFace LLMs such as LLaMA-2-7B and T5.
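
For anyone landing here later, the core Laplace pattern used in such setups looks roughly like the sketch below. It is only a loose, self-contained analogy with a toy regression head standing in for the LLM-based model (the feature dimension, head architecture, and data are made up); see the linked PR and repository for how the actual HuggingFace models are handled.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from laplace import Laplace

# Toy stand-in: pretend these are fixed features and scalar targets.
feats = torch.randn(32, 768)
targets = torch.randn(32, 1)
loader = DataLoader(TensorDataset(feats, targets), batch_size=8)

# Small head whose final Linear layer gets the Laplace treatment.
head = nn.Sequential(nn.Linear(768, 128), nn.ReLU(), nn.Linear(128, 1))

la = Laplace(head, "regression",
             subset_of_weights="last_layer",
             hessian_structure="kron")
la.fit(loader)
la.optimize_prior_precision()

# Posterior predictive mean and variance for new inputs.
f_mu, f_var = la(torch.randn(4, 768))
print(f_mu.shape, f_var.shape)
```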