EleutherAI / project-menu

See the issue board for the current status of active and prospective projects!

[Project] Are large LMs ensembles of shallow paths? #13

Closed ConnorJL closed 1 year ago

ConnorJL commented 3 years ago
jmerizia commented 3 years ago

I spoke with Connor earlier today, and I think I can give this one a try.

StellaAthena commented 3 years ago

@jmerizia Opened a repo for this project: https://github.com/EleutherAI/attention-ensemble

ConnorJL commented 2 years ago

Jacob did some good work here, but ultimately the FF layers in actual transformers ruin the abstraction as they "mix" all the different paths, and we haven't come up with any good idea of how to fix this problem. If anyone can come up with a new theoretical abstraction to generalize this analysis to full transformers, I'd love to hear it, otherwise this project is probably not possible.
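To make the "mixing" concrete, here is a toy sketch. It is not the analysis Jacob ran. It assumes each attention layer can be treated as a fixed linear map on the residual stream, which is a simplification because real attention patterns depend on the input. The matrices `A1` and `A2` and the vector `x` are made-up values. With two linear residual blocks the output is exactly the sum of the four paths through them. Once the second block is a ReLU feed-forward layer, summing the per-path contributions no longer reproduces the output.

```python
import numpy as np

# Hypothetical toy values, chosen only so the arithmetic is easy to follow.
x = np.array([1.0, -2.0])
A1 = np.array([[0.0, 1.0], [1.0, 0.0]])  # stand-in for a frozen attention layer
A2 = np.array([[2.0, 0.0], [0.0, 1.0]])  # weights of the second block


def relu(v):
    return np.maximum(v, 0.0)


# Residual stream after the first (linear) block.
h = x + A1 @ x

# Case 1: second block is linear, so (I + A2)(I + A1) x expands into 4 paths.
full_linear = h + A2 @ h
paths_linear = [x, A1 @ x, A2 @ x, A2 @ (A1 @ x)]
linear_gap = np.linalg.norm(full_linear - sum(paths_linear))

# Case 2: second block is a toy FF layer, relu(A2 @ h).
# Pushing each path through the FF layer separately and summing
# is not the same as pushing the combined stream through it.
full_ff = h + relu(A2 @ h)
paths_ff = [x, A1 @ x, relu(A2 @ x), relu(A2 @ (A1 @ x))]
ff_gap = np.linalg.norm(full_ff - sum(paths_ff))

print("linear gap:", linear_gap)
print("FF gap:", ff_gap)
```

The linear gap is zero while the FF gap is not, because `relu(a + b)` differs from `relu(a) + relu(b)` whenever `a` and `b` disagree in sign on some coordinate.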

StellaAthena commented 2 years ago

> Jacob did some good work here, but ultimately the FF layers in actual transformers ruin the abstraction as they "mix" all the different paths, and we haven't come up with any good idea of how to fix this problem. If anyone can come up with a new theoretical abstraction to generalize this analysis to full transformers, I'd love to hear it, otherwise this project is probably not possible.

This seems like a sufficiently noteworthy negative result to merit a blog post.

manuelsh commented 2 years ago

Why don't you move the project to "abandoned/dormant" in the board? Is the board active?