Open chris-ha458 opened 1 year ago
Also, the formatter I use changes the layout a lot, so I had to manually modify the code. What formatter does this repo use?
@CodiumAI-Agent /review
Main theme: Adding an argument to directly set ff_inner_dim
Type of PR: Enhancement
Relevant tests added: No
Focused PR: Yes, the PR is focused as it has a clear and coherent title and description, and all PR code diff changes are properly derived from the title and description.
Security concerns: No, the changes made in this PR do not introduce any obvious security concerns.
General PR suggestions: The PR is generally well-written and the changes are clear. However, it would be beneficial to include tests to ensure the new functionality works as expected. Additionally, it would be helpful to update the function's docstring to include the new parameter.
Tag me in a comment '@CodiumAI-Agent' and add one of the following commands:
- /review - Request a review of the latest update to the PR.
- /describe - Modify the PR title and description based on the contents of the PR.
- /improve - Suggest improvements to the code in the PR. These will be provided as pull request comments, ready to commit.
- /ask - Pose a question about the PR.
In NVIDIA's nvidia/GPT-2B-001, a very PaLM-like model is implemented. However, instead of an FFN multiplier like `ffn_mult`, the `ffn_hidden_size` (comparable to `ffn_inner_dim` in this codebase) is set directly to 5440. With the model's hidden size of 2048, this translates to an `ffn_mult` of 2.65625. However, trying this in this codebase does not work. The error:

So I implemented a way to directly set `ffn_inner_dim`. Please take a look!
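The idea behind the change can be sketched as follows. This is an illustrative sketch, not the repo's actual code; the function and parameter names (`resolve_ff_inner_dim`, `dim`, `ff_mult`, `ff_inner_dim`) are assumptions chosen to mirror the PR description, where an explicitly supplied inner dimension overrides the multiplier-based default.

```python
from typing import Optional

def resolve_ff_inner_dim(dim: int,
                         ff_mult: float = 4.0,
                         ff_inner_dim: Optional[int] = None) -> int:
    """Return the feedforward hidden size.

    If ff_inner_dim is given, use it directly (overriding ff_mult);
    otherwise derive the hidden size from the multiplier as before.
    """
    if ff_inner_dim is not None:
        return ff_inner_dim
    return int(dim * ff_mult)

# GPT-2B-001 style: hidden size 2048 with ffn_hidden_size set directly
# to 5440, an effective multiplier of 5440 / 2048 = 2.65625
print(resolve_ff_inner_dim(2048, ff_inner_dim=5440))  # 5440
print(resolve_ff_inner_dim(2048, ff_mult=2.65625))    # also 5440
```

Passing the fractional multiplier works too, but setting the inner dimension directly avoids round-off concerns and matches how configs like GPT-2B-001 specify the value.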