ServiceNow / Fast-LLM

Accelerating your LLM training to full speed
https://servicenow.github.io/Fast-LLM/

llama3 rope #55

Open RaymondLi0 opened 1 day ago

RaymondLi0 commented 1 day ago

✨ Description

Closes #39
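For context on what "llama3 rope" refers to: Llama 3.1 rescales the RoPE inverse frequencies so that long-wavelength components are stretched for longer contexts while short-wavelength components are left alone. The sketch below is a minimal numpy rendition of that rescaling; the parameter names (`factor`, `low_freq_factor`, `high_freq_factor`, `old_context_len`) follow the HF transformers `rope_scaling` config and the default values shown are assumptions for illustration, not necessarily what this PR uses.

```python
import numpy as np

def llama3_scaled_inv_freq(inv_freq, factor=8.0, low_freq_factor=1.0,
                           high_freq_factor=4.0, old_context_len=8192):
    """Rescale RoPE inverse frequencies Llama-3.1 style.

    Long-wavelength (low-frequency) components are divided by `factor`,
    short-wavelength components are kept unchanged, and the band in
    between is smoothly interpolated.
    """
    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor
    wavelen = 2.0 * np.pi / inv_freq
    # Interpolation coefficient for the mid band (0 at the low-freq
    # boundary, 1 at the high-freq boundary).
    smooth = (old_context_len / wavelen - low_freq_factor) / (
        high_freq_factor - low_freq_factor
    )
    return np.where(
        wavelen > low_freq_wavelen,        # long wavelengths: scale down
        inv_freq / factor,
        np.where(
            wavelen < high_freq_wavelen,   # short wavelengths: unchanged
            inv_freq,
            (1 - smooth) * inv_freq / factor + smooth * inv_freq,
        ),
    )

# Hypothetical head dim and base, for illustration only.
dim, base = 128, 500000.0
inv_freq = 1.0 / base ** (np.arange(0, dim, 2) / dim)
scaled = llama3_scaled_inv_freq(inv_freq)
```

The highest-frequency component comes out unchanged and the lowest-frequency one is divided by `factor`, with a smooth transition in between.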

๐Ÿ” Type of change

Select all that apply:

๐Ÿ“ Changes

✅ Checklist

Make sure the following tasks are completed before submitting the PR:

General

Dependencies and Configuration

Testing

Performance Impact

📊 Performance Impact Details

If there is any impact on performance, describe it and provide benchmark results, if applicable:


๐Ÿ—’๏ธ Additional Notes

TODOs:

tscholak commented 1 day ago

Hi @RaymondLi0! Functionally this looks like what we want (pending model conversion), but are you confident (i.e. have you checked) that the forward and backward passes of hf-llama and fast-llm-llama are the same?

RaymondLi0 commented 1 day ago

> Hi @RaymondLi0! Functionally this looks like what we want (pending model conversion), but are you confident (i.e. have you checked) that the forward and backward passes of hf-llama and fast-llm-llama are the same?

Haven't done that check. Do we have existing tests comparing the forward/backward passes of fast-llm and hf-transformers? If not, I can look into adding one.

jlamypoirier commented 1 day ago

> Haven't done that check. Do we have existing tests comparing the forward/backward passes of fast-llm and hf-transformers? If not, I can look into adding one.

There is one in `test_checkpoint`: https://github.com/ServiceNow/Fast-LLM/blob/main/tests/test_checkpoint.py#L31. It could work for this case if we added a llama3 model to the testing suite (in `common.py`). It would also be a good test of conversion, etc.
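The shape of the equivalence check being discussed can be sketched without either framework: run two implementations forward on the same input and compare outputs, then compare gradients. The sketch below uses stand-in numpy callables and a finite-difference gradient; the real test would compare the actual hf-llama and fast-llm-llama modules with their autograd gradients (e.g. via `torch.testing.assert_close`), so everything here is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for two implementations that should agree numerically.
W = rng.standard_normal((8, 8))

def forward_a(x):
    return x @ W

def forward_b(x):
    # A reordered but mathematically identical computation.
    return (W.T @ x.T).T

def numerical_grad(f, x, eps=1e-6):
    """Central finite-difference gradient of sum(f(x)) w.r.t. x."""
    g = np.zeros_like(x)
    for i in np.ndindex(x.shape):
        xp, xm = x.copy(), x.copy()
        xp[i] += eps
        xm[i] -= eps
        g[i] = (f(xp).sum() - f(xm).sum()) / (2 * eps)
    return g

x = rng.standard_normal((4, 8))
# Forward equivalence.
np.testing.assert_allclose(forward_a(x), forward_b(x), rtol=1e-6, atol=1e-6)
# Backward equivalence (finite differences here; a real test would compare
# the two frameworks' autograd gradients instead).
np.testing.assert_allclose(numerical_grad(forward_a, x),
                           numerical_grad(forward_b, x),
                           rtol=1e-4, atol=1e-6)
```

In practice the tolerances matter: mixed-precision kernels in the two stacks will not match bit-for-bit, so the test has to pick rtol/atol values that catch real divergence without flagging expected numerical noise.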