-
Did anyone get this to work?
I tested half a dozen models, and none of them actually worked.
## Upvote & Fund
- We're using [Polar.sh](https://polar.sh/kyegomez) so you can upvote and help fund t…
-
While running the updated code, I encountered an issue as follows:
![1](https://github.com/kyegomez/BitNet/assets/103629060/2a5f1cb9-3d8c-471a-bd7c-04eb0df6ba8b)
I would greatly appreciate any gui…
-
NVIDIA-SMI 470.129.06 Driver Version: 470.129.06 CUDA Version: 12.3
python 3.10.14
ubuntu 22.04
pip install typing_extensions-4.11.0-py3-none-any.whl
pip install bitblas-0.0.1.dev5-py3-no…
-
**Describe the bug**
When I replace the linear layers of an HF model (say Llama2-7b-chat) with BitLinear layers, the model size stays the same. Shouldn't the size be reduced after replacing them with BitLinear?
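One possible explanation, offered as an assumption rather than a statement about this repository's implementation: if a replacement layer still stores its weights in the original floating-point dtype and only quantizes them during the forward pass, the saved model stays the same size. A reduction would show up only once the weights are actually stored in a packed low-bit format. The arithmetic below sketches the difference; `storage_bytes` and the round 7B parameter count are illustrative and not taken from the library.

```python
def storage_bytes(num_params: int, bits_per_weight: float) -> float:
    """Bytes needed to store `num_params` weights at the given bit width."""
    return num_params * bits_per_weight / 8


# Hypothetical round figure for a 7B-parameter model; not a measured value.
NUM_PARAMS = 7_000_000_000

# Weights kept in half precision (assumed to be what is stored on disk).
fp16_bytes = storage_bytes(NUM_PARAMS, 16)

# Ternary weights packed into 2 bits each (an assumed packing scheme).
packed_2bit_bytes = storage_bytes(NUM_PARAMS, 2)

print(f"fp16 weights: {fp16_bytes / 1e9:.2f} GB")
print(f"2-bit packed: {packed_2bit_bytes / 1e9:.2f} GB")
```

Comparing the dtype of the stored weight tensors before and after the replacement is a quick way to check which of the two cases applies.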
##…
-
**Describe the bug**
readme.md contains a link to https://drive.google.com/file/d/1gBuZRFBqMV3cVD902LXA_hmZl4e0dLyY/view which reports "Sorry, the file you have requested does not exist."
**To Rep…
-
Thank you for your innovative work. Could you provide a distributed training example?
Then we could quickly reproduce and verify the work.
-
The `self.ff` sequential container can end up holding `None`, which is not callable, when `post_act_ln` is False.
[suggestion]
ff_layers = [project_in]
if post_act_ln:
ff_layers.append(…
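The suggestion above is cut off, so the following is only a minimal sketch of the general pattern it points at: build the layer list step by step and append optional layers only when they are enabled, so no `None` ever reaches the container. It uses a small torch-free stand-in (`Sequential`) so it runs anywhere. The names `build_ff`, `norm`, and `project_out` are hypothetical and are not the repository's actual code; with PyTorch, the same list would be passed as `nn.Sequential(*ff_layers)`.

```python
from typing import Callable, List, Optional

Layer = Callable[[float], float]


class Sequential:
    """Tiny stand-in for a sequential container: applies layers in order."""

    def __init__(self, *layers: Layer) -> None:
        self.layers = layers

    def __call__(self, x: float) -> float:
        for layer in self.layers:
            x = layer(x)  # would raise TypeError if a layer were None
        return x


def build_ff(
    project_in: Layer,
    project_out: Layer,
    norm: Optional[Layer] = None,
    post_act_ln: bool = False,
) -> Sequential:
    """Assemble the feed-forward stack, skipping the optional norm layer."""
    ff_layers: List[Layer] = [project_in]
    if post_act_ln and norm is not None:
        ff_layers.append(norm)  # only added when enabled, never as None
    ff_layers.append(project_out)
    return Sequential(*ff_layers)


# Toy callables stand in for real layers.
ff = build_ff(lambda x: x * 2, lambda x: x + 1)
ff_with_norm = build_ff(
    lambda x: x * 2, lambda x: x + 1, norm=lambda x: x / 2, post_act_ln=True
)
print(ff(3.0), ff_with_norm(3.0))
```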
-
Thank you for sharing this incredible work!
I speculate that it's an issue of library versions, but I'm receiving the following error when attempting to run unmodified train.py:
`RuntimeError: The…
-
In bit_transformer.py:
```python
class Transformer(nn.Module):
    def forward(self, x: Tensor, *args, **kwargs) -> Tensor:
        for attn, ffn in zip(self.layers, self.ffn_layers):
            …
```
-
**Describe the bug**
After 5300 iterations, with the loss near 2.7, is it still supposed to output near-gibberish?
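As a rough way to read that number, assuming the reported loss is the mean cross-entropy per token in nats (the convention used by PyTorch's cross-entropy loss; the issue does not say which loss is logged), it can be converted to perplexity and bits per token. The helper names below are illustrative.

```python
import math


def perplexity(loss_nats: float) -> float:
    """Perplexity for a mean cross-entropy loss measured in nats."""
    return math.exp(loss_nats)


def bits_per_token(loss_nats: float) -> float:
    """Convert a loss in nats per token to bits per token."""
    return loss_nats / math.log(2)


loss = 2.7  # value reported in the issue
print(f"perplexity:     {perplexity(loss):.2f}")
print(f"bits per token: {bits_per_token(loss):.2f}")
```

Under that assumption, the model is on average about as uncertain as a uniform choice among roughly 15 tokens at each step, which is consistent with output that is not yet coherent.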
**To Reproduce**
Running on CPU on a MacBook Air M2, omitting the `model.cuda()` line
*…