-
While running the updated code, I encountered an issue as follows:
![1](https://github.com/kyegomez/BitNet/assets/103629060/2a5f1cb9-3d8c-471a-bd7c-04eb0df6ba8b)
I would greatly appreciate any gui…
-
NVIDIA-SMI 470.129.06 Driver Version: 470.129.06 CUDA Version: 12.3
python 3.10.14
ubuntu 22.04
pip install typing_extensions-4.11.0-py3-none-any.whl
pip install bitblas-0.0.1.dev5-py3-no…
-
**Describe the bug**
When I replace the Linear layers of an HF model (say Llama2-7b-chat) with BitLinear layers, the model size stays the same. Shouldn't the size be reduced after swapping in BitLinear layers?
##…
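A likely explanation is that a training-oriented BitLinear keeps full-precision master weights and only quantizes them on the fly inside `forward()`, so swapping layers by itself does not shrink the stored model. Below is a minimal sketch of the swap-and-measure check; `from bitnet import BitLinear` and an `nn.Linear`-like constructor are assumptions on my part and may differ from the installed version.

```python
# Sketch (assumptions: BitLinear import path and an nn.Linear-like constructor).
# BitLinear is assumed to keep fp16/fp32 weights and quantize them in forward(),
# which is why the parameter footprint does not shrink after the swap.
import torch.nn as nn
from transformers import AutoModelForCausalLM
from bitnet import BitLinear  # assumed import path


def replace_linear_with_bitlinear(module: nn.Module) -> None:
    """Recursively swap nn.Linear submodules for BitLinear (hypothetical helper)."""
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            # Assumed signature: BitLinear(in_features, out_features, bias=...)
            setattr(
                module,
                name,
                BitLinear(child.in_features, child.out_features, bias=child.bias is not None),
            )
        else:
            replace_linear_with_bitlinear(child)


model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
replace_linear_with_bitlinear(model)

# The parameter count (and hence checkpoint size) stays roughly the same, because
# the weights are still stored as full-precision tensors.
num_params = sum(p.numel() for p in model.parameters())
print(f"parameters after swap: {num_params / 1e9:.2f}B")
```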
-
**Describe the bug**
readme.md contains a link to https://drive.google.com/file/d/1gBuZRFBqMV3cVD902LXA_hmZl4e0dLyY/view which reports "Sorry, the file you have requested does not exist."
**To Rep…
-
Thank you for your innovative work. Could you provide a distributed training example?
That would make it quick to reproduce and verify the paper's results.
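Until an official example is provided, here is a minimal sketch of what one could look like with PyTorch DistributedDataParallel. `BitNetTransformer` and its constructor arguments are assumptions based on the README and may not match your installed version; the dummy batch stands in for a real DataLoader with a DistributedSampler.

```python
# Minimal DDP sketch (assumptions: BitNetTransformer export and its arguments;
# the model is assumed to return per-token logits of shape [batch, seq, vocab]).
import torch
import torch.distributed as dist
import torch.nn.functional as F
from torch.nn.parallel import DistributedDataParallel as DDP
from bitnet import BitNetTransformer  # assumed export


def main():
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    device = rank % torch.cuda.device_count()
    torch.cuda.set_device(device)

    model = BitNetTransformer(num_tokens=20000, dim=512, depth=6).to(device)
    model = DDP(model, device_ids=[device])
    optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

    for step in range(100):
        # Dummy batch of token ids; replace with a real DataLoader + DistributedSampler.
        tokens = torch.randint(0, 20000, (8, 256), device=device)
        logits = model(tokens)
        loss = F.cross_entropy(
            logits[:, :-1].reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1)
        )
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if rank == 0 and step % 10 == 0:
            print(f"step {step}: loss {loss.item():.4f}")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()

# Launch with: torchrun --nproc_per_node=NUM_GPUS train_ddp.py
```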
## Upvote & Fund
- We're using [Polar.sh](https://polar.sh/…
-
Thank you for sharing this incredible work!
I suspect it is a library-version issue; in any case, I'm receiving the following error when attempting to run the unmodified train.py:
`RuntimeError: The…
-
The `self.ff` Sequential can contain `None`, which is not callable, when `post_act_ln` is False.
[suggestion] Build the layer list conditionally instead (a fuller, self-contained sketch follows the snippet below):
ff_layers = [project_in]
if post_act_ln:
ff_layers.append(…
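A self-contained sketch of the suggested fix; this is my own minimal FeedForward, not the repo's class, so the layer shapes and defaults are assumptions:

```python
# Build the layer list conditionally so nn.Sequential never receives None,
# which would raise "TypeError: 'NoneType' object is not callable" in forward.
import torch
import torch.nn as nn


class FeedForward(nn.Module):
    def __init__(self, dim: int, mult: int = 4, post_act_ln: bool = False, dropout: float = 0.0):
        super().__init__()
        inner_dim = dim * mult
        project_in = nn.Sequential(nn.Linear(dim, inner_dim), nn.GELU())

        ff_layers = [project_in]
        if post_act_ln:
            # Only append the LayerNorm when it is actually requested.
            ff_layers.append(nn.LayerNorm(inner_dim))
        ff_layers.extend([nn.Dropout(dropout), nn.Linear(inner_dim, dim)])
        self.ff = nn.Sequential(*ff_layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.ff(x)


# Quick check: works with post_act_ln both False and True.
x = torch.randn(2, 8, 64)
print(FeedForward(64, post_act_ln=False)(x).shape)
print(FeedForward(64, post_act_ln=True)(x).shape)
```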
-
In bit_transformer.py:
```python
class Transformer(nn.Module):
def forward(self, x: Tensor, *args, **kwargs) -> Tensor:
for attn, ffn in zip(self.layers, self.ffn_layers):
            …
```
-
Thanks for your quick implementation! I was reading through `bitnet/bitbnet_b158.py` and just had a short question.
In your implementation of `quantize_weights` you use the same procedure as outli…
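For reference, the weight-quantization procedure outlined in the BitNet b1.58 paper is the absmean scheme below. This is my paraphrase of the paper's formula, not necessarily identical to `quantize_weights` in `bitnet/bitbnet_b158.py`:

```python
# Absmean weight quantization as described in the BitNet b1.58 paper:
#   gamma = mean(|W|);  W_q = clip(round(W / (gamma + eps)), -1, 1)
# Reference sketch only, not a copy of the repo's quantize_weights.
import torch


def absmean_quantize(w: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    gamma = w.abs().mean()                             # per-tensor scale
    return (w / (gamma + eps)).round().clamp_(-1, 1)   # ternary values {-1, 0, +1}


# Example: quantize a random weight matrix and inspect the value set.
w = torch.randn(4, 4)
print(absmean_quantize(w))
```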
-
**Describe the bug**
A clear and concise description of what the bug is and what the main root cause error is. Test very thoroughly before submitting.
**To Reproduce**
Steps to reproduce the beha…