kyegomez/BitNet
Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch
https://discord.gg/qUtxnK2NMf
MIT License
1.69k stars
155 forks
issues
Newest
Bump pypa/gh-action-pypi-publish from 1.9.0 to 1.12.2
#70
dependabot[bot]
opened
1 week ago
0
Bump pypa/gh-action-pypi-publish from 1.9.0 to 1.11.0
#69
dependabot[bot]
closed
1 week ago
1
I wonder what hardware conditions (GPU) the code uses, and why the loss value has been above 5.2 after running the train.py file, and the validation generated unreadable incomprehensible content.
#68
Dayun0925
opened
2 weeks ago
0
CUDA Optimization
#67
simulanics
opened
4 weeks ago
0
Can it be used in Huggingface models such as stable diffusion and text to image or LLM models
#66
libai-lab
opened
4 weeks ago
0
Bump pypa/gh-action-pypi-publish from 1.9.0 to 1.10.3
#65
dependabot[bot]
closed
2 weeks ago
1
Bump pypa/gh-action-pypi-publish from 1.9.0 to 1.10.2
#64
dependabot[bot]
closed
1 month ago
1
Question: embeddings 3bits?
#63
telamon
opened
1 month ago
0
Bump pypa/gh-action-pypi-publish from 1.9.0 to 1.10.1
#62
dependabot[bot]
closed
1 month ago
1
Bump pypa/gh-action-pypi-publish from 1.9.0 to 1.10.0
#61
dependabot[bot]
closed
2 months ago
1
[BUG] cant compile the cuda
#60
huynhducloi00
closed
1 month ago
2
Fix: Weight quantization sign should be the last operation
#59
jmbrito01
closed
1 month ago
1
Create a javascript version that can run in a web browser?
#58
flatsiedatsie
closed
2 months ago
2
NanoGPT sample
#57
izaxon
closed
2 months ago
1
Bump pypa/gh-action-pypi-publish from 1.8.14 to 1.9.0
#56
dependabot[bot]
closed
4 months ago
0
BitNet model performs worse than Base Transformer
#55
johanssontan
closed
1 month ago
2
added if else statement to handle post_act_ln
#54
Hiromasa-H
closed
6 months ago
1
[BUG] BitFeedForward(post_act_ln=False) results in a TypeError
#53
Hiromasa-H
closed
6 months ago
3
Expected BitLinear weight to be 1 or -1
#52
sanjeev-bhandari
closed
1 month ago
4
what is the purpose of detach here?
#51
Weitian-Wang-Bosch
closed
6 months ago
2
Fix error in bitlinear algorithm
#50
Mrw33554432
closed
4 months ago
1
[BUG] Bitnet Example Bug
#49
sneilan
closed
5 months ago
6
Consider techniques from official training paper
#48
EwoutH
closed
5 months ago
2
fix grouping in bitlinear.py
#47
Jiangxg
closed
8 months ago
0
1.58bit algorithm implement recommend
#46
princepride
closed
5 months ago
2
Encountering Size Mismatch Error in Updated Code
#45
anonymousA123
closed
5 months ago
3
Bump pypa/gh-action-pypi-publish from 1.8.12 to 1.8.14
#44
dependabot[bot]
closed
8 months ago
0
[BUG] NoneType in sequential module in bit_ffn
#43
jayUyang
closed
6 months ago
1
[BUG] bitlinear fix
#42
jayUyang
closed
5 months ago
5
is this actually working?
#41
fblgit
closed
4 months ago
11
Issue with model size after replacing BitLinear layer into a HF model (say Llama2-7b-chat)[BUG]
#40
mriganktiwari
closed
5 months ago
3
Revert "Jp"
#39
kyegomez
closed
8 months ago
0
Google Drive Link to model weights is broken
#38
SinanAkkoyun
closed
8 months ago
1
Fixed shape of beta and gamma for proper broadcasting
#37
dariocazzani
closed
8 months ago
0
Requesting a Text-to-Text translation example
#36
TMammadov
closed
6 months ago
1
The output of BitLinear is quite abnormal
#35
Jiangxg
closed
8 months ago
6
ImportError: cannot import name 'BitLinear15bs' from 'bitnet.bitbnet_b158'[BUG]
#34
Bobby-youngking
closed
8 months ago
6
Update bitlinear.py
#33
ramonpeter
closed
8 months ago
3
[BUG] residual connection wrong?
#32
qianlong0502
closed
8 months ago
1
Bump pypa/gh-action-pypi-publish from 1.8.11 to 1.8.12
#31
dependabot[bot]
closed
8 months ago
0
Update bitlinear.py
#30
ramonpeter
closed
8 months ago
2
Jp
#29
Sunwood-ai-labs
closed
8 months ago
1
fix inference bug
#28
shi3z
closed
5 months ago
1
does not have support for mistral, gemma, etc and generate error [BUG] ?
#27
NickyDark1
closed
4 months ago
5
Parts of the BitLinear code doesn't match paper (before bit1.58)
#26
qqqllppp
closed
6 months ago
2
Question about weight quantization methodology memory savings
#25
nnethercott
closed
6 months ago
1
[BUG]multi-head attention is noop for BITLINEAR
#24
Bsdnbo
closed
8 months ago
1
[BUG] Loss drops, model still produces gibberish?
#23
MichelNivard
closed
5 months ago
6
where to download bitnet model ?
#22
dibu28
closed
8 months ago
4
About 'replace_hf.py'
#21
chyoob
closed
4 months ago
3