kyegomez BitNet issues - Githubissues

kyegomez / BitNet

Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch

https://discord.gg/qUtxnK2NMf

MIT License

1.69k stars 155 forks

issues

Newest

Bump pypa/gh-action-pypi-publish from 1.9.0 to 1.12.2

#70 dependabot[bot] opened 1 week ago
0
Bump pypa/gh-action-pypi-publish from 1.9.0 to 1.11.0

#69 dependabot[bot] closed 1 week ago
1
I wonder what hardware conditions (GPU) the code uses, and why the loss value has been above 5.2 after running the train.py file, and the validation generated unreadable incomprehensible content.

#68 Dayun0925 opened 2 weeks ago
0
CUDA Optimization

#67 simulanics opened 4 weeks ago
0
Can it be used in Huggingface models such as stable diffusion and text to image or LLM models

#66 libai-lab opened 4 weeks ago
0
Bump pypa/gh-action-pypi-publish from 1.9.0 to 1.10.3

#65 dependabot[bot] closed 2 weeks ago
1
Bump pypa/gh-action-pypi-publish from 1.9.0 to 1.10.2

#64 dependabot[bot] closed 1 month ago
1
Question: embeddings 3bits?

#63 telamon opened 1 month ago
0
Bump pypa/gh-action-pypi-publish from 1.9.0 to 1.10.1

#62 dependabot[bot] closed 1 month ago
1
Bump pypa/gh-action-pypi-publish from 1.9.0 to 1.10.0

#61 dependabot[bot] closed 2 months ago
1
[BUG] cant compile the cuda

#60 huynhducloi00 closed 1 month ago
2
Fix: Weight quantization sign should be the last operation

#59 jmbrito01 closed 1 month ago
1
Create a javascript version that can run in a web browser?

#58 flatsiedatsie closed 2 months ago
2
NanoGPT sample

#57 izaxon closed 2 months ago
1
Bump pypa/gh-action-pypi-publish from 1.8.14 to 1.9.0

#56 dependabot[bot] closed 4 months ago
0
BitNet model performs worse than Base Transformer

#55 johanssontan closed 1 month ago
2
added if else statement to handle post_act_ln

#54 Hiromasa-H closed 6 months ago
1
[BUG] BitFeedForward(post_act_ln=False) results in a TypeError

#53 Hiromasa-H closed 6 months ago
3
Expected BitLinear weight to be 1 or -1

#52 sanjeev-bhandari closed 1 month ago
4
what is the purpose of detach here?

#51 Weitian-Wang-Bosch closed 6 months ago
2
Fix error in bitlinear algorithm

#50 Mrw33554432 closed 4 months ago
1
[BUG] Bitnet Example Bug

#49 sneilan closed 5 months ago
6
Consider techniques from official training paper

#48 EwoutH closed 5 months ago
2
fix grouping in bitlinear.py

#47 Jiangxg closed 8 months ago
0
1.58bit algorithm implement recommend

#46 princepride closed 5 months ago
2
Encountering Size Mismatch Error in Updated Code

#45 anonymousA123 closed 5 months ago
3
Bump pypa/gh-action-pypi-publish from 1.8.12 to 1.8.14

#44 dependabot[bot] closed 8 months ago
0
[BUG] NoneType in sequential module in bit_ffn

#43 jayUyang closed 6 months ago
1
[BUG] bitlinear fix

#42 jayUyang closed 5 months ago
5
is this actually working?

#41 fblgit closed 4 months ago
11
Issue with model size after replacing BitLinear layer into a HF model (say Llama2-7b-chat)[BUG]

#40 mriganktiwari closed 5 months ago
3
Revert "Jp"

#39 kyegomez closed 8 months ago
0
Google Drive Link to model weights is broken

#38 SinanAkkoyun closed 8 months ago
1
Fixed shape of beta and gamma for proper broadcasting

#37 dariocazzani closed 8 months ago
0
Requesting a Text-to-Text translation example

#36 TMammadov closed 6 months ago
1
The output of BitLinear is quite abnormal

#35 Jiangxg closed 8 months ago
6
ImportError: cannot import name 'BitLinear15bs' from 'bitnet.bitbnet_b158'[BUG]

#34 Bobby-youngking closed 8 months ago
6
Update bitlinear.py

#33 ramonpeter closed 8 months ago
3
[BUG] residual connection wrong?

#32 qianlong0502 closed 8 months ago
1
Bump pypa/gh-action-pypi-publish from 1.8.11 to 1.8.12

#31 dependabot[bot] closed 8 months ago
0
Update bitlinear.py

#30 ramonpeter closed 8 months ago
2
Jp

#29 Sunwood-ai-labs closed 8 months ago
1
fix inference bug

#28 shi3z closed 5 months ago
1
does not have support for mistral, gemma, etc and generate error [BUG] ?

#27 NickyDark1 closed 4 months ago
5
Parts of the BitLinear code doesn't match paper (before bit1.58)

#26 qqqllppp closed 6 months ago
2
Question about weight quantization methodology memory savings

#25 nnethercott closed 6 months ago
1
[BUG]multi-head attention is noop for BITLINEAR

#24 Bsdnbo closed 8 months ago
1
[BUG] Loss drops, model still produces gibberish?

#23 MichelNivard closed 5 months ago
6
where to download bitnet model ?

#22 dibu28 closed 8 months ago
4
About 'replace_hf.py'

#21 chyoob closed 4 months ago
3