microsoft
/
LMOps
General technology for enabling AI capabilities w/ LLMs and MLLMs
https://aka.ms/GeneralAI
MIT License
3.71k
stars
283
forks
issues
[ProTeGi] How to apply to text generation tasks, such as translation?
#284
Norwa9
opened
3 days ago
0
[MiniLLM] Llama weights conversion
#283
aaab8b
opened
5 days ago
1
[UPRISE] What is the use of some keys in hellaswag and copa?
#282
zhouchang123
opened
1 week ago
2
[MiniLLM] Using the qwen2-72b model as the teacher model for MiniLLM training results in out of memory
#281
shhn1
opened
1 week ago
0
Exception: Current loss scale already at minimum - cannot decrease scale anymore
#280
Z-eloto
opened
1 week ago
2
What's the difference between ProTeGi and OPRO?
#279
chansonzhang
opened
2 weeks ago
0
What's the difference between ProTeGi and DSPy?
#278
chansonzhang
opened
2 weeks ago
0
Where Is Code For This Paper: "Improving Domain Adaptation through Extended-Text Reading Comprehension"
#277
EddieJ03
opened
2 weeks ago
0
[LLM_RETRIEVER] Some questions about the llm_retriever paper.
#276
zhouchang123
opened
1 month ago
2
Details about seqkd
#275
atiehsharifi
opened
1 month ago
2
image-text model finetune
#274
dianshuoli
opened
1 month ago
0
Data Selection
#273
t1101675
closed
1 month ago
0
Bug in Logit Processor?
#272
SaeedNajafi
opened
1 month ago
1
update MiniLLM download links
#271
t1101675
closed
1 month ago
0
[MiniLLM] Resources unavailable, 404 error.
#270
Jahb
opened
1 month ago
2
Bump gitpython from 3.1.32 to 3.1.41 in /dpkd/transformers/examples/research_projects/distillation
#269
dependabot[bot]
opened
2 months ago
0
[LLM_RETRIEVER] What Python version does llm_retriever use?
#268
zhouchang123
closed
1 month ago
0
[Question] sampler.py in case of args.teacher_mixed_alpha
#267
kykim0
closed
2 months ago
2
[MiniLLM] Cannot see the difference between the kd and seqkd losses
#266
lean-wang
closed
2 months ago
7
[MiniLLM] mismatch between formula and implementation (gradL)_long? [old_logprobs not found in paper]
#265
lancerts
closed
2 months ago
2
[MiniLLM] mismatch between formula and implementation (gradL)_long?
#264
lancerts
closed
2 months ago
2
[MiniLLM] The "processed_data.tar" data link is invalid.
#263
shhn1
closed
2 months ago
2
[UPRISE] After rereading the paper "UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation", I have some questions.
#262
zhouchang123
opened
2 months ago
13
Enhancing Command Generation with AI Integration and Error Handling: A Complete Guide to…
#261
RahulVadisetty91
opened
2 months ago
1
[llm_retriever] Something seems wrong in inference/gen_llm_score.py
#260
zhouchang123
closed
2 months ago
1
Bump cryptography from 41.0.2 to 43.0.1 in /minillm/transformers/examples/research_projects/decision_transformer
#259
dependabot[bot]
opened
2 months ago
0
Unable to find training code for AdaptLLM
#258
chowkamlee81
closed
1 month ago
1
[Instruction-pretrain] Questions regarding the format of general instructions mixed in the pretraining corpus
#257
Vispstar-V
closed
2 months ago
3
Bump opencv-python from 4.4.0.42 to 4.8.1.78 in /minillm/transformers/examples/research_projects/visual_bert
#256
dependabot[bot]
opened
2 months ago
0
[MiniLLM] mismatch between formula and implementation of the single-step loss?
#255
YifanHao
closed
2 months ago
2
Bump nltk from 3.7 to 3.9 in /minillm/transformers/examples/research_projects/decision_transformer
#254
dependabot[bot]
opened
3 months ago
0
About pre-training data of distilled qwen model
#253
Jim2016713
closed
2 months ago
1
Bump aiohttp from 3.8.5 to 3.10.2 in /minillm/transformers/examples/research_projects/decision_transformer
#252
dependabot[bot]
opened
3 months ago
0
Bump keras from 2.8.0 to 2.13.1 in /minillm/transformers/examples/research_projects/decision_transformer
#251
dependabot[bot]
opened
3 months ago
0
[MiniLLM] teacher generated responses `gen_answer` not used in seqKD
#250
hieuchi911
closed
2 months ago
2
Bump tensorflow from 2.8.1 to 2.12.1 in /minillm/transformers/examples/research_projects/decision_transformer
#249
dependabot[bot]
opened
3 months ago
0
Bump grpcio from 1.44.0 to 1.53.2 in /minillm/transformers/examples/research_projects/decision_transformer
#248
dependabot[bot]
opened
3 months ago
0
[uprise] reproducing the experiments in the original paper is time-consuming
#247
Cheungki
closed
3 months ago
2
Bump torch from 1.11.0 to 2.2.0 in /minillm/transformers/examples/research_projects/decision_transformer
#246
dependabot[bot]
opened
4 months ago
0
[UPRISE] When using an LLM to score the prompt, the connection stopped.
#245
zhouchang123
closed
1 week ago
4
[MiniLLM] Distilling the llama3-instruct version
#244
Yjonben
opened
4 months ago
1
Bump zipp from 3.7.0 to 3.19.1 in /minillm/transformers/examples/research_projects/decision_transformer
#243
dependabot[bot]
opened
4 months ago
0
Bump certifi from 2023.7.22 to 2024.7.4 in /minillm/transformers/examples/research_projects/decision_transformer
#242
dependabot[bot]
opened
4 months ago
0
Bump transformers from 4.26.1 to 4.38.0 in /dpkd/transformers/examples/tensorflow/language-modeling-tpu
#241
dependabot[bot]
opened
4 months ago
0
Bump tqdm from 4.48.2 to 4.66.3 in /dpkd/transformers/examples/research_projects/visual_bert
#240
dependabot[bot]
opened
4 months ago
0
Bump idna from 2.8 to 3.7 in /dpkd/transformers/examples/research_projects/visual_bert
#239
dependabot[bot]
opened
4 months ago
0
[UPRISE] A problem occurred when training UPRISE.
#238
zhouchang123
closed
4 months ago
17
[UPRISE] CUDA out of memory. Tried to allocate 3.25 GiB. GPU
#237
zhouchang123
closed
4 months ago
4
CUDA out of memory during training
#236
Yjonben
closed
1 month ago
10
A problem occurs when executing the first procedure.
#235
zhouchang123
closed
4 months ago
13