microsoft LMOps issues - Githubissues

microsoft / LMOps

General technology for enabling AI capabilities w/ LLMs and MLLMs

https://aka.ms/GeneralAI

MIT License

3.71k stars 283 forks

issues

[ProTeGi] How to apply it to text generation tasks such as translation?

#284 Norwa9 opened 3 days ago
0
[MiniLLM] Llama weights conversion

#283 aaab8b opened 5 days ago
1
[UPRISE] What is the use of some keys in hellaswag and copa?

#282 zhouchang123 opened 1 week ago
2
[MiniLLM] Using the qwen2-72b model as the teacher model for MiniLLM training results in out-of-memory errors

#281 shhn1 opened 1 week ago
0
Exception: Current loss scale already at minimum - cannot decrease scale anymore

#280 Z-eloto opened 1 week ago
2
What's the difference between ProTeGi and OPRO?

#279 chansonzhang opened 2 weeks ago
0
What's the difference between ProTeGi and DSPy?

#278 chansonzhang opened 2 weeks ago
0
Where Is Code For This Paper: "Improving Domain Adaptation through Extended-Text Reading Comprehension"

#277 EddieJ03 opened 2 weeks ago
0
[LLM_RETRIEVER] Some questions about the llm_retriever paper

#276 zhouchang123 opened 1 month ago
2
Details about seqkd

#275 atiehsharifi opened 1 month ago
2
image-text model finetune

#274 dianshuoli opened 1 month ago
0
Data Selection

#273 t1101675 closed 1 month ago
0
Bug in Logit Processor?

#272 SaeedNajafi opened 1 month ago
1
update MiniLLM download links

#271 t1101675 closed 1 month ago
0
[MiniLLM] Resources unavailable, 404 error.

#270 Jahb opened 1 month ago
2
Bump gitpython from 3.1.32 to 3.1.41 in /dpkd/transformers/examples/research_projects/distillation

#269 dependabot[bot] opened 2 months ago
0
[LLM_RETRIEVER] Which Python version does llm_retriever use?

#268 zhouchang123 closed 1 month ago
0
[Question] sampler.py in case of args.teacher_mixed_alpha

#267 kykim0 closed 2 months ago
2
[MiniLLM] Cannot see the difference between the kd and seqkd losses

#266 lean-wang closed 2 months ago
7
[MiniLLM] mismatch between formula and implementation (gradL)_long? [old_logprobs not found in paper]

#265 lancerts closed 2 months ago
2
[MiniLLM] mismatch between formula and implementation (gradL)_long?

#264 lancerts closed 2 months ago
2
[MiniLLM] The "processed_data.tar" data link is invalid.

#263 shhn1 closed 2 months ago
2
[UPRISE] After rereading the paper "UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation", I have some questions.

#262 zhouchang123 opened 2 months ago
13
Enhancing Command Generation with AI Integration and Error Handling: A Complete Guide to…

#261 RahulVadisetty91 opened 2 months ago
1
[llm_retriever] Something seems wrong in inference/gen_llm_score.py

#260 zhouchang123 closed 2 months ago
1
Bump cryptography from 41.0.2 to 43.0.1 in /minillm/transformers/examples/research_projects/decision_transformer

#259 dependabot[bot] opened 2 months ago
0
Unable to find training code for AdaptLLM

#258 chowkamlee81 closed 1 month ago
1
[Instruction-pretrain] Questions regarding the format of general instructions mixed in the pretraining corpus

#257 Vispstar-V closed 2 months ago
3
Bump opencv-python from 4.4.0.42 to 4.8.1.78 in /minillm/transformers/examples/research_projects/visual_bert

#256 dependabot[bot] opened 2 months ago
0
[MiniLLM] mismatch between formula and implementation of the single-step loss?

#255 YifanHao closed 2 months ago
2
Bump nltk from 3.7 to 3.9 in /minillm/transformers/examples/research_projects/decision_transformer

#254 dependabot[bot] opened 3 months ago
0
About pre-training data of distilled qwen model

#253 Jim2016713 closed 2 months ago
1
Bump aiohttp from 3.8.5 to 3.10.2 in /minillm/transformers/examples/research_projects/decision_transformer

#252 dependabot[bot] opened 3 months ago
0
Bump keras from 2.8.0 to 2.13.1 in /minillm/transformers/examples/research_projects/decision_transformer

#251 dependabot[bot] opened 3 months ago
0
[MiniLLM] teacher generated responses `gen_answer` not used in seqKD

#250 hieuchi911 closed 2 months ago
2
Bump tensorflow from 2.8.1 to 2.12.1 in /minillm/transformers/examples/research_projects/decision_transformer

#249 dependabot[bot] opened 3 months ago
0
Bump grpcio from 1.44.0 to 1.53.2 in /minillm/transformers/examples/research_projects/decision_transformer

#248 dependabot[bot] opened 3 months ago
0
[uprise] reproducing the experiments in the original paper is time-consuming

#247 Cheungki closed 3 months ago
2
Bump torch from 1.11.0 to 2.2.0 in /minillm/transformers/examples/research_projects/decision_transformer

#246 dependabot[bot] opened 4 months ago
0
[UPRISE] When using an LLM to score the prompt, the connection stopped.

#245 zhouchang123 closed 1 week ago
4
[MiniLLM] Distilling the llama3-instruct version

#244 Yjonben opened 4 months ago
1
Bump zipp from 3.7.0 to 3.19.1 in /minillm/transformers/examples/research_projects/decision_transformer

#243 dependabot[bot] opened 4 months ago
0
Bump certifi from 2023.7.22 to 2024.7.4 in /minillm/transformers/examples/research_projects/decision_transformer

#242 dependabot[bot] opened 4 months ago
0
Bump transformers from 4.26.1 to 4.38.0 in /dpkd/transformers/examples/tensorflow/language-modeling-tpu

#241 dependabot[bot] opened 4 months ago
0
Bump tqdm from 4.48.2 to 4.66.3 in /dpkd/transformers/examples/research_projects/visual_bert

#240 dependabot[bot] opened 4 months ago
0
Bump idna from 2.8 to 3.7 in /dpkd/transformers/examples/research_projects/visual_bert

#239 dependabot[bot] opened 4 months ago
0
[UPRISE] A problem occurred when training UPRISE.

#238 zhouchang123 closed 4 months ago
17
[UPRISE] CUDA out of memory. Tried to allocate 3.25 GiB. GPU

#237 zhouchang123 closed 4 months ago
4
CUDA out of memory during training

#236 Yjonben closed 1 month ago
10
A problem occurs when executing the first procedure.

#235 zhouchang123 closed 4 months ago
13