-
# Dataset
1. Refactor the self-cognition dataset to support multilingual QAs.
# Megatron PreTrain
1. Support more Megatron models
2. Support dataset split
# Fine-tuning
1. RAG LLM training …
-
Is there any way to connect open-source multi-modal LLMs with it?
-
Hi Alex! Amazing work! Well done!
It seems that we're loading the ESM tokenizer and models multiple times during the inference stage for multi-chain processing. Perhaps we could optimize this to impr…
-
-
## Data source
Private data covering 4 scenarios, roughly 62,000 samples in total.
## Conclusion
(1) Under the same parameters, the multi-task model's performance decreases by ~2%.
(2) Note, however, that this…
-
[[Open issues - help wanted!]](https://github.com/vllm-project/vllm/issues/4194#issuecomment-2102487467)
**Update [9/8] - We have finished the majority of the refactoring and made extensive progress fo…
-
Hi there!
I am trying to train a 3d_fullres model, but the patch size is too small even when maximizing the memory used on one of the GPUs. Hence, I would like to try multi-GPU training, but I c…
-
-
Hi Team,
I would like to train a multi-output regression model for pattern recognition tasks. I am unable to find direct support for it after reading through the documentatio…
-
Check AttentionStore paper and see if the performance would be good or not.
- AttentionStore: Cost-effective Attention Reuse across Multi-turn Conversations in Large Language Model Serving https://…
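The core idea worth benchmarking is prefix reuse: keep the attention key/value state for a conversation's history so each new turn only computes attention for the newly appended tokens. A toy sketch of that bookkeeping (names and structure are illustrative, not the paper's actual system, which also spills caches to cheaper storage tiers):

```python
# conversation id -> cached per-token "KV" entries
kv_store: dict[str, list[str]] = {}
compute_calls = 0  # counts per-token KV computations, to show the savings

def encode(tokens: list[str]) -> list[str]:
    """Stand-in for the expensive per-token key/value computation."""
    global compute_calls
    compute_calls += len(tokens)
    return [f"kv({t})" for t in tokens]

def serve_turn(conv_id: str, history_plus_new: list[str]) -> list[str]:
    """Serve a turn, recomputing KV only for tokens past the cached prefix."""
    cached = kv_store.get(conv_id, [])
    new_tokens = history_plus_new[len(cached):]
    kv_store[conv_id] = cached + encode(new_tokens)
    return kv_store[conv_id]

serve_turn("c1", ["hi", "there"])                 # computes 2 tokens
serve_turn("c1", ["hi", "there", "how", "are"])   # computes only the 2 new ones
print(compute_calls)  # 4 computations instead of 6
```

Whether this pays off in practice depends on cache storage cost versus recomputation cost, which is exactly what the performance check should measure.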