-
A sample agent with the following configuration fails with the following error when its /run endpoint is called from localhost:8000/docs.
```
return Agent(
    name="Gemini Agent",
    agent_id="Ge…
```
-
Hi! I'm new to all of this. Can someone point me to how to fine-tune the v3 model? I saw #99 on how to structure the dataset https://github.com/MeetKai/functionary/blob/main/tests/test_case_v2.jso…
sjay8 updated
2 months ago
-
I love that I can load extensive public domain resources directly from the internet into the sessions and add hundreds of thousands of data points. I can then run knowledge graph optimizations, as wel…
-
Hi
I'm saving the chat history to a PostgreSQL database through the data layer, but when I resume a chat the history is not loaded, even though the chat title shows up in the sidebar and chat is a…
-
Whatever question I ask, the model keeps repeating its own answer until it reaches the model's max_token limit.
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
```
-
To force the model to respond in JSON format, I am using the ExLlamaV2TokenEnforcerFilter and ExLlamaV2PrefixFilter classes, appending them to the filters list, and passing that list as filters for generating ou…
-
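With LLaMA Factory, SFT of the Llama3-8B model works with DeepSpeed ZeRO-2, but this framework runs out of memory (OOM) with DeepSpeed ZeRO-2 even when the batch size is set to 1.
Training with ZeRO-3 becomes very slow, and the following message appears: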
2 pytorch allocator cache flushes since last step. this happens when there is hi…
-
shareAI series:
Base pre-training + direct Chinese SFT version:
V2 version
modelscope: https://modelscope.cn/models/baicai003/Llama3-Chinese_v2/summary
-
### Feature Name
MiniCPM-v2.5
### Feature Description
Research about MiniCPM-v2.5
### Research Findings
MiniCPM-v2.5 is a Chinese language model developed by the Beijing Academy of Artificial Int…