microsoft / deep-language-networks

We view Large Language Models as stochastic language layers in a network, where the learnable parameters are the natural language prompts at each layer. We stack two such layers, feeding the output of one layer to the next. We call the stacked architecture a Deep Language Network (DLN).

question on setup self-hosted models (vLLM) #44

Closed: JingtongSu closed this issue 8 months ago

JingtongSu commented 8 months ago

Hi, thanks for your great work!

I'm new to DLN, and I'm trying to use LLaMa2 models in place of the GPT series reported in the paper as the forward/backward models. I followed the setup procedure in the README, namely exporting the Hugging Face path to the LLaMa2 tokenizer, and then ran the script for two-layer joint end-to-end training. Unfortunately, I got `openai.error.APIConnectionError: Error communicating with OpenAI` and the training procedure aborted.

I investigated the code a bit and found that the `generate` function of the `VLLM` class queries the `openai.Completion.acreate` API. Is that the expected behavior when switching to self-hosted models? I expected a dependency on vLLM itself, but I did not find one.

Thanks in advance if you could help me with this :)!

JingtongSu commented 8 months ago

The scripts in this repo require manually starting a vLLM OpenAI-compatible server beforehand! I will close this issue.
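For anyone hitting the same `APIConnectionError`, here is a minimal sketch of what worked for me. The model name and port are placeholders; substitute whatever checkpoint and port you actually serve. The server is launched separately, then verified with the legacy `openai<1.0` SDK that DLN uses:

```python
# Start the vLLM OpenAI-compatible server first, in a separate shell, e.g.:
#
#   python -m vllm.entrypoints.openai.api_server \
#       --model meta-llama/Llama-2-7b-hf --port 8000
#
# (model and port are placeholders). Then confirm the server answers:
import openai

openai.api_key = "EMPTY"                      # the vLLM server does not check keys
openai.api_base = "http://localhost:8000/v1"  # base URL of the local server

models = openai.Model.list()                  # hits the server's /v1/models route
print([m["id"] for m in models["data"]])      # should list the served checkpoint
```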

matheper commented 8 months ago

Hi @JingtongSu,

You are right. DLN does not manage the serving of models. The instructions describe how to connect to an LLM that is already hosted using vLLM. I apologize for any confusion; we are updating the documentation to better reflect this.

You can follow these instructions on how to set up an OpenAI-compatible server using vLLM.

Once your vLLM server is operational, you can refer to it as detailed in the DLN README.
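As a quick sanity check that your server is reachable before starting training, you can mirror the code path DLN's `VLLM` class goes through, i.e. the legacy SDK's async `openai.Completion.acreate` call. This is a sketch, assuming the server from the instructions above runs at `localhost:8000` and serves a LLaMa2 checkpoint (both are assumptions, not DLN defaults):

```python
# Minimal async check mirroring DLN's use of the legacy openai<1.0 SDK.
# Model name and port below are placeholders for your own setup.
import asyncio
import openai

openai.api_key = "EMPTY"                      # vLLM ignores the key
openai.api_base = "http://localhost:8000/v1"  # your OpenAI-compatible endpoint

async def main():
    completion = await openai.Completion.acreate(
        model="meta-llama/Llama-2-7b-hf",
        prompt="The capital of France is",
        max_tokens=8,
    )
    print(completion.choices[0].text)

asyncio.run(main())
```

If no server is listening at `api_base`, this raises the same `openai.error.APIConnectionError` reported above.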

JingtongSu commented 8 months ago

Hi @matheper, thanks for your quick reply! I had already figured that out and got my own DLN running. Thanks again for taking the time to make the code public :)