llm-tools / embedJs

A NodeJS RAG framework to easily work with LLMs and embeddings
https://www.npmjs.com/package/@llm-tools/embedjs
Apache License 2.0

Hugging Face Model [Issue] #11

Closed poeeain closed 7 months ago

poeeain commented 7 months ago

I am trying to use a Hugging Face model as shown below, but it throws an error. I have probably set the wrong config; could you please check it? Thanks in advance.

.env file

HUGGINGFACEHUB_API_KEY={hf_key}
AZURE_OPENAI_API_KEY=
OPENAI_API_KEY=

Code :

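const { RAGApplicationBuilder, HuggingFace, TextLoader } = require("@llm-tools/embedjs");
// MemoryCache and LanceDb are used below but were never imported.
// These subpaths are an assumption based on the project README; verify them against your installed version.
const { MemoryCache } = require("@llm-tools/embedjs/cache/memory");
const { LanceDb } = require("@llm-tools/embedjs/vectorDb/lance");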

const initEmbedding = async () => {
    const ragApplication = await new RAGApplicationBuilder()
        .setModel(new HuggingFace({ modelName: 'google/gemma-7b' }))
        .setSearchResultCount(10)
        .addLoader(new TextLoader({ text: `Tesla is an American multinational automotive and clean energy company headquartered in Austin, Texas, which designs, manufactures and sells electric vehicles, stationary battery energy storage devices from home to grid-scale, solar panels and solar shingles, and related products and services.

Tesla was incorporated in July 2003 by Martin Eberhard and Marc Tarpenning as Tesla Motors. The company's name is a tribute to inventor and electrical engineer Nikola Tesla. In February 2004 Elon Musk joined as the company's largest shareholder and in 2008 he was named CEO. In 2008, the company began production of its first car model, the Roadster sports car, followed by the Model S sedan in 2012, the Model X SUV in 2015, the Model 3 sedan in 2017, the Model Y crossover in 2020, the Tesla Semi truck in 2022 and the Cybertruck pickup truck in 2023. The Model 3 is the all-time bestselling plug-in electric car worldwide, and in June 2021 became the first electric car to sell 1 million units globally.[5] In 2023, the Model Y was the best-selling vehicle, of any kind, globally.[2]

Tesla is one of the world's most valuable companies. In October 2021, Tesla's market capitalization temporarily reached $1 trillion, the sixth company to do so in U.S. history. As of 2023, it is the world's most valuable automaker. In 2022, the company led the battery electric vehicle market, with 18% share.

Tesla has been the subject of lawsuits, government scrutiny, and journalistic criticism, stemming from allegations of whistleblower retaliation, worker rights violations, product defects, and Musk's many controversial statements.` }))
        .setCache(new MemoryCache())
        .setVectorDb(new LanceDb({ path: '.db' }))
        .build();

    // Run a query against the loaded text and print the result
    console.log(await ragApplication.query('What is Tesla?'));
};

initEmbedding();

Error :

node_modules/@langchain/openai/dist/embeddings.cjs:117
throw new Error("OpenAI or Azure OpenAI API key not found");

nishantshah977 commented 7 months ago

I think the problem might be due to the environment variables AZURE_OPENAI_API_KEY= and OPENAI_API_KEY=, which are left without values.

poeeain commented 7 months ago

Maybe. However, I am only using a HuggingFace model; is it necessary to add other API keys?

nishantshah977 commented 7 months ago

It's not necessary; just use the Hugging Face token.

adhityan commented 7 months ago

Actually, there are two things here: the embedding model and the LLM. OpenAI provides both embedding models (Ada, 3-small and 3-large) and LLMs (GPT-3.5, GPT-4, etc.).

By default, the library uses OpenAI's 3-small model for embeddings and OpenAI's GPT-3.5 for the LLM. Both use the same OpenAI keys (either OPENAI_API_KEY directly, or the Azure keys if you are running the models from Azure).
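To illustrate why the error above appears even when a Hugging Face token is set, here is a minimal sketch of that kind of key lookup. It is not the library's or LangChain's actual code: resolveOpenAiKey is a hypothetical helper, and the only details taken from this thread are the two variable names and the error message. A line such as OPENAI_API_KEY= in a .env file typically loads as an empty string, which JavaScript treats as falsy, so it counts as "not found".

```javascript
// Illustrative sketch only; NOT the actual embedJs or LangChain implementation.
// resolveOpenAiKey is a hypothetical helper name.
function resolveOpenAiKey(env = process.env) {
  // An entry like `OPENAI_API_KEY=` usually loads as "", which is falsy.
  const openAiKey = (env.OPENAI_API_KEY || "").trim();
  const azureKey = (env.AZURE_OPENAI_API_KEY || "").trim();

  if (!openAiKey && !azureKey) {
    throw new Error("OpenAI or Azure OpenAI API key not found");
  }

  // Prefer the direct OpenAI key; fall back to the Azure key.
  return openAiKey
    ? { provider: "openai", key: openAiKey }
    : { provider: "azure-openai", key: azureKey };
}

// Same shape as the .env in the original post: only the Hugging Face token has a value.
try {
  resolveOpenAiKey({
    HUGGINGFACEHUB_API_KEY: "hf_example",
    AZURE_OPENAI_API_KEY: "",
    OPENAI_API_KEY: "",
  });
} catch (err) {
  console.log(err.message);
}
```

The Hugging Face token is never consulted in this lookup, which is why setting it alone does not satisfy the default embedding model.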

Right now, the library supports HuggingFace inference models for the LLM only. It does not support open source HuggingFace embedding models. The only embedding models supported at this time are OpenAI's Ada, 3-small and 3-large, and Cohere's embedding model.

So, as it stands today, you will need to provide either an OpenAI (or Azure OpenAI) key or a Cohere embedding key (make sure to set CohereEmbedding as the embedding model in the loader). This applies regardless of the LLM you use, which can be a free HuggingFace model.
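As a rough pre-flight check before building the application, something like the following can report which role each configured key covers. This is only a sketch: checkRagKeys is a hypothetical helper that is not part of embedJs, and COHERE_API_KEY is an assumed variable name; the other names come from the .env in the original post.

```javascript
// Hypothetical pre-flight helper; illustrative only, not part of embedJs.
// COHERE_API_KEY is an assumed variable name; the rest appear in the .env above.
function checkRagKeys(env = process.env) {
  // Treat missing, empty, and whitespace-only values as "not set".
  const has = (name) => Boolean((env[name] || "").trim());

  // Keys that can back an embedding model, per the explanation above.
  const embeddingKeys = ["OPENAI_API_KEY", "AZURE_OPENAI_API_KEY", "COHERE_API_KEY"];
  // Keys that can back the LLM; a Hugging Face token covers only this role.
  const llmKeys = ["OPENAI_API_KEY", "AZURE_OPENAI_API_KEY", "HUGGINGFACEHUB_API_KEY"];

  return {
    embedding: embeddingKeys.filter(has),
    llm: llmKeys.filter(has),
  };
}

const keyStatus = checkRagKeys({
  HUGGINGFACEHUB_API_KEY: "hf_example",
  AZURE_OPENAI_API_KEY: "",
  OPENAI_API_KEY: "",
});

if (keyStatus.embedding.length === 0) {
  console.warn("No embedding key set: add an OpenAI, Azure OpenAI, or Cohere key.");
}
```

With the configuration from the original post, the LLM role is covered by the Hugging Face token while the embedding list comes back empty, which matches the failure reported.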

Hope this answers your question.

PS: Support for HuggingFace embedding models is planned, but further down the line. Right now, I am adding support for Azure vector databases. If you want to, you can send in a PR to add support for HuggingFace embedding models; I will review it and we can get the feature added sooner.

adhityan commented 7 months ago

I will go ahead and close this issue for now if there are no more open questions.