Open stevennt opened 2 weeks ago
Indexing files:
src/khoj/routers/indexer.py
Initial data: how can it be defined in code instead of entered manually via the admin frontend? http://localhost:42110/server/admin/
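One way to seed initial data in code rather than clicking through the admin UI is a Django-style fixture loaded with `loaddata`. A minimal sketch; the model label and field names here are assumptions for illustration, not taken from Khoj's schema - check `src/khoj/database/models/__init__.py` for the real names.

```python
import json

# Build a Django-style fixture in code instead of creating rows by hand
# in the admin UI. Model label and fields are ASSUMPTIONS for illustration.
def build_fixture(rows: list[dict], label: str) -> list[dict]:
    return [
        {"model": label, "pk": i + 1, "fields": fields}
        for i, fields in enumerate(rows)
    ]

fixture = build_fixture(
    [{"chat_model": "gpt-4-turbo-preview", "model_type": "openai"}],
    label="database.chatmodeloptions",
)
with open("initial_data.json", "w") as f:
    json.dump(fixture, f, indent=2)
# Load it with: python manage.py loaddata initial_data.json
```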
Accepted files: src/khoj/interface/web/chat.html
API to answer chat:
```javascript
// Generate backend API URL to execute query
let url = `/api/chat?q=${encodeURIComponent(query)}&n=${resultsCount}&client=web&stream=true&conversation_id=${conversationID}&region=${region}&city=${city}&country=${countryName}&timezone=${timezone}`;

// Call specified ABN API
let response = await fetch(url);
let rawResponse = "";
let references = null;
```
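The same call can be driven from Python for scripting. A sketch that mirrors the web client's URL shape; whether all of these query parameters are required server-side is an assumption - see src/khoj/routers/api_chat.py.

```python
from urllib.parse import urlencode

# Python counterpart of the web client's fetch call above.
# Parameter set mirrors the JS snippet; server-side requirements are assumed.
def build_chat_url(base: str, query: str, conversation_id: str, n: int = 5) -> str:
    params = {
        "q": query,
        "n": n,
        "client": "web",
        "stream": "true",
        "conversation_id": conversation_id,
    }
    return f"{base}/api/chat?{urlencode(params)}"

url = build_chat_url("http://localhost:42110", "what is in my notes?", "42")
# With a running server and the requests library, a streaming read would be:
#   with requests.get(url, headers={"Authorization": f"Bearer {token}"}, stream=True) as r:
#       for chunk in r.iter_content(chunk_size=None):
#           print(chunk.decode(), end="")
```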
Maybe loading the indication here:
Database: src/khoj/database/models/__init__.py
Init: maybe change here:
src/khoj/utils/initialization.py
how / when are the models downloaded?
Seems that they are downloaded from Hugging Face at runtime.
Oh they have reranking
Compute Embeddings, Load Pre-computed embeddings: src/khoj/search_type/text_search.py
src/khoj/processor/conversation/prompts.py many prompts
Configure OpenAI Chat
- Go to the OpenAI settings in the server admin settings to add an OpenAI processor conversation config. This is where you set your API key and server API base URL. The API base URL is optional - it's only relevant if you're using another OpenAI-compatible proxy server.
- Go over to configure your chat model options. Set the chat-model field to a supported chat model of your choice. For example, you can specify gpt-4-turbo-preview if you're using OpenAI.
- Make sure to set the model-type field to OpenAI.
- The tokenizer and max-prompt-size fields are optional. Set them only if you're sure of the tokenizer or token limit for the model you're using. Contact us if you're unsure what to do here.

Configure Offline Chat
- No need to set up a conversation processor config!
- Go over to configure your chat model options. Set the chat-model field to a supported chat model of your choice. For example, NousResearch/Hermes-2-Pro-Mistral-7B-GGUF is recommended, but any GGUF model on Hugging Face should work.
- Make sure to set the model-type to Offline. Do not set the openai config.
- The tokenizer and max-prompt-size fields are optional. Set them only when using a non-standard model (i.e. not a mistral, gpt or llama2 model) whose token limit you know.
Successfully configure Khoj with OpenAI:
src/khoj/database/models/__init__.py
src/khoj/migrations/migrate_processor_config_openai.py
The API base URL should be set without /chat, because Khoj appends that path automatically; adding it yourself produces a duplicated path.
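A small guard can make the duplication impossible; this is a hypothetical helper illustrating the idea, not Khoj code:

```python
def normalize_api_base(url: str) -> str:
    """Strip a trailing /chat (and trailing slash) so the server's own
    path-appending does not produce .../chat/chat."""
    url = url.rstrip("/")
    if url.endswith("/chat"):
        url = url[: -len("/chat")]
    return url
```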
BadRequestError from the server logs:

```
myaiabnkhoj-server-1 | BadRequestError: Error code: 400 - {'error': {'message': "'response_format' does not support streaming", 'type': 'invalid_request_error'}}
```
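The 400 above comes from a backend that rejects `response_format` combined with streaming. A sketch of the workaround idea (a defensive payload builder, not Khoj code): only attach `response_format` on non-streaming calls.

```python
from typing import Optional

# Some OpenAI-compatible servers return 400 when response_format is sent
# together with stream=true; drop the field for streaming requests.
def build_completion_payload(messages: list, model: str, stream: bool,
                             response_format: Optional[dict] = None) -> dict:
    payload = {"model": model, "messages": messages, "stream": stream}
    if response_format is not None and not stream:
        payload["response_format"] = response_format
    return payload
```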
PROMPTS: src/khoj/processor/conversation/prompts.py
src/khoj/configure.py
https://docs.khoj.dev/get-started/setup/
The tokenizer and max-prompt-size fields are optional. Set them only if you're sure of the tokenizer or token limit for the model you're using. Contact us if you're unsure what to do here.
src/khoj/processor/conversation/utils.py
```python
# Excerpt from src/khoj/processor/conversation/utils.py
def truncate_messages(
    messages: list[ChatMessage],
    max_prompt_size,
    model_name: str,
    loaded_model: Optional[Llama] = None,
    tokenizer_name=None,
) -> list[ChatMessage]:
    """Truncate messages to fit within max prompt size supported by model"""
    default_tokenizer = "hf-internal-testing/llama-tokenizer"

    try:
        if loaded_model:
            encoder = loaded_model.tokenizer()
        elif model_name.startswith("gpt-"):
            encoder = tiktoken.encoding_for_model(model_name)
        elif tokenizer_name:
            if tokenizer_name in state.pretrained_tokenizers:
                encoder = state.pretrained_tokenizers[tokenizer_name]
            else:
                encoder = AutoTokenizer.from_pretrained(tokenizer_name)
                state.pretrained_tokenizers[tokenizer_name] = encoder
        else:
            encoder = download_model(model_name).tokenizer()
    except Exception:
        if default_tokenizer in state.pretrained_tokenizers:
            encoder = state.pretrained_tokenizers[default_tokenizer]
        else:
            encoder = AutoTokenizer.from_pretrained(default_tokenizer)
            state.pretrained_tokenizers[default_tokenizer] = encoder
        logger.warning(
            f"Fallback to default chat model tokenizer: {tokenizer_name}.\n"
            f"Configure tokenizer for unsupported model: {model_name} in Khoj settings to improve context stuffing."
        )
```
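The function above only selects an encoder; the truncation idea itself can be sketched without any tokenizer dependency. Here whitespace "tokens" and (role, content) tuples stand in for real token counts and ChatMessage objects - a toy model of the behavior, not the real implementation.

```python
def truncate_to_budget(messages: list[tuple[str, str]], max_tokens: int) -> list[tuple[str, str]]:
    """Keep the most recent (role, content) messages whose combined
    whitespace-token count fits in max_tokens; older messages drop first."""
    kept, used = [], 0
    for role, content in reversed(messages):
        cost = len(content.split())
        if used + cost > max_tokens:
            break
        kept.append((role, content))
        used += cost
    return list(reversed(kept))

history = [("user", "hello there"), ("assistant", "hi"), ("user", "summarize my notes please")]
truncate_to_budget(history, 5)  # drops the oldest message first
```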
Let's try this:
google-bert/bert-base-uncased
https://huggingface.co/docs/transformers/en/main_classes/tokenizer
openai-community/gpt2
Oh, in the code: default_tokenizer = "hf-internal-testing/llama-tokenizer"
Upload Files: src/khoj/interface/web/chat.html
Everything (uploads, etc.) is implemented as API endpoints, so I can just drive the APIs directly.
src/khoj/routers/api_chat.py
Sync/index data: Simply edit this config file and let Khoj Desktop do the job.
```json
{
  "files": [
    {
      "path": "/home/thanhson/Downloads/RFP#2024-Amgen-01 Biding App Upgrade.pdf"
    }
  ],
  "folders": [],
  "khojToken": "kk-yHlnpZ4zKsw-ocgn9_WxUPRkgl4Fa3cECmNACl4XmVA",
  "hostURL": "https://app.khoj.dev",
  "lastSync": []
}
```
The backup seems to work. But where is it stored?
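A sketch of what the desktop client's sync presumably does under the hood: push the files listed in the config above to the server's indexing endpoint. The endpoint path and payload shape here are assumptions - confirm against src/khoj/routers/indexer.py.

```python
# Endpoint path and payload shape are ASSUMPTIONS for illustration.
def build_index_request(host_url: str, khoj_token: str, paths: list[str]):
    url = f"{host_url.rstrip('/')}/api/v1/index/update"
    headers = {"Authorization": f"Bearer {khoj_token}"}
    # With the requests library this would become:
    #   files = [("files", open(p, "rb")) for p in paths]
    #   requests.post(url, headers=headers, files=files)
    return url, headers, paths
```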
@indexer, @auth_router, @web_client, @subscription_router, @notion_router, @api_chat, @api_agents.
Maybe change this has_documents check to initialize with initial documents.
Embeddings: src/khoj/processor/embeddings.py
Text Search: src/khoj/search_type/text_search.py
I created my own embeddings and search at ABNScripts
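The search in text_search.py boils down to ranking documents by cosine similarity between a query embedding and precomputed document embeddings. A dependency-free sketch of that ranking step (toy 2-d vectors, not real sentence-transformer embeddings from embeddings.py):

```python
import math

# Rank documents by cosine similarity against a query vector.
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query_vec: list[float], corpus: dict[str, list[float]], top_k: int = 2) -> list[str]:
    ranked = sorted(corpus.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

corpus = {"rfp.pdf": [1.0, 0.0], "notes.md": [0.0, 1.0], "mixed.txt": [0.7, 0.7]}
search([1.0, 0.1], corpus, top_k=2)  # rfp.pdf ranks first
```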
Prompts, nice: /home/thanhson/Workspace/myai.abn.khoj/src/khoj/processor/conversation/prompts.py

```python
from langchain.prompts import PromptTemplate

personality = PromptTemplate.from_template(
    """
You are ABNCopilot, a smart, inquisitive and helpful personal assistant.
Use your general knowledge and past conversation with the user as context to inform your responses.
You were created by AbnAsia.org. with the following capabilities:

... (capability list; the surviving fragments mention wrapping inline math in \\( and \\) and display math in $$ or \\[ and \\])

Note: More information about you, the company or ABN apps for download can be found at https://abnasia.org. Today is {current_date} in UTC.
""".strip()
)
```
```python
custom_personality = PromptTemplate.from_template(
    """
You are {name}, an AI agent from ABN Asia. Use your general knowledge and past conversation with the user as context to inform your responses.
...
""".strip()
)
```
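langchain's `PromptTemplate.from_template(...)` is, for these prompts, essentially `str.format` over the `{placeholders}`. A dependency-free stand-in showing how the template variables get filled (template text abbreviated):

```python
from datetime import datetime, timezone

# Minimal stand-in for PromptTemplate: plain str.format over {placeholders}.
PERSONALITY = (
    "You are {name}, a smart, inquisitive and helpful personal assistant.\n"
    "Today is {current_date} in UTC."
)

def render(template: str, **kwargs) -> str:
    return template.format(**kwargs)

prompt = render(PERSONALITY, name="ABNCopilot",
                current_date=datetime.now(timezone.utc).date())
```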
https://docs.khoj.dev/get-started/setup
Where is the FastAPI app initialized / called?
What endpoints can I call from outside?
I want to use its capabilities as a backend for my other tasks, such as filling out RFPs.
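For the RFP use case, one sketch is a query-per-question loop against the chat endpoint. The endpoint shape is assumed from the web client's fetch call earlier; non-streaming is assumed here to simplify response handling.

```python
from urllib.parse import urlencode

# Hypothetical sketch: one chat query per RFP question. Endpoint and
# parameters are assumed from the web client snippet, not verified.
def rfp_queries(questions: list[str], base: str = "http://localhost:42110") -> list[str]:
    urls = []
    for q in questions:
        params = {"q": f"Answer this RFP question from my documents: {q}",
                  "client": "web",
                  "stream": "false"}
        urls.append(f"{base}/api/chat?{urlencode(params)}")
    return urls
    # With a running server and an API token:
    #   for url in rfp_queries(questions):
    #       answer = requests.get(url, headers={"Authorization": f"Bearer {token}"}).json()
```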