Closed: dan-homebrew closed this issue 10 months ago
Archiving @dan-jan's original comment because I need to put my product specs at the top for subtasks to work on GitHub:
Why not a /files approach?
Storage:
/threads
    /thread-1
        /files
            my.pdf
    /thread-2
Future iteration: we symlink so files are not duplicated.
+ Add a design that shows when models do not support RAG.
Right panel: in the Assistant section, under a category called "Tools", have a checked [x] checkbox for File Retrieval. In the future this section will have web search and other tools.
Similar to OpenAI's Create GPT flow
Eng spec
There are 3 scenarios:
tools/retrieval off => Chat normally.
tools/retrieval on and upload file as reference (parsable PDF atm) => Ingestion phase
tools/retrieval on and query => Query phase
Communication layers:
Model (nitro extension / openai extension) currently uses event-based processing: on / emit
on / emit as well
extensions/assistant-extension/src
├── @types
│   └── global.d.ts
├── index.ts
└── node
    ├── index.ts
    └── tools
        └── retrieval
            └── index.ts
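The three retrieval scenarios in the eng spec, together with the on / emit style of communication, could be sketched roughly as below. This is only an illustration: `MessageRequest`, `decidePhase`, `EventBus`, and the `message:sent` event name are hypothetical and are not taken from Jan's code base.

```typescript
// Hypothetical names throughout; this is not Jan's actual API.
type Phase = "chat" | "ingestion" | "query";

interface MessageRequest {
  retrievalEnabled: boolean; // state of tools/retrieval for the assistant
  attachedFile?: string; // path of a newly uploaded file, if any
  text?: string; // the user's prompt, if any
}

// Map a request onto one of the three scenarios from the spec.
function decidePhase(req: MessageRequest): Phase {
  if (!req.retrievalEnabled) return "chat"; // retrieval off => chat normally
  if (req.attachedFile !== undefined) return "ingestion"; // file uploaded
  return "query"; // retrieval on, no new file => query phase
}

type Handler = (req: MessageRequest) => void;

// Tiny on/emit bus standing in for the event-based processing.
class EventBus {
  private handlers: Record<string, Handler[]> = {};

  on(event: string, handler: Handler): void {
    const list = this.handlers[event] || [];
    list.push(handler);
    this.handlers[event] = list;
  }

  emit(event: string, req: MessageRequest): void {
    for (const handler of this.handlers[event] || []) handler(req);
  }
}

const bus = new EventBus();
const seenPhases: Phase[] = [];
bus.on("message:sent", (req) => {
  seenPhases.push(decidePhase(req));
});

bus.emit("message:sent", { retrievalEnabled: false, text: "hi" });
bus.emit("message:sent", { retrievalEnabled: true, attachedFile: "my.pdf" });
bus.emit("message:sent", { retrievalEnabled: true, text: "What does the PDF say?" });
```

After the three `emit` calls, `seenPhases` holds one phase per scenario, in the order chat, ingestion, query.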
jan/assistants/jan/assistant.json as follows:
{
  "avatar": "",
  "id": "jan",
  "object": "assistant",
  "created_at": 1705549969445,
  "name": "Jan",
  "description": "A default assistant that can use all downloaded models",
  "model": "*",
  "instructions": "",
  "tools": [
    {
      "type": "retrieval",
      "enabled": true,
      "settings": {}
    }
  ],
  "file_ids": []
}
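As a rough illustration of how code might read this file, the sketch below types the JSON above and checks whether a given tool is switched on. The `Assistant` and `AssistantTool` interfaces and the `isToolEnabled` helper are assumptions made for this example; only the field names come from the JSON itself.

```typescript
// Hypothetical types mirroring the fields of assistant.json shown above.
interface AssistantTool {
  type: string;
  enabled: boolean;
  settings: Record<string, unknown>;
}

interface Assistant {
  avatar: string;
  id: string;
  object: string;
  created_at: number;
  name: string;
  description: string;
  model: string;
  instructions: string;
  tools: AssistantTool[];
  file_ids: string[];
}

// Hypothetical helper: true when a tool of the given type exists and is enabled.
function isToolEnabled(assistant: Assistant, toolType: string): boolean {
  return assistant.tools.some((tool) => tool.type === toolType && tool.enabled);
}

const rawAssistant = `{
  "avatar": "",
  "id": "jan",
  "object": "assistant",
  "created_at": 1705549969445,
  "name": "Jan",
  "description": "A default assistant that can use all downloaded models",
  "model": "*",
  "instructions": "",
  "tools": [{ "type": "retrieval", "enabled": true, "settings": {} }],
  "file_ids": []
}`;

const jan = JSON.parse(rawAssistant) as Assistant;
const retrievalOn = isToolEnabled(jan, "retrieval");
```

With the default file above, `retrievalOn` is `true`; a tool type that is not listed (for example a future web search tool) would yield `false`.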
Tools of choice:
HNSW binding in node
What to do next even after this
Questions and Answers:
Where are we being opinionated, and WHY (i.e. our choice of hnsw, langchain, no llamaindex)?
=> Answer:
The choice between langchain and llama_index is, at the moment, an opinionated one that Hiro made because of:
vdb and pre-processing steps (e.g. text splitter): we do not want to reinvent the wheel.
langchain.js is more actively developed than llama_index TS at the time we are developing.
hnsw is an opinionated option too, as it's the most lightweight and highly compatible version that can be embedded in any OS/CPU of choice.
What abstractions need to happen in the future, to allow for a bring-your-own-vdb situation?
=> Answer: Yes, absolutely; that's why I chose to use langchain.js to abstract the interface.
Any "hacky" solutions employed to get things to work for now? => Answer:
Impact on user disk / Jan folder / resource hogging? => Answer:
Where are eng specs? https://github.com/janhq/jan/issues/1076#issuecomment-1899553830
Will it be available via the local API server?
=> Answer: Yes, but we have not thought this through yet. It will be designed similarly to OpenAI GPTs runs.
How are we chunking?
=> Answer: TextSplitter with text only. The parameters are Chunking and Overlap. We set them by default to fixed numbers but will let the user configure them in the settings (thread level).
How is the LLM map-reducing across similar vectors? Is that configurable by the user?
=> Answer: text -> embedding -> similaritySearch (top-k) -> rerank.
If the user uses a different embeddings layer (model A) for doc ingestion vs. user queries (model B), our current approach seems hyper-opinionated.
retrieval, and LLM for text-generation.
Changing model in mid-thread
The likely way is to split these models into 2 models, in which the embedding model does not always change. One way is adding https://github.com/FFengIll/embedding.cpp for serving sentence transformer, bge, or even writing it as node-gyp to use inside Jan alone.
@louis-jan's point on the framework layer:
.tar.gz is 100MB) => Should be reused.
===> retrieval extension
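To make the chunking parameters and the text -> embedding -> similaritySearch (top-k) steps from the Q&A concrete, here is a small self-contained sketch. It does not use langchain.js or HNSW: `splitText`, `toyEmbed`, `cosine`, and this `similaritySearch` are hypothetical stand-ins, the "embedding" is just a letter-count vector, the search is a brute-force scan, and the rerank step is left out.

```typescript
// Fixed-size chunking: consecutive chunks share `overlap` characters.
function splitText(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("need chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}

// Toy stand-in for an embedding model: counts of the letters a-z.
function toyEmbed(text: string): number[] {
  const vec: number[] = [];
  for (let i = 0; i < 26; i++) vec.push(0);
  const lower = text.toLowerCase();
  for (let i = 0; i < lower.length; i++) {
    const code = lower.charCodeAt(i) - 97;
    if (code >= 0 && code < 26) vec[code] += 1;
  }
  return vec;
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Brute-force top-k search over the chunks (an HNSW index would replace this scan).
function similaritySearch(query: string, chunks: string[], k: number): string[] {
  const queryVec = toyEmbed(query);
  return chunks
    .map((chunk) => ({ chunk, score: cosine(queryVec, toyEmbed(chunk)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((hit) => hit.chunk);
}

const exampleChunks = splitText("abcdefghij", 4, 2); // ["abcd", "cdef", "efgh", "ghij"]
const exampleHits = similaritySearch("zzz", ["aaa", "zzz", "abc"], 1); // ["zzz"]
```

Note that the same `toyEmbed` function is applied to both the chunks and the query. This mirrors the concern raised above: if ingestion and querying used different embedding models, the vectors would not be comparable.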
TODO:
@alan
--- Something that just works at the moment --- Improved version
Objectives
Leads
User Stories
In Scope
As a User, I want to upload text files to the chat:
As a User, I want to view the uploaded file's content:
As a User, I want to ask questions related to the uploaded file:
As a User, I want to receive responses based on file-specific queries:
As a User, I understand the limitation of multiple file uploads:
Out-of-Scope
Design Wireframes
Figma link: https://www.figma.com/file/ytn1nRZ17FUmJHTlhmZB9f/Jan-App?type=design&node-id=783-43738&mode=design&t=7KYGjHy7F1RvqEip-4
Engineering & Architecture
In Scope
.pdf, .docx
Out-of-Scope
Tasklist
Event.on and Event.emit)
retrieval settings in UI for user to change
Resources
https://www.chatpdf.com/c/vzHhtas3uQVZDK9ZGglaw
Out of scope