Here is the current Jan Data Folder structure:
jan 📁
assistants 📁: Contains Jan-supported assistants (currently only jan). New copied assistants do not take effect. This folder is created if it does not exist when the assistant-extension is loaded.
jan 📁: The default assistant folder, containing assistant.json 📄 with the following fields:
id: jan, but this is not used anywhere in the codebase. Currently, jan is the only active assistant and is accessed via index 0 in the assistants array.
name: Jan.
model: .*, meaning all models are available.
instructions: The assistant instructions (see the "Save instructions for new threads" feature implementation).
tools: Only one tool, retrieval, has been created and supported so far.
type: retrieval, not referenced in the codebase.
useTimeWeightedRetriever: Whether the timeWeightedRetrieval feature is activated. This feature was implemented by an external contributor (naming convention issue).
settings: Settings of the retrieval tool (a sketch follows below).
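To make the shape concrete, here is a minimal sketch of assistants/jan/assistant.json based on the fields above. The enabled flag and the retrieval settings keys (top_k, chunk_size, chunk_overlap) are illustrative assumptions, not confirmed names:

```json
{
  "id": "jan",
  "name": "Jan",
  "model": ".*",
  "instructions": "",
  "tools": [
    {
      "type": "retrieval",
      "enabled": false,
      "useTimeWeightedRetriever": false,
      "settings": {
        "top_k": 2,
        "chunk_size": 1024,
        "chunk_overlap": 64
      }
    }
  ]
}
```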
extensions 📁: Created on app launch (App level) if not already present. All extensions (compressed tgz) will be extracted here if they are not installed.
[org or extension name] 📁: Each extension is extracted into a folder named after its package, including the organization scope if one is defined (e.g., @janhq/monitoring-extension). If no organization is defined, the folder is just the extension name (e.g., example-extension in package.json).
extensions.json 📄: An index of installed extensions, aggregated from each extension's package.json. This file helps avoid looping through all extension folders, which is costly. It is considered deprecated soon, as manual extension copying may cause the file to become outdated (changes to the extension core code are needed).
logs 📁: Created when the monitoring-extension is loaded.
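For illustration, a rough sketch of the extensions.json index mentioned above, assuming it mirrors each extension's package.json metadata; the exact keys (origin, _active, main) are assumptions based on typical extension manifests:

```json
{
  "@janhq/monitoring-extension": {
    "name": "@janhq/monitoring-extension",
    "version": "1.0.10",
    "main": "dist/index.js",
    "description": "Hardware monitoring extension",
    "origin": "/path/to/extensions/janhq-monitoring-extension-1.0.10.tgz",
    "_active": true
  }
}
```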
models 📁: Created when the models-extension is loaded.
model.json may be located in a nested folder such as models/anthropic/claude-3.5-sonnet. The app uses the model's ID to trace the original model folder path during loading. There is a known issue where the model's ID does not always match the folder path.
model.json 📄:
sources: Contains model files, used during downloading:
filename: The filename persisted after download, which should match llama_model_path if defined. Otherwise, the filename is used as llama_model_path for GGUF models (llama.cpp engine).
url: The download URL for the model file.
id: The model ID described above, used for inference requests and model folder lookups.
object: OpenAI-compatible model object field.
name: Display name for the model (in model hub, model selection dropdown, and "my model" section).
version: Used during migrations. Newer versions of model.json from extensions will overwrite older user data folder versions.
description: Displayed in the model hub.
format: GGUF or API, indicating whether it is a GGUF model or a remote model (API is mandatory and referenced in the codebase, especially in the model hub to filter remote models).
engine: Determines the engine/provider used for model load/unload or inference.
settings: Model load parameters:
llama_model_path: GGUF model filename.
prompt_template: Template used for parsing prompts.
ctx_len: Context length setting.
ngl: Number of GPU layers. This value is declared in the GGUF model file and varies by model.
embedding: Indicates if the model can be used as a text embedding model.
cpu_threads: Number of CPU threads for loading the model.
mmproj: MMPROJ file name for CLIP (multimodal model, vision).
cont_batching: Enables continuous batching.
vision_model: Indicates if the model is a vision model (supports multimodal vision).
parameters: Inference parameters:
temperature: Controls the model's risk-taking (usually between 0 and 1, but can be higher).
token_limit: The maximum number of tokens the model can generate in a response.
stream: Determines whether the response uses event streams.
stop: Specifies stop words for halting output.
frequency_penalty: Applies a penalty based on token repetition.
presence_penalty: Similar to frequency_penalty, but the penalty is applied equally to any token that has already appeared, regardless of how many times it was repeated.
top_p: Nucleus sampling; restricts sampling to the smallest set of tokens whose cumulative probability exceeds top_p, controlling the model's determinism.
top_k: Limits the model's output to the top-k most probable tokens.
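Putting the fields above together, a hedged example of a GGUF model.json. All values are illustrative (not a real catalog entry), and the engine value nitro is an assumption for the llama.cpp engine name:

```json
{
  "sources": [
    {
      "filename": "example-7b-q4.gguf",
      "url": "https://huggingface.co/org/repo/resolve/main/example-7b-q4.gguf"
    }
  ],
  "id": "example-7b",
  "object": "model",
  "name": "Example 7B Q4",
  "version": "1.0",
  "description": "An illustrative local model entry.",
  "format": "gguf",
  "engine": "nitro",
  "settings": {
    "llama_model_path": "example-7b-q4.gguf",
    "prompt_template": "{system_message}\n### Instruction: {prompt}\n### Response:",
    "ctx_len": 4096,
    "ngl": 32,
    "embedding": false,
    "cpu_threads": 4,
    "cont_batching": false,
    "vision_model": false
  },
  "parameters": {
    "temperature": 0.7,
    "token_limit": 4096,
    "stream": true,
    "stop": [],
    "frequency_penalty": 0,
    "presence_penalty": 0,
    "top_p": 0.95,
    "top_k": 40
  }
}
```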
settings 📁: Created when the monitoring-extension is loaded.
[extension-id] 📁: Stores extension settings with a similar structure to extensions.
settings.json 📄: Stores the extension's setting values (a sketch follows below).
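A sketch of an extension's settings.json, assuming the array-of-controls shape typical of Jan extension settings; the key names (key, title, controllerType, controllerProps) are assumptions for illustration:

```json
[
  {
    "key": "log-enabled",
    "title": "Enable logging",
    "description": "Whether the extension writes logs to the logs folder.",
    "controllerType": "checkbox",
    "controllerProps": { "value": true }
  }
]
```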
settings.json 📄: Stores app settings related to GPU acceleration. Generated automatically by the monitoring-extension and cannot be edited, as it is overwritten by hardware detection logic, which includes nvidia-smi queries to check for drivers and .dll/.so lookups to determine the CUDA version.
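A hedged sketch of this auto-generated GPU settings.json; the field names are assumptions inferred from the hardware-detection behavior described above (nvidia-smi driver query, CUDA library lookup):

```json
{
  "run_mode": "gpu",
  "nvidia_driver": { "exist": true, "version": "535.183.01" },
  "cuda": { "exist": true, "version": "12" },
  "gpus": [
    { "id": "0", "name": "NVIDIA GeForce RTX 3060", "vram": "12288" }
  ],
  "gpus_in_use": ["0"]
}
```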
themes 📁: Created on app launch (App level).
threads 📁: Created when the conversational-extension is loaded.
[thread-folder] 📁: One folder per thread (e.g., jan_1725437954 in the example below); IDs are generated via ulid().
thread.json 📄: Stores thread metadata, including assistant and model settings cloned from model.json. Also contains the ID and engine of the selected model for quick querying by extensions, as extensions do not have access to all available models (due to isolation). Includes lastMessage, which provides GUI information but does not use OpenAI-compatible fields. A sketch follows below.
messages.jsonl 📄: Messages are stored one JSON object per line, e.g.:

```json
{"id":"01J6Y6FH8PFTHQB5PNJTHEN27C","thread_id":"jan_1725437954","type":"Thread","role":"assistant","content":[{"type":"text","text":{"value":"Hello! Is there something I can help you with or would you like to chat?","annotations":[]}}],"status":"ready","created":1725442802966,"updated":1725442802966,"object":"thread.message"}
```
There are computed fields (e.g., ID, object, format, extensions.json) that could be stripped out.
A "save instructions" setting is provided.
There should be a different extensions.json, but if we scan the extensions folder instead of reading the metadata file, we can consider the package.json engine field, where it could work with Jan or Cortex.
WIP
What's the end goal for Jan application state?
User data, e.g. threads, assistants.
Cortex-level data, e.g. models, engine configs?
Do we merge ~/.cortexcpp into ~/.jan?
Will Jan have separate folders for stable vs nightly, similar to cortex/?
hi @0xSage, we should follow the threads, assistants, and messages file structure in Cortex Platform to avoid the migration process.
Overview
We need to document what the current Jan "data structures" are:
We also need to define what our future state looks like:
Decisions to make
(inference-cortex-extension)
Linked Issues
cortex.cpp's data structures (#1654)