Closed rxng closed 5 days ago
If you are trying this for a shared collection, did you try the CLI options?
https://github.com/h2oai/h2ogpt/blob/main/docs/README_LangChain.md#multiple-embeddings-and-sources
i.e.
python generate.py --model_lock="[{'base_model': 'llama', 'model_path_llama': 'Phi-3-mini-4k-instruct-q4.gguf', 'tokenizer_base_model': 'microsoft/Phi-3-mini-4k-instruct'}]" --use_auth_token=$HUGGING_FACE_HUB_TOKEN --langchain_modes="['UserData', 'MyData', 'UserData2']"
That would show all users those 2 by default.
Even if a user logs in that already had a db entry, they will be forced to see those CLI ones.
If the system is online, there's currently no way to add a collection for all users at once without restarting, e.g. via some kind of global user-added settings. Is that what you are trying to achieve?
For personal collections, there are no CLI options; they live only in the db/json file. Newer h2oGPT uses a sqlite3 db by default to address speed issues with json, so one would have to edit the db using operations like those in src/db_utils.py.
I'll think about how to handle this better, probably adding an option to add things via the admin page is best. Would that work for you?
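Before editing the sqlite3 db by hand, it may help to look at what it contains. The following is a minimal sketch using only Python's standard sqlite3 module; it assumes nothing about h2oGPT's actual schema, and list_tables is a hypothetical helper, not a function from src/db_utils.py.

```python
import sqlite3


def list_tables(db_path):
    """Return {table_name: [column names]} for a SQLite file (read-only inspection)."""
    con = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in catalog of schema objects.
        names = [
            row[0]
            for row in con.execute(
                "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
            )
        ]
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        return {
            name: [col[1] for col in con.execute(f'PRAGMA table_info("{name}")')]
            for name in names
        }
    finally:
        con.close()
```

Working on a copy of the db file rather than the live one is probably the safer choice while exploring.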
Thanks for your quick response! Maybe my explanation was confusing. What I was trying to achieve is that when a user logs in, their own collection is automatically loaded for them.
However, I tried every single parameter and just found a way to do it via the auth.json file, by adding the line
"langchain_mode": "JonData",
above the selection_docs_state entry, like so
"langchain_mode": "JonData",
"selection_docs_state": {
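The same edit could be scripted rather than done by hand. This is a rough sketch, assuming auth.json is a JSON object keyed by username with one entry per user; set_default_mode is a hypothetical helper, not part of h2oGPT. It places the key just before selection_docs_state to mirror the manual edit, although JSON itself does not depend on key order.

```python
import json
from pathlib import Path


def set_default_mode(auth_path, username, mode):
    """Hypothetical helper: set "langchain_mode" for one user in an auth.json-style file."""
    path = Path(auth_path)
    auth = json.loads(path.read_text())
    entry = auth[username]  # raises KeyError if the user has no entry yet

    # Rebuild the entry so "langchain_mode" sits just before "selection_docs_state".
    updated = {}
    for key, value in entry.items():
        if key == "langchain_mode":
            continue  # drop any old value; it is re-added below
        if key == "selection_docs_state":
            updated["langchain_mode"] = mode
        updated[key] = value
    # If the entry had no "selection_docs_state", append the key at the end.
    updated.setdefault("langchain_mode", mode)

    auth[username] = updated
    path.write_text(json.dumps(auth, indent=2))
    return updated
```

For example, set_default_mode("auth.json", "jon", "JonData") would rewrite jon's entry in place; keeping a backup of the file first seems prudent.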
The only question I have is: if we then wanted to add more documents to the collection via make_db.py, would we have to restart the entire h2ogpt instance for it to use the updated collection automatically?
It would definitely be great if there was an admin page where these things could easily be managed :)
that's so amazing @pseudotensor !!
Note that if your auth file is .json, just tell the CLI that it is now .db, and we'll migrate it to the .db format that is required for this control.
According to the instructions, we can add a make_db.py database to auth.json, but they do not specify exactly how to do this.
Then you'll have:
You can add that database to the auth.json for their entry if using auth.json type file, and they will see when they login.
{
  "jon": {
    "password": "jon1306",
    "userid": "acb8fef1a77d122b5e12b261202ada7a",
    "selection_docs_state": {
      "langchain_modes": [
        "JonData",
        "LLM",
        "Disabled"
      ],
      "langchain_mode_types": {
        "JonData": "personal"
      }
    },
    "dbs": "users/jon/db_dir_JonData",
    "load_db_if_exists": "users/jon/db_dir_JonData"
  }
}