morpheuslord / HackBot

AI-powered cybersecurity chatbot designed to provide helpful and accurate answers to your cybersecurity-related queries, and also to perform code analysis and scan analysis.

Model location, dataset, training. #30

Closed hastalamuerte closed 1 year ago

hastalamuerte commented 1 year ago

1. Hello, llama.bin is supposed to be stored in the same folder as hackbot.py, but I can't find it anywhere. Where could it be? (It works, though.) Thanks a lot for your work.

  1. Can it work with other datasets and models? Even with some prompts there is still a lot of filtering/censorship. Can you maybe recommend an unrestricted model of the same level, without RLHF restrictions? For example: https://huggingface.co/TheBloke/Luna-AI-Llama2-Uncensored-GGML

  2. Can a user create their own dataset to train on? For example, given a folder with a lot of scripts, files, etc., can the AI analyze the code together with the README files and keep it in memory or a database? Can a local Llama have a database and memory of the dialog context? There are also many vulnerability, CVE, and exploit databases. How can your trained model and datasets on Hugging Face be used? Or could one provide a URL list / API endpoints with JSON content, or a stream, to feed the model's dataset/memory?

  3. There are Llama code models; are they better at writing code than chat models? What about running HackBot in two modes with different models, e.g. unrestricted code models?

  4. A UI with code snippets / web UI (https://github.com/oobabooga/text-generation-webui, https://github.com/liltom-eth/llama2-webui) plus dialog and context memory could be helpful.

I am new to AI, and this is my first LLM running locally (it really is local, right?).
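Points 2 and 3 above are essentially asking for local retrieval: index a folder of scripts and docs, then feed the most relevant chunks to the model as context. HackBot does not ship this; the sketch below (all function names are my own) shows the idea with a naive keyword-overlap score, where a real setup would use embeddings and a vector store instead.

```python
def score(query, text):
    """Naive relevance score: the number of lowercase words shared
    between the query and the document text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words)

def top_chunks(query, docs, k=2):
    """Return the names of the k documents most relevant to the query.

    `docs` maps a filename to its text. A real implementation would
    chunk large files and rank with embeddings rather than word overlap.
    """
    ranked = sorted(docs.items(), key=lambda item: score(query, item[1]), reverse=True)
    return [name for name, _text in ranked[:k]]
```

The selected chunks would then be prepended to the prompt so the local model can answer with knowledge of your own files.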

hastalamuerte commented 1 year ago

Found it here: `C:\Users\User\.cache\huggingface\hub\models--localmodels--Llama-2-7B-Chat-ggml\snapshots\07c579e9353aa77cf730a1bc5196c796e41c446c\llama-2-7b-chat.ggmlv3.q4_0.bin`
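So the model is not stored next to hackbot.py; the Hugging Face downloader puts it in its own cache directory. A minimal sketch (the helper name is my own, and the cache layout is the standard `~/.cache/huggingface/hub` one) for locating `.bin` model files under such a cache:

```python
import os

def find_ggml_models(cache_dir):
    """Walk a Hugging Face cache directory and return paths to .bin
    (GGML) model files, sorted for stable output.

    `cache_dir` is typically ~/.cache/huggingface/hub on Linux/macOS or
    %USERPROFILE%\\.cache\\huggingface\\hub on Windows. This is a
    hypothetical helper, not part of HackBot itself.
    """
    matches = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            if name.endswith(".bin"):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

For example, `find_ggml_models(os.path.expanduser("~/.cache/huggingface/hub"))` would list every downloaded `.bin` model on a typical setup.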

morpheuslord commented 1 year ago

> 1. Hello, llama.bin is supposed to be stored in the same folder as hackbot.py, but I can't find it anywhere. Where could it be? (It works, though.) Thanks a lot for your work.
>
>   1. Can it work with other datasets and models? Even with some prompts there is still a lot of filtering/censorship. Can you maybe recommend an unrestricted model of the same level, without RLHF restrictions? For example: https://huggingface.co/TheBloke/Luna-AI-Llama2-Uncensored-GGML
>   2. Can a user create their own dataset to train on? For example, given a folder with a lot of scripts, files, etc., can the AI analyze the code together with the README files and keep it in memory or a database? Can a local Llama have a database and memory of the dialog context? There are also many vulnerability, CVE, and exploit databases. How can your trained model and datasets on Hugging Face be used? Or could one provide a URL list / API endpoints with JSON content, or a stream, to feed the model's dataset/memory?
>   3. There are Llama code models; are they better at writing code than chat models? What about running HackBot in two modes with different models, e.g. unrestricted code models?
>   4. A UI with code snippets / web UI (https://github.com/oobabooga/text-generation-webui, https://github.com/liltom-eth/llama2-webui) plus dialog and context memory could be helpful.
>
> I am new to AI, and this is my first LLM running locally (it really is local, right?).

I am working on the training and custom-usability part, as well as an option to use alternative Llama models. In the next update I will be adding a custom-model and alternative-model feature, e.g. setting the model via the .env file.
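An .env-driven model switch could look roughly like the sketch below. The variable names `MODEL_PATH` and the default file are my assumptions for illustration, not HackBot's actual configuration keys:

```python
import os

def load_env(path):
    """Parse a simple KEY=VALUE .env file into a dict.

    Ignores blank lines and # comments; no quoting rules are handled.
    A real project would typically use python-dotenv instead.
    """
    env = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                env[key.strip()] = value.strip()
    return env

def pick_model(env):
    """Choose the model file from .env values, with a default fallback.

    MODEL_PATH is a hypothetical key; the default mirrors the file name
    mentioned earlier in this thread.
    """
    return env.get("MODEL_PATH", "llama-2-7b-chat.ggmlv3.q4_0.bin")
```

With this, switching to an alternative model is just a one-line change in the .env file rather than an edit to the script.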

Secondly, the AI currently has no support for analyzing a massive script or large amounts of data spread across many scripts and files. That's why I am looking into static-code-analysis AI to do this in an easier way, and that takes a fair amount of research.
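Until proper static-analysis support lands, one workaround for large codebases is to split each file into pieces that fit the model's context window and analyze them one at a time. A rough sketch (using a character budget as a stand-in for real token counting):

```python
def chunk_source(text, max_chars=2000):
    """Split source code into chunks of at most max_chars each,
    breaking only on line boundaries so no line is cut in half.

    A single line longer than max_chars becomes its own oversized
    chunk; real tooling would count tokens, not characters.
    """
    chunks, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        if size + len(line) > max_chars and current:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks
```

Each chunk would then be sent through the normal code-analysis prompt, and the per-chunk findings merged afterwards.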

The feature of alternating between models is a great idea, but it is not very practical unless you have a lot of GPU power; with a hosting setup like the serverless version, the task becomes easy and reliable.

For the WebUI: Llama has two types of conversations, one for long, more interactive conversations and one for short, immediate ones. I could implement this, but there is always a performance issue. You might have noticed the difference in accuracy and speed between the local and RunPod versions of the AI; I think that is the key factor.
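The two conversation styles mentioned above can be thought of as two prompt templates: a one-shot template for quick answers and a chat template that replays history so the model keeps context. A sketch of that idea (the templates are illustrative placeholders, not the ones HackBot or llama.cpp actually use):

```python
def build_prompt(question, history=None, mode="short"):
    """Build a prompt in one of two illustrative modes.

    "short": only the question, for quick one-shot answers.
    "chat": prepends prior (user, assistant) turns so the model
    retains conversational context, at the cost of a longer prompt.
    """
    if mode == "short":
        return f"### Question:\n{question}\n### Answer:\n"
    lines = []
    for user_msg, bot_msg in (history or []):
        lines.append(f"USER: {user_msg}")
        lines.append(f"ASSISTANT: {bot_msg}")
    lines.append(f"USER: {question}")
    lines.append("ASSISTANT:")
    return "\n".join(lines)
```

The performance trade-off noted above shows up here directly: the chat template grows with every turn, so the model has more tokens to process per reply than in the short mode.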