ncsQuan opened 9 months ago
Model Choice: Dolphin Mixtral
Model Reasoning: Open source (no cost), applies specific filtering to remove biases, and is easy to install and then train (see the sketch after this list). According to the creator of Dolphin Mixtral, "it aims to give you the best chat you've ever had, make it personal, and make it feel like it really cares."
Model Drawbacks: You need a lot of RAM; the installation is about 26 GB.
Model Results:
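As a point of reference for the install-and-chat workflow mentioned above, here is a minimal sketch, assuming the model has been pulled locally through Ollama (the ~26 GB download) and Ollama's default HTTP API is running; the prompt and response handling are illustrative only, not part of my results:

```python
# Minimal sketch: chat with a locally pulled Dolphin Mixtral via Ollama's HTTP API.
# Assumes `ollama pull dolphin-mixtral` has completed and the Ollama server is
# listening on its default port (11434). Prompt content is illustrative.
import requests

response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "dolphin-mixtral",
        "messages": [{"role": "user", "content": "I've had a rough week. Can we talk?"}],
        "stream": False,  # ask for a single JSON reply instead of a token stream
    },
    timeout=300,
)
response.raise_for_status()
print(response.json()["message"]["content"])  # assistant's reply text
```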
Next Steps: To actually train on the data, I was recommended AutoTrain. However, there may be GPU bottlenecks. We can rent hardware on HuggingFace, but I'm not completely sure what we can do in this situation. If someone has a computer with enough GPU to train an LLM (large language model), then we can attempt that. Otherwise, I do have access to a lab computer with a decent GPU, which may be a viable option. However, it will take days to train an LLM.
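To make the training path more concrete, below is a minimal sketch of a 4-bit LoRA fine-tune using the transformers, peft, and datasets libraries as a stand-in for what AutoTrain would set up; the Hugging Face model id, the `conversations.jsonl` dataset file, and all hyperparameters are assumptions for illustration, not settings we have validated.

```python
# Hedged sketch: 4-bit LoRA fine-tune of Dolphin Mixtral on a single GPU.
# The model id, dataset path, and hyperparameters are placeholder assumptions.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

MODEL_ID = "cognitivecomputations/dolphin-2.5-mixtral-8x7b"  # assumed HF repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
tokenizer.pad_token = tokenizer.eos_token  # the base tokenizer ships without a pad token

# Load the base model in 4-bit so it has a chance of fitting on a single lab GPU.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters keep the number of trainable parameters small.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

# Placeholder dataset: a JSONL file with one formatted conversation per "text" field.
data = load_dataset("json", data_files="conversations.jsonl", split="train")
data = data.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
    batched=True, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="dolphin-chat-lora",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-4,
        logging_steps=10,
    ),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Even with 4-bit quantization and LoRA, a run over any sizable conversation dataset would tie up the lab machine for a long time, which matches the "days to train" concern above.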
Recommendations: I believe we should get very specific about what we want our conversation model to look like. A therapist? A friend? A knowledge source? By narrowing down an approach, we can also narrow down potential training data.
This spike will be to investigate the need, or lack of need, for a conversation model. We will focus on existing products only. Please comment with any resources used and your final conclusion.
You may have to work alongside the assignee who is focusing on the NLP spike.