Closed by ajinkya123-robo 4 months ago
@ajinkya123-robo Doing well, thank you! This one nails junior-v2 as usual, but I'm also happy to report some progress on senior: with the Vicuna-1p3-v2b template, this model achieved 56%. Results did seem to be heavily prompt-dependent here, falling back to the usual ~25% with the other formats.
Hello @the-crypt-keeper , for training I used the repo made available by DeepSeek-AI. Not sure if it has flaws, but the model seems to perform at the same level as the parent model. Very happy with the senior result. Thank you very much for your time and effort.
Hello @the-crypt-keeper , trust you are doing well. Could you evaluate https://huggingface.co/ajibawa-2023/Code-290k-6.7B-Instruct as time permits? As discussed, I have used DeepSeek-Coder-6.7B-Instruct as the base. It is trained on around 290,000 code samples. Along with Python, Java, JavaScript, Go, C++, Rust, Ruby, SQL, MySQL, R, Julia, Haskell, etc., code with detailed explanations was used for training. This model uses the Alpaca prompt format. Thanks
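For anyone evaluating the model, a minimal sketch of the standard single-turn Alpaca prompt layout is below — the exact system preamble is an assumption taken from the common Alpaca template and may differ slightly from what this model was trained with:

```python
def build_alpaca_prompt(instruction: str) -> str:
    # Standard Alpaca single-turn template (no optional "### Input:" section).
    # The system preamble here is the widely used Alpaca default; the exact
    # wording used during this model's training is an assumption.
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

prompt = build_alpaca_prompt("Write a Python function that reverses a string.")
print(prompt)
```

The generated text is then expected after the `### Response:` marker, so evaluation harnesses typically split on that marker when extracting the model's answer.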