Closed by taeyeonlee 6 days ago
Hi @taeyeonlee, thanks for your question. It has to do with the address space limitation of the Hexagon. We hope to abstract this away from the user in the future, but for now we have to manage these splits manually.
Hello, what limitation of the Hexagon V75 requires the quantized Llama v2 7B model to be split into 8 bin files?
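For anyone wanting a feel for the arithmetic behind splitting a model to fit an address-space budget, here is a rough sketch. It greedily packs consecutive blocks into parts so that no part exceeds a per-part budget. The function name `split_into_parts`, the block sizes, and the budget are all made up for illustration: they are not documented Hexagon V75 figures, and this is not necessarily how the actual bin files are produced. The numbers are chosen only so that the result comes out to eight parts, as in the question.

```python
from typing import List


def split_into_parts(sizes: List[int], budget: int) -> List[List[int]]:
    """Greedily group consecutive block sizes so each part's total stays within budget.

    Hypothetical helper for illustration only; not part of any Qualcomm tool.
    """
    parts: List[List[int]] = []
    current: List[int] = []
    used = 0
    for size in sizes:
        if size > budget:
            # A single block larger than the budget can never fit in any part.
            raise ValueError(f"block of size {size} exceeds budget {budget}")
        if current and used + size > budget:
            # Adding this block would overflow the current part, so start a new one.
            parts.append(current)
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        parts.append(current)
    return parts


# Assumed values in arbitrary units: 32 equal-size blocks and a placeholder budget.
block_sizes = [100] * 32
budget = 400  # placeholder, NOT a real Hexagon limit

parts = split_into_parts(block_sizes, budget)
print(len(parts))  # 8
```

With a real model the block sizes would differ and the budget would have to come from the vendor's documentation, so the part count would change accordingly.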