@RonanKMcGovern, not really tbh, it's tailored for llama2. We're (@AleHD, @TJ-Solergibert) working on an extension to llama3, but generally speaking it wouldn't work for an arbitrary Hugging Face model.
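As a rough pre-check before attempting a conversion, something like the sketch below can tell whether a checkpoint is in the Llama family at all. It assumes the standard `model_type` and `architectures` fields of a Hugging Face `config.json`; `looks_like_llama` is a hypothetical helper, not part of nanotron, and passing the check says nothing about whether the converter will actually accept the model (llama2 and llama3 both report `model_type` as `llama`).

```python
import json


def looks_like_llama(config: dict) -> bool:
    """Return True if a Hugging Face config dict describes a Llama-family causal LM.

    Hypothetical helper: it only inspects the config fields, not the weights.
    """
    model_type = config.get("model_type", "")
    architectures = config.get("architectures") or []
    return model_type == "llama" and "LlamaForCausalLM" in architectures


# Minimal stand-ins for the contents of a config.json file.
llama_cfg = json.loads('{"model_type": "llama", "architectures": ["LlamaForCausalLM"]}')
gpt2_cfg = json.loads('{"model_type": "gpt2", "architectures": ["GPT2LMHeadModel"]}')

print(looks_like_llama(llama_cfg))
print(looks_like_llama(gpt2_cfg))
```

In practice the dict would come from `json.load` on the checkpoint's own `config.json`.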
Noted, thanks. I guess this is a powerful library to quickly test the performance of different datasets.
I see that any(?) hf model can be converted to nanotron format with this script.
Is there documentation describing this format?
Can any model that can be loaded with AutoModelForCausalLM be converted to nanotron format for training?