Open rlasseri opened 1 year ago
Thanks for the feedback! How many epochs did you run? Weirdly enough, I've run into a similar issue, but with inference for novel17 (which made some sense, since its token distribution is very divergent from the model's training corpus).
It collapsed after less than one epoch (a few hundred steps). For me, novel17 with the exact same parameter set is running well.
Hello! Thanks for this nice work! I've pretrained several other LLMs on analogous French datasets, so for Falcon I was glad to discover your guidelines with falcontune. Unfortunately, running this on an L40 with the vigogne sample from the notebook, I'm indeed getting this probability distribution error. However, I think it comes from the loss collapsing to 0 very quickly. For novel17 everything runs smoothly, and there the loss does not go to 0. Any thoughts?
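Since the error seems to follow the loss collapsing to 0, one cheap mitigation is to watch the recent loss history and stop the run early once it flatlines near zero. The sketch below is a minimal, framework-agnostic illustration; the `threshold` and `patience` values are assumptions, not anything taken from falcontune itself.

```python
# Minimal sketch: detect a training-loss collapse by checking whether
# the loss has stayed near zero for several consecutive steps.
# threshold/patience are illustrative assumptions, tune for your run.

def loss_collapsed(losses, threshold=1e-3, patience=20):
    """Return True if the last `patience` losses are all below `threshold`."""
    if len(losses) < patience:
        return False
    return all(l < threshold for l in losses[-patience:])

# Example: a loss curve that drops to ~0 after a few steps, as described above.
history = [2.1, 1.8, 1.5, 0.9] + [1e-5] * 25
print(loss_collapsed(history))  # True: the run has collapsed
```

Inside a training loop you would call `loss_collapsed` every step and break (or checkpoint and lower the learning rate) when it returns `True`, rather than letting the collapsed model reach the sampling stage where the invalid probability distribution surfaces.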