Closed madroidmaq closed 1 month ago
@awni @angeloskath
Based on my experience with the mlx-lm library and its implementation, I've made some adjustments to the parts related to the flux dataset. I'd like to hear your thoughts.
PS: A lot of changes are merely adjustments to the code's location (such as moving to a new file, etc.), without altering the specific implementation details.
Thanks a lot for the improvements! I like most of them from a quick look on my phone.
We may need to think about how to add prior preservation afterwards, because I was thinking of adding it to the dataset, but that was possibly a bad idea.
I will look more closely when I am back on a computer. 🙏
@madroidmaq I made a few changes and added back support for the index.json approach, with a warning to let people know that they should switch to jsonl. Let me know what you think.
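A minimal sketch of the fallback described above, not the code from this PR. It assumes each line of train.jsonl is one JSON object and that the legacy index.json keeps its records under a "data" key. The function name load_records is hypothetical.

```python
import json
import warnings
from pathlib import Path


def load_records(dataset_dir):
    """Return a list of record dicts from train.jsonl, or from a legacy index.json."""
    root = Path(dataset_dir)

    # Preferred format: one JSON object per line.
    jsonl_path = root / "train.jsonl"
    if jsonl_path.exists():
        with open(jsonl_path) as f:
            return [json.loads(line) for line in f if line.strip()]

    # Legacy format: still accepted, but the user is told to migrate.
    index_path = root / "index.json"
    if index_path.exists():
        warnings.warn(
            "index.json is deprecated; please switch to train.jsonl",
            DeprecationWarning,
            stacklevel=2,
        )
        with open(index_path) as f:
            index = json.load(f)
        # Assumed legacy layout: {"data": [{...}, {...}]}
        return index["data"]

    raise FileNotFoundError(f"No train.jsonl or index.json found in {root}")
```

Checking for train.jsonl first means a dataset that ships both files follows the new convention silently, and only datasets that have not migrated see the warning.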
I am still torn about jsonl but I will probably just merge this as is and add another dataset for prior preservation. Something like
# Also a dataset but now we are looking at prior.jsonl instead of train.jsonl
python dreambooth.py .... \
--prior-preservation mlx-community/dog6 --prior-weight 0.5
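For context, a rough sketch of how a prior weight is typically applied in DreamBooth-style training: the prior-preservation loss is scaled and added to the loss on the subject images. The flags above are only a proposal, and the function and argument names below are hypothetical, not part of dreambooth.py.

```python
def combined_loss(instance_loss, prior_loss, prior_weight=0.5):
    """Weighted sum of the subject loss and the prior-preservation loss.

    instance_loss: loss on the subject images (e.g. from train.jsonl)
    prior_loss:    loss on the class-prior images (e.g. from prior.jsonl)
    prior_weight:  assumed to play the role of the proposed --prior-weight value
    """
    return instance_loss + prior_weight * prior_loss
```

With a weight of 0, the prior images have no effect. Larger values pull the model more strongly toward preserving its original notion of the class.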
@angeloskath Adding a reminder to migrate the dataset format is reasonable. It's a good change, thank you.
train.jsonl file as the dataset, keeping consistent with the conventions in the mlx_lm library; fine-tuning with mlx-community/dreambooth-dog6