bmaltais opened this issue 3 months ago
The branch now contains the MVP, but for some reason the flux1 trainer crashes with "Optimizer argument list is empty."
But I am not sure why I keep getting this error when trying to train:
FLUX: Gradient checkpointing enabled.
prepare optimizer, data loader etc.
INFO use 8-bit AdamW optimizer | {} train_util.py:4342
override steps. steps for 4 epochs is / 指定エポックまでのステップ数: 320
enable fp8 training.
Traceback (most recent call last):
File "D:\kohya_ss\sd-scripts\flux_train_network.py", line 395, in <module>
trainer.train(args)
File "D:\kohya_ss\sd-scripts\train_network.py", line 543, in train
if hasattr(t_enc.text_model, "embeddings"):
File "D:\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 1695, in __getattr__
raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'T5EncoderModel' object has no attribute 'text_model'
Maybe I really need to upgrade to PyTorch 2.4.0... not liking that, as it might bork my non-Flux.1 GUI... not feeling like upgrading...
Hi there, is it possible to update PyTorch to 2.4.0 for only the Flux version of the Kohya_ss GUI?
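Since each kohya_ss checkout carries its own venv, upgrading PyTorch inside the Flux copy should leave any other install untouched. A minimal sketch for Windows, assuming CUDA 12.1 wheels (adjust the index URL to your CUDA version; a matching torchvision may also be needed):

```
:: run inside the Flux copy of kohya_ss only
venv\Scripts\activate
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121
```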
SimpleTuner updated to v0.9.8: quantised flux training in 40 GB... 24 GB... 16 GB... 13.9 GB. Waiting so much for kohya.
probably very soon
Hi there! Any news about the integration of Flux into the GUI?
I am running into similar but different errors. I am waiting for the sd-scripts code to stabilize before working on it further. I have a lot of the elements already in the GUI. The missing ones can be added as extra parameters in the Advanced Accordion.
Nice! Should be released soon then, cannot wait to try it! Thanks for your work.
I updated to the latest sd-scripts commit for flux... still can't run training on my end, unfortunately:
FLUX: Gradient checkpointing enabled.
prepare optimizer, data loader etc.
INFO use 8-bit AdamW optimizer | {} train_util.py:4346
override steps. steps for 4 epochs is / 指定エポックまでのステップ数: 320
enable fp8 training.
Traceback (most recent call last):
File "D:\kohya_ss\sd-scripts\flux_train_network.py", line 397, in <module>
trainer.train(args)
File "D:\kohya_ss\sd-scripts\train_network.py", line 543, in train
if hasattr(t_enc.text_model, "embeddings"):
File "D:\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 1695, in __getattr__
raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'T5EncoderModel' object has no attribute 'text_model'
Here is a copy of my flux1_test.json config if you are interested in poking at it.
I pushed an update with support for the missing GUI parameters for Flux.1.
Here is the latest config for testing based on Kohya's readme config:
thanks a lot
The GUI is a real mess with so many options. I get lost myself when trying to find the option I need to set. I wish there was an easy solution… but I can't think of one.
Bro, Flux does not support training the text encoder yet. Set your text encoder LR to 0; that should get you past the error.
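For anyone hunting for the right field: in the saved GUI json this corresponds to the learning-rate entries below. A minimal sketch; the field names are my reading of the GUI config format and the values are illustrative, so double-check against your own file:

```json
{
  "learning_rate": 0.0001,
  "unet_lr": 0.0001,
  "text_encoder_lr": 0
}
```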
Here are some images from a LoRA I trained.
I'm surprised that wasn't more widely known, actually. Stability AI, when they released SD3, mentioned that training the T5 model was not only unnecessary but not recommended.
The same is likely true for Flux. It simply relies on the tokenization from CLIP-L and the transformer model working in conjunction with the T5 model's established natural-language processing.
And CLIP-L is almost entirely tag-based and seems highly unstable when trained anyway.
In other words, as long as you create the embedding within the model itself, the T5's existing capabilities should be enough to hit the ground running and incorporate that embedding into natural-language prompting off the bat.
How exactly this translates to the end result is something I have yet to see myself, though.
The ai-toolkit script works really well. I've trained 3 LoRAs so far (in 3 hours); it's not perfect but very good. Waiting for Kohya.
Regarding the LoRA images above: was it trained using kohya_ss? Great results.
Is anyone getting past the AttributeError: 'T5EncoderModel' object has no attribute 'text_model'? I'm using the second version of flux1_test, which has the LR set to 0, and that doesn't do it.
Someone implemented a potential fix for it, but Kohya hasn't added it yet: https://github.com/kohya-ss/sd-scripts/issues/1453
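For what it's worth, the failure mode is plain Python semantics: hasattr(t_enc.text_model, "embeddings") has to evaluate t_enc.text_model first, and T5EncoderModel has no such attribute, so the AttributeError escapes before hasattr can answer. A guard of the shape below would avoid it; this is only an illustration, not the actual patch from the linked issue:

```python
class T5EncoderModelStub:  # stands in for transformers.T5EncoderModel
    pass

t_enc = T5EncoderModelStub()

# hasattr(t_enc.text_model, "embeddings") would raise AttributeError here,
# because t_enc.text_model is evaluated before hasattr can return False.
# Checking the outer attribute first avoids that:
if hasattr(t_enc, "text_model") and hasattr(t_enc.text_model, "embeddings"):
    print("CLIP-style encoder: has text_model.embeddings")
else:
    print("T5-style encoder: skip the text_model path")
```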
Regarding "set your text encoder LR to 0": do you have an example json?
Kohya has added it now. I pulled these files: library/flux_train_utils.py, flux_train_network.py, train_network.py, library/flux_models.py.
But now I get:
File "E:\kohya_ss\sd-scripts\flux_train_network.py", line 207, in sample_images
accelerator, args, epoch, global_step, flux, ae, text_encoder, self.sample_prompts_te_outputs
AttributeError: 'FluxNetworkTrainer' object has no attribute 'sample_prompts_te_outputs'
I can confirm that I am getting the same AttributeError as @jpXerxes after cloning the latest sd3 branch.
Able to bypass the issue and begin training by adding --cache_text_encoder_outputs to the additional parameters!
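If it helps anyone locate the spot, in the json config this just gets appended to the additional_parameters string (field name as used by the test configs in this thread):

```json
{
  "additional_parameters": "--cache_text_encoder_outputs"
}
```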
That did it. I ran the test file with sample outputs every epoch (4 epochs) and the prompt: a painting of a steam punk skull with a gas mask, by darius kawasaki.
These are the 4 sample images:
How are you running the sd3_train.py script with kohya? I downloaded it but don't know what to do with it. I've always just used kohya normally, but I really want to try some flux training.
Likely you could wait a very short time until bmaltais catches up, but here's what I did (I'm no expert):
First, make sure you have the proper branch of bmaltais/kohya using git checkout sd3-flux.1.
Go to https://github.com/kohya-ss/sd-scripts/tree/sd3 and download the 4 files below, placing them in the appropriate folders: library/flux_train_utils.py, flux_train_network.py, train_network.py, library/flux_models.py.
Grab the second flux1_test.json posted above in this thread. Edit it to change all hard-coded paths to your own structure. Near the top is the line "additional_parameters":, and add --cache_text_encoder_outputs to it (the whole procedure is sketched below).
In the GUI, add a choice for sample output frequency, and add a prompt.
If anybody spots something wrong with this, please correct me!
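For reference, here is the same procedure as a rough shell sketch. It assumes the four files sit at the listed paths in the sd3 branch and that sd-scripts lives in a sd-scripts\ subfolder of the kohya_ss checkout; verify the paths before trusting it:

```
git checkout sd3-flux.1
cd sd-scripts
:: fetch the four updated files straight from the sd3 branch of kohya-ss/sd-scripts
curl -L -o library/flux_train_utils.py https://raw.githubusercontent.com/kohya-ss/sd-scripts/sd3/library/flux_train_utils.py
curl -L -o flux_train_network.py https://raw.githubusercontent.com/kohya-ss/sd-scripts/sd3/flux_train_network.py
curl -L -o train_network.py https://raw.githubusercontent.com/kohya-ss/sd-scripts/sd3/train_network.py
curl -L -o library/flux_models.py https://raw.githubusercontent.com/kohya-ss/sd-scripts/sd3/library/flux_models.py
```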
A general problem with my test run is that Flux already knows how to deal with that prompt, so I get good images without the LoRA. I will have to find something Flux knows nothing about to test properly.
You can also remove the sd-scripts directory and replace it with the latest version of the sd3 branch.
With a 24GB card, I run out of VRAM after about 30 or so training steps.
Hmm. Same 24GB here (4090) and it ran fine through 320 steps. ai-toolkit talks about using lowvram when you "only" have 24GB and some is being used for the display, but like I said, it ran fine here.
Haha, then me with 16GB VRAM (4060), good luck running it.
Try changing the additional parameters from --highvram to --lowvram. I don't know if it will work, but it can't hurt to try.
That config flux1_test.json doesn't load anything for me within the Kohya GUI under the LoRA tab.
I've edited the file so that it has all local paths set up.
But what are you supposed to actually do with the file?
No luck with the lowvram option:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.00 GiB. GPU 0 has a total capacty of 16.00 GiB of which 0 bytes is free. Of the allocated memory 41.91 GiB is allocated by PyTorch, and 3.96 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
steps: 0%| | 0/56 [09:19<?, ?it/s]
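The log's own hint about max_split_size_mb can be tried by setting the allocator env var before launching; the value below is just an example, not a tested setting:

```
:: Windows cmd, set before starting the GUI / training run
set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512
```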
I've attached the json that I use in the Kohya_ss GUI.
I realized that I'm actually using a modified version of the first "test" json - I'm getting so many different test files that I can't keep them straight! Here's what I have that actually ran (I still can't say for sure it worked, as the samples probably ignored the LoRA): flux1_test_jpXerxes.json
Ah, I figured out the issue. I needed to change the resolution to 512,512 to align with the recommendation from SimpleTuner.
It's now running comfortably at 15.2GB VRAM and at a training speed of 1.3it/s, similar to SimpleTuner, so it should be viable (just barely) for 16GB cards.
Checking/unchecking "highvram" didn't notably change the VRAM used.
Also, there is a checkbox for "cache_text_encoder_outputs." You can use that instead of putting it in the extra parameters section.
That doesn't seem to work actually...
When generating samples, the default values seem to work well with euler_a, but they are 512x512.
Parameters which worked well for 1024x1024 were: --w 1024 --h 1024 --d 42 --l 4.0 --s 25
The important thing to note is that the guidance must be set to something roughly 2 or higher or the output will be garbage.
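Put together, a line in the sample-prompts file would look like this (per the sd-scripts sample syntax: --w/--h size, --d seed, --l guidance scale, --s steps; the prompt itself is just the earlier test prompt):

```
a painting of a steam punk skull with a gas mask, by darius kawasaki --w 1024 --h 1024 --d 42 --l 4.0 --s 25
```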
I can run that config, but it always OOMs. It does good cleanups between caching the models, but when it goes into the training epochs, off it goes eating VRAM and shared VRAM.
Change the resolution to 512,512 from 1024,1024.
Here?
I did it too, but nothing changed: without sampling, resolution at 512, and max bucket at 1024. It seems to be Chrome.
Wait, with 512 and max bucket 1024:
A present 😂 flux1_test_working.json
I left mine with highvram, but changed to 512 from 1024, and it went from 3.6s/it to 1.1it/s
@FrakerKill Your training batch size is 5. Lower it to 1 and it should work.
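Combining the two fixes that worked above (512 resolution and batch size 1), the relevant bits of the json for smaller cards would look roughly like this; field names per the test configs in this thread, so double-check against your own file:

```json
{
  "max_resolution": "512,512",
  "train_batch_size": 1
}
```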
I'm using a little over half the VRAM on a 3090 when training at 512x512; 1024x1024 OOMs.
Also, in general, using Euler instead of Euler_a as the sampler results in MUCH better samples.
I have fixed the issue where the cache_text_encoder_outputs option was not properly passed as a parameter. Finally training at 1.09s/it on my 3090.
You will basically need a quantized model, I think, for low VRAM.
Latest config that successfully trained for me: flux1_test-v2.json
I modified mine to make most of the changes you have, and it failed. Please try with samples; I had to add "additional_parameters": "--cache_text_encoder_outputs" to make it work with a sample prompt.
He fixed the issue with the checkbox before running his config. I imagine if you pull the latest version of the GUI, it should work.
My bad. I didn't see that anything had changed except incorporating the latest sd-scripts, so I failed to pull it.
Not sure what I did wrong, but the 2000-step LoRA has no effect whatsoever when used at a weight of 1.0 in Comfy.
Kohya has added preliminary support for Flux.1 LoRA to his SD3 branch. I have created a sd3-flux.1 branch and updated it to the latest sd-scripts sd3 branch code... No GUI integration yet... I will start adding the basic code needed to establish that the model is Flux as part of the GUI.