kijai / ComfyUI-FluxTrainer

Apache License 2.0

InitFluxLoRATraining Error: not enough values to unpack (expected 3, got 2) #25

Open EuphoricPenguin opened 2 weeks ago

EuphoricPenguin commented 2 weeks ago

image

I see threads #21 and #22 have a similar issue, but none of the fixes have resolved the issue for me. I tried using an old version of ComfyUI, I tried it after I just updated it, I've used a few different file path formats, and I also moved the files into the ComfyUI directory under a datasets folder. I also have the latest version of the code as of a few hours ago on 8/28. Am I missing something here?

Some example file path formats I tried:

../datasets/shiho_so
../datasets/shiho_so/
C:\Users\Evan\Documents\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\datasets\shiho_so
./ComfyUI/datasets/shiho_so/

image

This is just one of several file path formats I've tried; this one usually works out fine.

image

Here is the problematic node itself and my current settings. I had the first two settings on their defaults and still got the same error.

PS: I would absolutely love to try this out as an option for LoRA training, as the results I got from the ai-toolkits script looked promising in the sample generations but didn't jibe with the versions of the models that work with the U-Net loader. I think it's either some weird sampler discrepancy or some model difference with Diffusers that's causing the issue over there; in any case, I think training in ComfyUI here makes the most sense if I can manage it.

kijai commented 2 weeks ago

It seems that with the portable build the root is actually the ComfyUI_windows_portable folder and not the ComfyUI folder. So if your dataset is located at C:\Users\Evan\Documents\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\datasets\shiho_so, the correct way to refer to it relatively is: ComfyUI\datasets\shiho_so

The full absolute path should also work though.
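One way to see which folder a relative path ends up pointing at is to resolve it by hand from the folder ComfyUI is launched from. This is only a minimal sketch: it assumes relative dataset paths are resolved against the process working directory (consistent with the behavior described above, but not confirmed from the node's code), and `describe_path` is a hypothetical helper, not part of ComfyUI-FluxTrainer.

```python
import os
from pathlib import Path
from typing import Optional


def describe_path(user_path: str, cwd: Optional[str] = None) -> dict:
    """Resolve user_path as a relative path would be resolved from cwd."""
    base = Path(cwd) if cwd is not None else Path.cwd()
    candidate = Path(user_path)
    # Absolute paths ignore the base; relative ones are joined onto it.
    full = candidate if candidate.is_absolute() else base / candidate
    return {
        "resolved": str(full),
        "is_dir": full.is_dir(),
        "length": len(str(full)),  # handy when a too-long path is suspected
    }


if __name__ == "__main__":
    # Run this from the folder you start ComfyUI from.
    print(describe_path(os.path.join("ComfyUI", "datasets", "shiho_so")))
```

If `is_dir` comes back false for the relative form but true for the absolute one, the working directory is probably not the folder you expected.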

Lecho303 commented 2 weeks ago

I've got the same error. Have you ever solved this problem? My training data is in C:\user\desktop. I've tried changing the gradient type and save type from "bf16" to "fp16", but I still get this error.

kijai commented 2 weeks ago

> I've got the same error. Have you ever solved this problem? My training data is in C:\user\desktop. I've tried changing the gradient type and save type from "bf16" to "fp16", but I still get this error.

It should work if you use the full absolute path, though consider placing the dataset somewhere else too.

Actioninsight commented 2 weeks ago

Same issue. What are the requirements/expectations for a dataset? Does it need to conform to anything in particular, e.g. what is the minimum requirement? Will ten 512x512 PNGs work, or do we need captions etc.? Not looking for training advice, just the requirements/model of a legal dataset (not just its path). Thanks for your work!

EuphoricPenguin commented 2 weeks ago

> It seems that with the portable build the root is actually the ComfyUI_windows_portable folder and not the ComfyUI folder. So if your dataset is located at C:\Users\Evan\Documents\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\datasets\shiho_so, the correct way to refer to it relatively is: ComfyUI\datasets\shiho_so
>
> The full absolute path should also work though.

The full path yielded the same error.

kijai commented 2 weeks ago

> > It seems that with the portable build the root is actually the ComfyUI_windows_portable folder and not the ComfyUI folder. So if your dataset is located at C:\Users\Evan\Documents\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\datasets\shiho_so, the correct way to refer to it relatively is: ComfyUI\datasets\shiho_so
> >
> > The full absolute path should also work though.
>
> The full path yielded the same error.

It might also just be that the path is too long; I'd try something simpler to be sure.

kijai commented 2 weeks ago

> Same issue. What are the requirements/expectations for a dataset? Does it need to conform to anything in particular, e.g. what is the minimum requirement? Will ten 512x512 PNGs work, or do we need captions etc.? Not looking for training advice, just the requirements/model of a legal dataset (not just its path). Thanks for your work!

It should work even with just a single image. Captions are optional, but when used they need to have the same name as the image, as a .txt file. Mostly this issue is just about pointing to the dataset folder properly.
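The naming rule for captions can be checked with a few lines of Python. This is a rough sketch rather than the trainer's own logic: `pair_images_with_captions` is a hypothetical helper, and the list of image extensions is an assumption that may not match what the training script accepts.

```python
from pathlib import Path

# Assumed set of image extensions; the trainer's actual list may differ.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def pair_images_with_captions(dataset_dir: str) -> dict:
    """Map each image file name to its caption file name, or None if absent."""
    pairs = {}
    for image in sorted(Path(dataset_dir).iterdir()):
        if not image.is_file() or image.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        # A caption shares the image's base name: IMG_0464.jpg -> IMG_0464.txt
        caption = image.with_suffix(".txt")
        pairs[image.name] = caption.name if caption.is_file() else None
    return pairs
```

An empty result means no images were found in the folder at all, which points back at the path rather than at the captions.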

francis-j commented 2 weeks ago

I get exactly the same error and have tried different paths and even the absolute path. With images like C:\Apps\ComfyUI_windows_portable\ComfyUI\datasets\face\IMG_0464.jpg, I've put the path as: ComfyUI\datasets\face and C:\Apps\ComfyUI_windows_portable\ComfyUI\datasets\face.

Neither works.

kijai commented 2 weeks ago

> I get exactly the same error and have tried different paths and even the absolute path. With images like C:\Apps\ComfyUI_windows_portable\ComfyUI\datasets\face\IMG_0464.jpg, I've put the path as: ComfyUI\datasets\face and C:\Apps\ComfyUI_windows_portable\ComfyUI\datasets\face.
>
> Neither works.

Well, I literally tried this exact thing with portable and it worked, so I have no idea. Have you tried loading the latest example workflow? It's also possible there's a mismatch between the node version and the workflow version if the workflow is older.

kijai commented 2 weeks ago

Looking a bit closer, the kohya script doesn't actually stop processing even if the designated path is not a directory; it just gives a warning in the log. Similarly, it gives info in the log if the images are found. I have now changed this to simply report an error (instead of the more cryptic unpack error you see here) if the issue is just in setting the directory:

image
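The fail-fast idea described above can be sketched as follows. This is not the actual change made in the repository; `require_dataset_dir` is a hypothetical name, and the point is only that checking the directory up front gives a readable error instead of an unrelated failure later on.

```python
import os


def require_dataset_dir(path: str) -> str:
    """Return path unchanged if it is an existing directory, else raise."""
    if not os.path.isdir(path):
        # Including the working directory helps when a relative path was given.
        raise NotADirectoryError(
            f"Dataset path is not a directory: {path!r} "
            f"(current working directory: {os.getcwd()!r})"
        )
    return path
```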

kijai commented 2 weeks ago

This also works fine for me: image

To be sure, you have to input a valid directory for each of the datasets you add, even if it's the same folder for multires training. If you do not wish to use one of them, you need to delete/bypass the node.

Actioninsight commented 2 weeks ago

image

image

This is the new error with the new train_utils.py.

kijai commented 2 weeks ago

> image image
>
> This is the new error with the new train_utils.py.

That would kind of suggest there's something wrong with the images themselves. There should be more log entries about it before the error though, such as how many images/captions are in the folder.

Actioninsight commented 2 weeks ago

I'll try with different ones and report back



EuphoricPenguin commented 2 weeks ago

> This also works fine for me: image
>
> To be sure, you have to input a valid directory for each of the datasets you add, even if it's the same folder for multires training. If you do not wish to use one of them, you need to delete/bypass the node.

image image image

An absolute path to an existing directory still yields an error.

kijai commented 2 weeks ago

Can you please try with a shorter path? The error really is just saying there is no directory at that path; there's not much else to be done, as that's just core Python stuff reporting it. It could also be a permission issue, as it's in the Users folder.

kijai commented 2 weeks ago

I have, however, identified a bug in my nodes: the TrainDatasetAdd node never resets anything, so if you change settings after running them once, it just keeps adding new ones. That could account for some of these errors. For now, the workaround is just restarting Comfy if datasets are modified after running them once.

Edit: I think I fixed this one
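The kind of bug described above, state that survives between runs and keeps growing, can be illustrated with a small sketch. `DatasetRegistry` and its methods are hypothetical stand-ins and do not reflect how the TrainDatasetAdd node is actually written.

```python
class DatasetRegistry:
    """Hypothetical stand-in for state that outlives a single workflow run."""

    def __init__(self):
        self.datasets = []

    def add_without_reset(self, run_datasets):
        # Buggy pattern: entries from earlier runs are never cleared.
        self.datasets.extend(run_datasets)
        return list(self.datasets)

    def add_with_reset(self, run_datasets):
        # Fixed pattern: start from an empty list on every run.
        self.datasets = list(run_datasets)
        return list(self.datasets)


registry = DatasetRegistry()
registry.add_without_reset(["C:/bad/path"])               # first run, wrong folder
stale = registry.add_without_reset(["C:/datasets/face"])  # second run, corrected
print(stale)  # the wrong folder from the first run is still in the list
```

With the accumulating version, a path that was fixed in the UI can still fail because the earlier, invalid entry is processed alongside it, which would match the "restart fixes it" reports in this thread.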

Actioninsight commented 2 weeks ago

Thank you! That restart one did it for me. I'd already restarted 10 times, but I was also changing directories constantly. It must have been the case that some of the directories had fine syntax but were not tried right after a restart, and I was on bad directory options after the restarts by coincidence. What worked for me was: 1) use the correct absolute path, 2) restart, 3) queue the prompt right on startup = works (or at least advances past that issue). Thank you for your time, Kijai.

kijai commented 2 weeks ago

> Thank you! That restart one did it for me. I'd already restarted 10 times, but I was also changing directories constantly. It must have been the case that some of the directories had fine syntax but were not tried right after a restart, and I was on bad directory options after the restarts by coincidence. What worked for me was: 1) use the correct absolute path, 2) restart, 3) queue the prompt right on startup = works (or at least advances past that issue). Thank you for your time, Kijai.

Yeah, this makes sense to me now. I wonder why I never ran into it myself; it should be very noticeable in the epoch count etc. if you accidentally keep old datasets when adding new ones. This should now be fixed though. The only remaining issue is that removing a dataset node doesn't actually remove the dataset; I don't know what to do about that.

EuphoricPenguin commented 2 weeks ago

> Can you please try with a shorter path? The error really is just saying there is no directory at that path; there's not much else to be done, as that's just core Python stuff reporting it. It could also be a permission issue, as it's in the Users folder.

image

So I tried a shorter path, and still no dice. I did notice, however, that I only got the path error when I changed the location of the directory. The rest of the time, I get this error.

kijai commented 2 weeks ago

> > Can you please try with a shorter path? The error really is just saying there is no directory at that path; there's not much else to be done, as that's just core Python stuff reporting it. It could also be a permission issue, as it's in the Users folder.
>
> image
>
> So I tried a shorter path, and still no dice. I did notice, however, that I only got the path error when I changed the location of the directory. The rest of the time, I get this error.

Okay, the meta tensor error is because you have the diffusers version of the VAE in the loader. That won't work with kohya; it has to be the original ae.safetensors file: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/ae.safetensors

Also, I don't think Schnell training is properly supported in kohya, though I have not tried it myself.
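For the VAE mix-up mentioned above, one rough way to tell the two file layouts apart is to read the tensor names from the .safetensors header, which is an 8-byte little-endian length followed by a JSON object. The sketch below uses only the standard library; both helper names are hypothetical, and the naming heuristic (diffusers-style files using `down_blocks`/`up_blocks`) is an assumption to verify against your own files, not something stated in this thread.

```python
import json
import struct


def read_safetensors_keys(path: str) -> list:
    """Return tensor names from a .safetensors header without loading weights."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64
        header = json.loads(f.read(header_len))
    return [key for key in header if key != "__metadata__"]


def looks_like_diffusers_vae(keys) -> bool:
    # Assumption: diffusers-format VAEs use names containing "down_blocks" or
    # "up_blocks", while the original layout uses "down." and "up." instead.
    return any("down_blocks" in k or "up_blocks" in k for k in keys)
```

If `looks_like_diffusers_vae(read_safetensors_keys(path))` is true for the file selected in the loader, swapping it for the original ae.safetensors linked above would be the first thing to try.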