ostris / ai-toolkit

Various AI scripts. Mostly Stable Diffusion stuff.
MIT License

How to use locally downloaded models? #121

Open dzy1128 opened 2 months ago

dzy1128 commented 2 months ago

Running 1 process
Loading Flux model
Loading transformer
Error running job: We couldn't connect to 'https://huggingface.co' to load this model, couldn't find it in the cached files and it looks like /data/sdweb/ComfyUI/models/unet/transformer is not the path to a directory containing a config.json file. Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/diffusers/installation#offline-mode'.

======================================== Result:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/sdweb/ai-toolkit/run.py", line 90, in <module>
    main()
  File "/data/sdweb/ai-toolkit/run.py", line 86, in main
    raise e
  File "/data/sdweb/ai-toolkit/run.py", line 78, in main
    job.run()
  File "/data/sdweb/ai-toolkit/jobs/ExtensionJob.py", line 22, in run
    process.run()
  File "/data/sdweb/ai-toolkit/jobs/process/BaseSDTrainProcess.py", line 1230, in run
    self.sd.load_model()
  File "/data/sdweb/ai-toolkit/toolkit/stable_diffusion_model.py", line 487, in load_model
    transformer = FluxTransformer2DModel.from_pretrained(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/sdweb/ai-toolkit/venv/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/data/sdweb/ai-toolkit/venv/lib/python3.11/site-packages/diffusers/models/modeling_utils.py", line 612, in from_pretrained
    config, unused_kwargs, commit_hash = cls.load_config(
                                         ^^^^^^^^^^^^^^^^
  File "/data/sdweb/ai-toolkit/venv/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/data/sdweb/ai-toolkit/venv/lib/python3.11/site-packages/diffusers/configuration_utils.py", line 415, in load_config
    raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this model, couldn't find it in the cached files and it looks like /data/sdweb/ComfyUI/models/unet/transformer is not the path to a directory containing a config.json file. Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/diffusers/installation#offline-mode'.

My config:

# huggingface model name or path
name_or_path: "/data/sdweb/ComfyUI/models/unet"  # Absolute path of the local model

The unet folder contains the files I downloaded.

martintomov commented 2 months ago

Check #84

dzy1128 commented 2 months ago

Thank you very much for your answer, but I wrote an absolute path in name_or_path and it still gives me an error. It appends /transformer to the path.

KW-0330 commented 2 months ago

I had a similar problem a few hours ago. Since it fetches the model from Hugging Face, I needed a token from HF with every permission granted except billing. I also had to put the key in the root folder.

dzy1128 commented 2 months ago

I can get the models from HF successfully, but the download speed is too slow.

KW-0330 commented 2 months ago

Same here. It took me a few hours; luckily it is a one-time download.

martintomov commented 2 months ago

The absolute path has to point at a clone of the whole FLUX repository, not just the model files, so you need to git clone https://huggingface.co/black-forest-labs/FLUX.1-dev

tcla75 commented 2 months ago

Did you try git clone https://huggingface.co/black-forest-labs/FLUX.1-dev before posting that answer here? Because that doesn't work for me. It asks for my Hugging Face username and password, and then gives the following error:

Password authentication in git is no longer supported. You must use a user access token or an SSH key instead. See https://huggingface.co/blog/password-git-deprecation
fatal: Authentication failed for 'https://huggingface.co/black-forest-labs/FLUX.1-dev/'

I followed the guide at https://huggingface.co/blog/password-git-deprecation and it is now downloading for me.

martintomov commented 2 months ago

Of course I tried it; the error message is self-explanatory, really. You need to enter your Hugging Face username and your access token (not your password). A READ access token can be generated at https://huggingface.co/settings/tokens

Virtike commented 2 months ago

I'm having similar issues. I cannot get ai-toolkit to download the Flux repository: it downloads about 400 MB of pytorch_model-00001-of-00003.safetensors, then the download speed drops to 0 kb/s and it hangs or fails.

Loading Flux model
Loading transformer
(…)pytorch_model-00001-of-00003.safetensors:   0%|                                         | 0.00/9.98G [00:00<?, ?B/s]

So, I tried cloning the entire black-forest-labs/FLUX.1-dev repository, but now run into a different problem:

Running  1 process
Loading Flux model
Loading transformer
An error occurred while trying to fetch C:\Apps\ai-toolkit\FLUX\transformer: Error no file named diffusion_pytorch_model.safetensors found in directory C:\Apps\ai-toolkit\FLUX\transformer.
Error running job: Error no file named diffusion_pytorch_model.bin found in directory C:\Apps\ai-toolkit\FLUX\transformer.

As per https://huggingface.co/black-forest-labs/FLUX.1-dev/tree/main/transformer there is no "diffusion_pytorch_model.bin" or "diffusion_pytorch_model.safetensors" - instead there is diffusion_pytorch_model split across 3x safetensors.
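For anyone checking a local clone against that file listing, here is a minimal sketch (not part of ai-toolkit or diffusers) that reports which files are missing from the transformer subfolder. It assumes the sharded checkpoint ships with an index file named diffusion_pytorch_model.safetensors.index.json whose "weight_map" lists the shard filenames; the function name missing_transformer_files is hypothetical.

```python
import json
from pathlib import Path

# Assumed default weights filename, taken from the error message above.
WEIGHTS_NAME = "diffusion_pytorch_model.safetensors"


def missing_transformer_files(repo_dir):
    """Return the names of files missing from <repo_dir>/transformer."""
    tdir = Path(repo_dir) / "transformer"
    missing = []
    if not (tdir / "config.json").is_file():
        missing.append("config.json")

    # A single-file checkpoint needs nothing else.
    if (tdir / WEIGHTS_NAME).is_file():
        return missing

    # Otherwise expect an index file that maps parameters to shard files.
    index = tdir / (WEIGHTS_NAME + ".index.json")
    if not index.is_file():
        missing.append(index.name)
        return missing

    weight_map = json.loads(index.read_text())["weight_map"]
    for shard in sorted(set(weight_map.values())):
        if not (tdir / shard).is_file():
            missing.append(shard)
    return missing


if __name__ == "__main__":
    # Path taken from the log above; adjust to your own clone.
    print(missing_transformer_files(r"C:\Apps\ai-toolkit\FLUX"))
```

An empty list only means the expected files exist. It does not prove they were downloaded completely, and an incomplete clone is only one possible cause of the error above.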


tcla75 commented 2 months ago

So after you downloaded it, what exactly did you put in name_or_path in the YAML file? As someone here has already said, there is no single file to point the absolute path at, so did you put in C:\SD\training\ai-toolkit\New folder\FLUX.1-dev or C:\SD\training\ai-toolkit\New folder\FLUX.1-dev\transformer, or did you copy the path of one of the files?

martintomov commented 2 months ago

The path shouldn't lead to /transformer but to the WHOLE repository.

Here is one of my configs that uses a local model path; I hope it answers your questions. (On Windows, escape backslashes with another backslash.)

{
job: extension
config:
  name: rayban-meta-4000-v2
  process:
  - type: sd_trainer
    training_folder: output
    performance_log_every: 1000
    device: cuda:0
    trigger_word: Ray-Ban Meta Smart Glasses
    network:
      type: lora
      linear: 32
      linear_alpha: 32
    save:
      dtype: float16
      save_every: 250
      max_step_saves_to_keep: 4
    datasets:
    - folder_path: C:\...\dataset
      caption_ext: txt
      caption_dropout_rate: 0.05
      shuffle_tokens: false
      cache_latents_to_disk: true
      resolution:
      - 512
      - 768
      - 1024
    train:
      batch_size: 1
      steps: 4000
      gradient_accumulation_steps: 1
      train_unet: true
      train_text_encoder: false
      gradient_checkpointing: true
      noise_scheduler: flowmatch
      optimizer: adamw8bit
      lr: 0.0002
      linear_timesteps: true
      ema_config:
        use_ema: true
        ema_decay: 0.99
      dtype: bf16
    model:
      name_or_path: C:\...\ai-toolkit\FLUX.1-dev
      is_flux: true
      quantize: true
    sample:
      sampler: flowmatch
      sample_every: 250
      width: 1024
      height: 1024
      prompts:
      - Ray-Ban Meta Smart Glasses, close-up photograph, woman looking up, long blonde
        hair, wearing black-framed glasses, light pink lips, wearing gold earrings,
        navy blue background
      - a woman holding a coffee cup, in a beanie, wearing Ray-Ban Meta Smart Glasses,
        sitting at a cafe
      - a a DJ at a night club, fish eye lens, wearing Ray-Ban Meta Smart Glasses,
        smoke machine, lazer lights, holding a martini
      - a man showing off his cool new t shirt at the beach, wearing Ray-Ban Meta
        Smart Glasses, a shark is jumping out of the water in the background
      - a man wearing Ray-Ban Meta Smart Glasses, in snow covered mountains
      - close up of a man, wearing Ray-Ban Meta Smart Glasses, in a coffee shop, holding
        a cup, smiling
      - Ray-Ban Meta Smart Glasses, close-up photograph, wearing black-framed yellow-tinted
        sunglasses, short brown hair, slight smile, clear blue sky with clouds in
        the background.
      - photo of a man, wearing Ray-Ban Meta Smart Glasses, white background, medium
        shot, modeling clothing, studio lighting, white backdrop
      - a man holding a sign that says, 'Do these glasses fit me?', wearing Ray-Ban
        Meta Smart Glasses
      - Ray-Ban Meta Smart Glasses, close-up of a man adjusting glasses, photograph,
        partial side angle, wearing a green knit beanie and green jacket with visible
        stitching and button, short beard and mustache, silver earring in left ear,
        background features out-of-focus light colors
      neg: ''
      seed: 42
      walk_seed: true
      guidance_scale: 4
      sample_steps: 20
meta:
  name: rayban-meta-4000-v2
  version: '1.0'
}

tcla75 commented 2 months ago

Thanks, that worked for me. It's finally training now.

Virtike commented 2 months ago

the path shouldn't lead to /transformer but to the WHOLE repository.

Yep, I used the base path of the repository, not just transformer:

model:
        # huggingface model name or path
        # name_or_path: "black-forest-labs/FLUX.1-dev"
        name_or_path: "C:\\Apps\\ai-toolkit\\FLUX"
        is_flux: true
        quantize: true  # run 8bit mixed precision

So it finds the /transformer path, but not the expected files.

Virtike commented 2 months ago

UPDATE: Got training with locally cloned FLUX.1-dev to work. Had to re-clone the repository, not sure why it didn't work the first time. Perhaps something got corrupted/missed.

Instructions:

  1. Generate a READ access token from https://huggingface.co/settings/tokens/new
  2. From ai-toolkit directory, clone the repository: git clone https://[hf_username]:[token]@huggingface.co/black-forest-labs/FLUX.1-dev (replace [hf_username]:[token] with username and the generated token). This will clone the whole repo into ai-toolkit/FLUX.1-dev
  3. In the training config, under model replace: name_or_path: "black-forest-labs/FLUX.1-dev" with name_or_path: "C:\\...\\ai-toolkit\\FLUX.1-dev" (adjust path to suit)
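On the doubled backslashes in step 3: in a double-quoted YAML string the backslash is an escape character, so it has to be doubled, while plain and single-quoted strings take backslashes literally. Here is a minimal sketch of both forms. The helper names are hypothetical, and whether ai-toolkit accepts forward-slash paths on Windows is an assumption I have not verified.

```python
from pathlib import PureWindowsPath


def yaml_double_quoted(path: str) -> str:
    """Render a path as a double-quoted YAML scalar (backslashes doubled)."""
    escaped = path.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def yaml_single_quoted(path: str) -> str:
    """Render a path as a single-quoted YAML scalar (backslashes kept as-is)."""
    return "'" + path.replace("'", "''") + "'"


def forward_slashes(path: str) -> str:
    """Sidestep escaping by using forward slashes instead."""
    return PureWindowsPath(path).as_posix()


if __name__ == "__main__":
    local = r"C:\Apps\ai-toolkit\FLUX"  # example path from this thread
    print("name_or_path:", yaml_double_quoted(local))
    print("name_or_path:", yaml_single_quoted(local))
    print("name_or_path:", yaml_double_quoted(forward_slashes(local)))
```

Any one of the three printed lines can be pasted under model: in the training config.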

I found that by git cloning, the download speed is also much faster than letting ai-toolkit cache/download the files itself.

DuroCuri commented 2 months ago

I had the same problem because I downloaded the Flux model files separately with Chrome. You should fetch all of the Flux files with Git LFS; this is quicker than downloading the individual safetensors files. Don't download them separately! Use Git!

Wonderflex commented 2 months ago

Related question - when I just run my config yaml using the default name_or_path: "black-forest-labs/FLUX.1-dev" - where is this downloaded to? I'd like to delete whatever is already downloaded, so I can also use a local path pulled down instead.

heinrichI commented 2 months ago

I downloaded it manually from Hugging Face and set the path to that directory.

Wonderflex commented 2 months ago

That is the route I plan on taking too. I've already trained a few models with the default method, though, and I can't find where it stored the model so I can delete it. It doesn't appear to be in the ai-toolkit folder.
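On where the default download goes: when name_or_path is a repo id, the files normally land in the Hugging Face hub cache rather than in the ai-toolkit folder. Here is a minimal sketch of where to look, assuming ai-toolkit does not override the cache location. The function names are hypothetical; HF_HUB_CACHE, HF_HOME and XDG_CACHE_HOME are the environment variables huggingface_hub consults.

```python
import os
from pathlib import Path


def hub_cache_dir(env=None):
    """Best guess at the hub cache directory from the environment."""
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"])
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]) / "hub"
    base = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(base) / "huggingface" / "hub"


def cached_repo_dir(repo_id, env=None):
    """Cache folder for a model repo, e.g. models--org--name."""
    return hub_cache_dir(env) / ("models--" + repo_id.replace("/", "--"))


if __name__ == "__main__":
    print(cached_repo_dir("black-forest-labs/FLUX.1-dev"))
```

Deleting the printed folder should free the space. Running huggingface-cli scan-cache lists what is cached, if you would rather check first.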

priya-ml-halohues commented 2 months ago

I downloaded the model manually and got this error:

safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge
Error running job: Error while deserializing header: HeaderTooLarge
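HeaderTooLarge usually means the file is not a real safetensors file. A common cause, though not the only one, is a Git LFS pointer stub or a saved HTML page left where the weights should be. A safetensors file starts with an 8-byte little-endian header length, so a quick check is possible without loading anything. A minimal sketch follows; the function name diagnose_safetensors is hypothetical.

```python
import struct
from pathlib import Path

# First line of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def diagnose_safetensors(path):
    """Classify a file as 'lfs-pointer', 'truncated', 'bad-header' or 'ok'."""
    p = Path(path)
    size = p.stat().st_size
    with p.open("rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))

    if head.startswith(LFS_POINTER_PREFIX):
        return "lfs-pointer"  # weights were never fetched; try `git lfs pull`
    if len(head) < 8:
        return "truncated"

    # The first 8 bytes give the length of the JSON header that follows.
    (header_len,) = struct.unpack("<Q", head[:8])
    if header_len > size - 8:
        return "bad-header"  # e.g. an HTML page or a partial download
    return "ok"


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, diagnose_safetensors(name))
```

A result of "ok" only means the header length is plausible; it does not validate the tensors themselves.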

jvachez commented 2 months ago

Same problem.

How can I use it without downloading all the files? There's not enough space on Kaggle.

Other trainers only need 4 files: flux1-dev-fp8.safetensors, ae.sft, clip_l.safetensors, t5xxl_fp8_e4m3fn.safetensors

Kidzlle commented 1 month ago

Hi. I have a strange problem with HF token access that no one else seems to have had. Can anyone help, please?

  File "C:\Users....\miniconda3\envs\my-ai-toolkit\Lib\site-packages\diffusers\configuration_utils.py", line 406, in load_config
    raise EnvironmentError(
OSError: black-forest-labs/FLUX.1-dev does not appear to have a file named config.json.

dzy1128 commented 1 month ago

I downloaded everything from this repository.

Kidzlle commented 1 month ago

Thank you, I see. Somehow config.json is not in the repository anymore; only model_index.json is still there, so the code gives me an error. Should I make some changes to run.py to make it work?

Kidzlle commented 1 month ago

And, by the way, great tool. Thank you, dzy1128, for sharing it with the community.

dzy1128 commented 1 month ago

I don't know why this file is gone either. If you need it, send me your email address and I'll send it to you.

Kidzlle commented 1 month ago

Thank you. Can you please upload it for me on your Discord?

dzy1128 commented 1 month ago

Sorry, I don't use Discord.

Poseidon-fan commented 1 day ago

Thank you, I see. Somehow config.json is not in the repository anymore; only model_index.json is still there, so the code gives me an error. Should I make some changes to run.py to make it work?

The error message is inaccurate. The config.json file is actually under the transformer folder. I ran into the same problem as you; it is probably because you didn't clone the whole repository and only downloaded flux1-dev.safetensors.

You could run:

git clone https://huggingface.co/black-forest-labs/FLUX.1-dev
cd FLUX.1-dev
git lfs pull

and then set name_or_path in your config to the location of the cloned model. That may solve the problem.