Closed failbetter77 closed 5 months ago
Hello, what does the loss look like during your training process? I have only trained for a few dozen epochs so far, but found that the model did not converge. 😕 😕
It starts at 0.1 (epoch 0) and drops to around 0.02 by epoch 100.
I have only trained for a dozen epochs so far. My loss has been oscillating around 0.02 but has not converged. I suspect that if I continue training for 100 epochs, the loss will still not converge and will keep oscillating around 0.02. 🤔 🤔
My graph looks the same as yours. How about your results? Can you share them?
Sorry, I set it up to save the model every 50 epochs, but I haven't reached 50 epochs yet. 😞😞 I will send you the results when I complete the training.
Thank you for your help.
I'm trying to train on 2×A5000 (2×24GB) with batch-size = 1 but still get CUDA out-of-memory errors. Could you share your setup (GPU, accelerate config, ...) for training from scratch?
I set up training with LoRA and got a loss curve over 30 epochs similar to yours.
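For what it's worth, the usual levers for fitting this on 24GB cards are mixed precision and gradient checkpointing. Below is a minimal sketch of an `accelerate` config for a 2-GPU machine with fp16 enabled; this is an assumption about a typical setup, not the original poster's actual config, and field names can vary slightly across `accelerate` versions:

```yaml
# Hypothetical accelerate config (e.g. ~/.cache/huggingface/accelerate/default_config.yaml)
compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU
mixed_precision: fp16   # halves activation memory vs fp32
num_machines: 1
num_processes: 2        # one process per GPU
use_cpu: false
```

Even with this, a full (non-LoRA) run may still OOM on 24GB, which matches the reports in this thread.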
I attempted training with 4×4090 GPUs (24GB each), but regardless of how I adjusted the parameters, I encountered an OOM (out-of-memory) error. Now I'm using an A100 GPU to train this model.
These results are really bad.
Oh, thanks for your reply. It's probably not possible to train this pipeline on a 24GB GPU.
I'm training from scratch.
The results are not good; I don't know where I went wrong.
train scheduler:

```json
{
  "_class_name": "PNDMScheduler",
  "_diffusers_version": "0.6.0",
  "beta_end": 0.012,
  "beta_schedule": "scaled_linear",
  "beta_start": 0.00085,
  "num_train_timesteps": 1000,
  "set_alpha_to_one": false,
  "skip_prk_steps": true,
  "steps_offset": 1,
  "trained_betas": null,
  "clip_sample": false
}
```
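If you want to sanity-check the scheduler itself, the `scaled_linear` schedule interpolates linearly in sqrt-beta space and then squares. A minimal sketch reproducing the betas from the config values above (pure Python, no diffusers needed; names like `betas` and `alphas_cumprod` are just illustrative):

```python
import math

# Values taken from the scheduler config above
beta_start = 0.00085
beta_end = 0.012
num_train_timesteps = 1000

# "scaled_linear": linspace in sqrt(beta) space, then square each value
step = (math.sqrt(beta_end) - math.sqrt(beta_start)) / (num_train_timesteps - 1)
betas = [
    (math.sqrt(beta_start) + i * step) ** 2
    for i in range(num_train_timesteps)
]

# Cumulative product of (1 - beta): this controls how much noise is
# injected at each training timestep
alphas_cumprod = []
prod = 1.0
for b in betas:
    prod *= (1.0 - b)
    alphas_cumprod.append(prod)
```

If the schedule is set up correctly, `betas` should run from exactly 0.00085 up to 0.012, and `alphas_cumprod` should decrease monotonically toward (but not reach) zero.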
clip-vit-large-patch14, preprocessor_config.json:

```json
{
  "crop_size": 224,
  "do_center_crop": true,
  "do_normalize": true,
  "do_resize": true,
  "feature_extractor_type": "CLIPFeatureExtractor",
  "image_mean": [0.48145466, 0.4578275, 0.40821073],
  "image_std": [0.26862954, 0.26130258, 0.27577711],
  "resample": 3,
  "size": 224
}
```
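The normalization that config describes is simply per-channel `(pixel/255 - mean) / std`, applied after the 224 resize and center crop. A minimal sketch for a single RGB pixel (resize/crop omitted; `normalize_pixel` is a hypothetical helper, not part of the repo):

```python
# Per-channel mean/std taken from the preprocessor config above
IMAGE_MEAN = [0.48145466, 0.4578275, 0.40821073]
IMAGE_STD = [0.26862954, 0.26130258, 0.27577711]

def normalize_pixel(rgb):
    """Normalize one RGB pixel (0-255 ints) the way the CLIP
    feature extractor's do_normalize step does: scale to [0, 1],
    subtract the channel mean, divide by the channel std."""
    return [
        (c / 255.0 - m) / s
        for c, m, s in zip(rgb, IMAGE_MEAN, IMAGE_STD)
    ]
```

A mismatch here (e.g. feeding [0, 255] pixels straight in, or using the wrong mean/std) is a common reason conditioning looks broken even when the loss curve seems fine.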