giulio98 / functional-diffusion-processes

Official code for Continuous-Time Functional Diffusion Processes (NeurIPS 2023).
https://arxiv.org/abs/2303.00800
Apache License 2.0

can not find sample image #6

Closed ghost closed 4 months ago

ghost commented 6 months ago

Hi, I ran into a problem when running sh scripts/maml/sample_mnist.sh: I cannot find the sampled images. I opened the directory (inr_mnist/samples), but it was empty. However, when I run exp_ninist, it does generate images. Could you give me some instructions?

best, xinxin

giulio98 commented 5 months ago

Hello @cindyyyl! Are the sampled images displayed correctly on wandb?

ghost commented 5 months ago

Copy that, I saw them. But I think the key problem may be that I did not run the scripts successfully. I am stuck in the evaluation part. I tried changing the number of samples from 10001 to 10000 (I think it was a bug), and I am sure that both directories contain files ending in .npz. Could you give me any instructions?

[image]

giulio98 commented 4 months ago

Hello @cindyyyl,

Sorry for the late reply. To better assist you, I need a clearer description of your issue. Could you please fill out the following details:

  1. Steps to Reproduce:

    • Step 1:
    • Step 2:
    • Step 3:
  2. Expected Behavior:

    • What you expected to happen after completing the steps above.
  3. Actual Behavior:

    • What actually happened. Please include any error messages or screenshots if possible.
  4. Additional Information:

    • Any other details or context you think might be helpful.

Best regards,

Giulio

ghost commented 4 months ago

Hi Giulio,

Sorry for the late reply. The steps I followed are: [image: image.png] [image: image.png] [image: image.png] [image: image.png] [image: image.png]

I have already solved the bug in fid.compute(): it occurred because this function needs 3 channels, but the .npy file sampled during evaluation has only 1 channel. However, after I solved this problem, my FID is NaN and my FID CLIP is three times the value from your experiments.

I think email may not be an efficient way to communicate. Do you have time for a Zoom meeting? Besides that, your work is really wonderful! Thank you!

best, xinxin
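
For readers following the channel mismatch described in this comment, here is a minimal sketch of how single-channel arrays can be tiled to three channels before being handed to a feature extractor that expects RGB input. The function name `to_three_channels` and the `(N, H, W, 1)` layout are assumptions made for illustration; they are not taken from the repository, and this is not necessarily how the workaround above was implemented.

```python
import numpy as np


def to_three_channels(images: np.ndarray) -> np.ndarray:
    """Tile grayscale images to 3 channels.

    Accepts (N, H, W) or (N, H, W, 1) and returns (N, H, W, 3).
    Arrays that already have 3 channels are returned unchanged.
    The channels-last layout is an assumption for this sketch.
    """
    if images.ndim == 3:
        # Add the missing channel axis: (N, H, W) -> (N, H, W, 1).
        images = images[..., np.newaxis]
    if images.ndim != 4:
        raise ValueError(f"expected a 3D or 4D array, got shape {images.shape}")
    if images.shape[-1] == 3:
        return images
    if images.shape[-1] != 1:
        raise ValueError(f"expected 1 or 3 channels, got {images.shape[-1]}")
    # Copy the single grayscale channel into R, G and B.
    return np.repeat(images, 3, axis=-1)
```

Tiling only makes the shapes compatible; whether features from an RGB-trained extractor are meaningful for grayscale digits is a separate question.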

giulio98 commented 4 months ago

Hello @cindyyyl,

It appears there's some confusion regarding the FID calculations for MNIST. The FID Clean and FID Clip scores cannot be computed with the current setup as the feature extractor networks require images with 3 channels. However, for MNIST, we've utilized a pretrained LeNet5, using its penultimate layer to extract features for the standard FID computation.

You can verify in the configuration file that LeNet is specified as the feature extractor for MNIST at this link: https://github.com/giulio98/functional-diffusion-processes/blob/cb13fb08b8f1a5449d627c6c9dd2b964e52852dd/conf/metrics/metrics_mnist.yaml#L8.

Additionally, we have provided the pretrained LeNet5 model weights in models/lenet5 for reproducibility.

Therefore, it is expected that you won’t be able to obtain the FID CLIP and FID Clean scores; it’s not a bug. Could you confirm whether you were able to calculate the standard FID score? If so, I'll open a pull request to clarify that FID CLIP and FID Clean scores should not be computed when dealing with single-channel images, to avoid further confusion.

Best regards, Giulio
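
As background for the standard FID mentioned above, the following is a minimal NumPy sketch of the Fréchet distance between two sets of feature vectors. It is not the repository's implementation: the function name `frechet_distance` is hypothetical, and the input arrays simply stand in for features such as those taken from a LeNet5 penultimate layer. The sketch assumes at least two feature dimensions and finite inputs; any NaN or Inf in the features would propagate into the result.

```python
import numpy as np


def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Frechet distance between Gaussians fitted to two feature sets.

    Both inputs have shape (num_samples, feature_dim) with feature_dim >= 2.
    d^2 = ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^{1/2})
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((C_a C_b)^{1/2}) is the sum of the square roots of the eigenvalues
    # of C_a @ C_b. For covariance matrices these eigenvalues are real and
    # non-negative up to numerical error, so clip small negative values.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)
```

Comparing a feature set with itself should give a value close to zero, which is a quick sanity check before comparing generated samples with the real dataset statistics.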

ghost commented 4 months ago

Hi Giulio,

Thank you so much for your detailed instructions. They really help me a lot! However, I only want the FID score and am not much interested in FID Clean and FID CLIP. For the FID score calculation, I did not do any other operations (I did not change the channels of the .npz files); I just followed the instructions on GitHub. However, the FID score is NaN. Thank you again!

best, xinxin
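
One way to narrow down a NaN score like the one reported above is to check whether the saved .npz files themselves already contain non-finite values. The sketch below is an illustration only: the helper name `summarize_npz` is hypothetical, and the array keys stored in the repository's .npz files are not known here, so the function simply reports on every array it finds.

```python
import numpy as np


def summarize_npz(source) -> dict:
    """Report shape, dtype and non-finite count for each array in an .npz file.

    `source` is a path or a seekable file-like object accepted by np.load.
    """
    report = {}
    with np.load(source) as archive:
        for name in archive.files:
            arr = archive[name]
            # np.isfinite only applies to numeric data; skip other dtypes.
            if np.issubdtype(arr.dtype, np.number):
                non_finite = int((~np.isfinite(arr)).sum())
            else:
                non_finite = None
            report[name] = {
                "shape": arr.shape,
                "dtype": str(arr.dtype),
                "non_finite": non_finite,
            }
    return report
```

If the sample arrays are all finite, the NaN is more likely to arise later, in the feature extraction or in the statistics computed from the features.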

[image: image.png]

giulio98 commented 4 months ago

Hi,

Can you post here a screenshot of a batch of the sampled images for MNIST?

ghost commented 4 months ago

Hi Giulio,

How about this: I have already zipped the project (including the images I sampled during the evaluation process). Shall I push it to your repo?

best, xinxin

ghost commented 4 months ago

[image: image.png][image: image.png] [image: image.png]

giulio98 commented 4 months ago

Hey, sorry, I can't see your screenshot; I just see [image.png]. Can you try again?

ghost commented 4 months ago

Hi Giulio,

Could we talk about it soon, or is there any other information you need? The deadline for my master's thesis is coming up, and my experiments have not produced any results so far \cry\cry.

best, xinxin

giulio98 commented 4 months ago

Hello, I ran the script and was able to get the FID score for MNIST. This is my full log:

(fdp) corallo@atlas1:~/PycharmProjects/functional-diffusion-processes$ sh scripts/maml/eval_mnist.sh
2024-05-05 15:39:17.027317: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[2024-05-05 15:39:19,697][HYDRA] Launching 1 jobs locally
[2024-05-05 15:39:19,697][HYDRA]        #0 : +experiments_maml=eval_mnist
[2024-05-05 15:39:19,880][__main__][INFO] - Instantiating <functional_diffusion_processes.datasets.mnist_dataset.MNISTDataset>
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
  warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
[2024-05-05 15:39:20,082][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,082][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,087][absl][INFO] - Load dataset info from /home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
[2024-05-05 15:39:20,089][__main__][INFO] - Instantiating <functional_diffusion_processes.datasets.mnist_dataset.MNISTDataset>
[2024-05-05 15:39:20,092][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,092][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,092][absl][INFO] - Load dataset info from /home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
[2024-05-05 15:39:20,093][__main__][INFO] - Instantiating <functional_diffusion_processes.models.mlp_modulation.MLPModulationLR>
[2024-05-05 15:39:20,127][__main__][INFO] - Instantiating <functional_diffusion_processes.sdetools.heat_subvp_sde.HeatSubVPSDE>
[2024-05-05 15:39:20,209][absl][INFO] - Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker: 
[2024-05-05 15:39:20,776][absl][INFO] - Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". Available platform names are: Interpreter Host CUDA
[2024-05-05 15:39:20,777][absl][INFO] - Unable to initialize backend 'tpu': module 'jaxlib.xla_extension' has no attribute 'get_tpu_client'
[2024-05-05 15:39:20,777][absl][INFO] - Unable to initialize backend 'plugin': xla_extension has no attributes named get_plugin_device_client. Compile TensorFlow with //tensorflow/compiler/xla/python:enable_plugin_device set to true (defaults to false) to enable this.
[2024-05-05 15:39:21,560][__main__][INFO] - Instantiating <functional_diffusion_processes.samplers.correctors.langevin_corrector.LangevinCorrector>
[2024-05-05 15:39:21,564][__main__][INFO] - Instantiating <functional_diffusion_processes.samplers.predictors.euler_predictor.EulerMaruyamaPredictor>
[2024-05-05 15:39:21,568][__main__][INFO] - Instantiating <functional_diffusion_processes.samplers.pc_sampler.PCSampler>
[2024-05-05 15:39:21,571][__main__][INFO] - Instantiating <functional_diffusion_processes.losses.mse_loss.MSELoss>
[2024-05-05 15:39:21,573][__main__][INFO] - Instantiating <functional_diffusion_processes.trainers.trainer.Trainer>
[2024-05-05 15:39:21,591][absl][WARNING] - GlobalAsyncCheckpointManager is not imported correctly. Checkpointing of GlobalDeviceArrays will not be available.To use the feature, install tensorstore.
WARNING:tensorflow:From /home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

[2024-05-05 15:39:23,457][tensorflow][WARNING] - From /home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

[2024-05-05 15:39:23,465][__main__][INFO] - Instantiating <functional_diffusion_processes.metrics.fid_metric.FIDMetric>
[2024-05-05 15:39:23,604][functional_diffusion_processes.metrics.feature_extractor][INFO] - Extracting features from dataset...
[2024-05-05 15:39:23,605][absl][INFO] - Reusing dataset mnist (/home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1)
[2024-05-05 15:39:23,640][absl][INFO] - Constructing tf.data.Dataset mnist for split test, from /home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
WARNING:tensorflow:AutoGraph could not transform <bound method PercentStyle._format of <logging.PercentStyle object at 0x7fb89752cb80>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: 'NoneType' object has no attribute '_fields'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
[2024-05-05 15:39:25,504][tensorflow][WARNING] - AutoGraph could not transform <bound method PercentStyle._format of <logging.PercentStyle object at 0x7fb89752cb80>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: 'NoneType' object has no attribute '_fields'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
[2024-05-05 15:39:24,513][functional_diffusion_processes.datasets.mnist_dataset][INFO] - Converting image to range [0,1]...
[2024-05-05 15:39:25,637][functional_diffusion_processes.datasets.mnist_dataset][INFO] - Resizing image to size 32...
[2024-05-05 15:39:25,661][functional_diffusion_processes.datasets.image_dataset][INFO] - Preprocessing images for split test...
[2024-05-05 15:39:25,682][functional_diffusion_processes.datasets.image_dataset][INFO] - Image reshaped to shape (1024, 1)...
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/jax/_src/lib/xla_bridge.py:544: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
  warnings.warn(
[2024-05-05 15:39:25,912][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 0
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/tensorflow/python/util/nest.py:917: UserWarning: `tf.layers.flatten` is deprecated and will be removed in a future version. Please use `tf.keras.layers.Flatten` instead.
  structure[0], [func(*x) for x in entries],
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/keras/legacy_tf_layers/base.py:627: UserWarning: `layer.updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
  self.updates, tf.compat.v1.GraphKeys.UPDATE_OPS
[2024-05-05 15:39:27,005][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 1
[2024-05-05 15:39:27,542][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 2
[2024-05-05 15:39:28,074][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 3
[2024-05-05 15:39:28,655][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 4
[2024-05-05 15:39:29,643][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 5
[2024-05-05 15:39:30,365][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 6
[2024-05-05 15:39:30,861][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 7
[2024-05-05 15:39:31,510][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 8
[2024-05-05 15:39:32,186][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 9
[2024-05-05 15:39:32,710][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 10
[2024-05-05 15:39:33,367][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 11
[2024-05-05 15:39:34,191][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 12
[2024-05-05 15:39:34,736][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 13
[2024-05-05 15:39:35,238][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 14
[2024-05-05 15:39:35,738][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 15
[2024-05-05 15:39:36,249][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 16
[2024-05-05 15:39:36,761][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 17
[2024-05-05 15:39:37,254][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 18
[2024-05-05 15:39:37,758][functional_diffusion_processes.metrics.fid_metric][INFO] - Saving real dataset stats to: /home/corallo/PycharmProjects/functional-diffusion-processes/data/stats/mnist_test_stats.npz
[2024-05-05 15:39:37,980][__main__][INFO] - Starting testing!
wandb: Currently logged in as: giulio-corallo (eurecom-ds). Use `wandb login --relogin` to force relogin
wandb: wandb version 0.16.6 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.14.0
wandb: Run data is saved locally in /home/corallo/PycharmProjects/functional-diffusion-processes/wandb/run-20240505_153938-lk5wmrth
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run inr_mnist
wandb: ⭐️ View project at https://wandb.ai/eurecom-ds/fpd
wandb: 🚀 View run at https://wandb.ai/eurecom-ds/fpd/runs/lk5wmrth
[2024-05-05 15:39:48,607][functional_diffusion_processes.trainers.trainer][INFO] - Total number of parameters: 0.12M
[2024-05-05 15:39:48,904][functional_diffusion_processes.trainers.trainer][WARNING] - Resuming training from the latest checkpoint.
[2024-05-05 15:39:48,905][absl][INFO] - Restoring checkpoint from /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/checkpoints/checkpoint_27
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/jax/_src/lib/xla_bridge.py:544: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
  warnings.warn(
[2024-05-05 15:39:48,989][absl][INFO] - Found no checkpoint files in /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist with prefix meta_0_
[2024-05-05 15:39:48,989][functional_diffusion_processes.trainers.trainer][INFO] - Starting sampling loop at step 0.
  0%|                                                                                                                                                                                 | 0/32 [00:00<?, ?it/s][2024-05-05 15:39:48,990][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 0
[2024-05-05 15:41:24,948][absl][INFO] - Saving checkpoint at step: 0
[2024-05-05 15:41:24,952][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_0
  3%|█████▎                                                                                                                                                                   | 1/32 [01:35<49:34, 95.96s/it][2024-05-05 15:41:24,953][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 1
[2024-05-05 15:41:43,396][absl][INFO] - Saving checkpoint at step: 1
[2024-05-05 15:41:43,397][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_1
[2024-05-05 15:41:43,397][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_0
  6%|██████████▌                                                                                                                                                              | 2/32 [01:54<25:10, 50.36s/it][2024-05-05 15:41:43,397][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 2
[2024-05-05 15:42:01,615][absl][INFO] - Saving checkpoint at step: 2
[2024-05-05 15:42:01,618][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_2
[2024-05-05 15:42:01,619][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_1
  9%|███████████████▊                                                                                                                                                         | 3/32 [02:12<17:14, 35.69s/it][2024-05-05 15:42:01,619][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 3
[2024-05-05 15:42:19,896][absl][INFO] - Saving checkpoint at step: 3
[2024-05-05 15:42:19,897][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_3
[2024-05-05 15:42:19,897][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_2
 12%|█████████████████████▏                                                                                                                                                   | 4/32 [02:30<13:26, 28.81s/it][2024-05-05 15:42:19,898][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 4
[2024-05-05 15:42:38,094][absl][INFO] - Saving checkpoint at step: 4
[2024-05-05 15:42:38,096][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_4
[2024-05-05 15:42:38,097][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_3
 16%|██████████████████████████▍                                                                                                                                              | 5/32 [02:49<11:14, 24.99s/it][2024-05-05 15:42:38,097][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 5
[2024-05-05 15:42:56,403][absl][INFO] - Saving checkpoint at step: 5
[2024-05-05 15:42:56,404][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_5
[2024-05-05 15:42:56,404][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_4
 19%|███████████████████████████████▋                                                                                                                                         | 6/32 [03:07<09:50, 22.72s/it][2024-05-05 15:42:56,405][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 6
[2024-05-05 15:43:14,647][absl][INFO] - Saving checkpoint at step: 6
[2024-05-05 15:43:14,648][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_6
[2024-05-05 15:43:14,648][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_5
 22%|████████████████████████████████████▉                                                                                                                                    | 7/32 [03:25<08:51, 21.25s/it][2024-05-05 15:43:14,651][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 7
[2024-05-05 15:43:32,949][absl][INFO] - Saving checkpoint at step: 7
[2024-05-05 15:43:32,956][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_7
[2024-05-05 15:43:32,956][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_6
 25%|██████████████████████████████████████████▎                                                                                                                              | 8/32 [03:43<08:07, 20.32s/it][2024-05-05 15:43:32,957][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 8
[2024-05-05 15:43:51,240][absl][INFO] - Saving checkpoint at step: 8
[2024-05-05 15:43:51,242][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_8
[2024-05-05 15:43:51,244][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_7
 28%|███████████████████████████████████████████████▌                                                                                                                         | 9/32 [04:02<07:32, 19.68s/it][2024-05-05 15:43:51,244][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 9
[2024-05-05 15:44:09,502][absl][INFO] - Saving checkpoint at step: 9
[2024-05-05 15:44:09,505][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_9
[2024-05-05 15:44:09,506][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_8
 31%|████████████████████████████████████████████████████▌                                                                                                                   | 10/32 [04:20<07:03, 19.24s/it][2024-05-05 15:44:09,507][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 10
[2024-05-05 15:44:27,768][absl][INFO] - Saving checkpoint at step: 10
[2024-05-05 15:44:27,776][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_10
[2024-05-05 15:44:27,776][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_9
 34%|█████████████████████████████████████████████████████████▊                                                                                                              | 11/32 [04:38<06:37, 18.95s/it][2024-05-05 15:44:27,777][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 11
[2024-05-05 15:44:45,992][absl][INFO] - Saving checkpoint at step: 11
[2024-05-05 15:44:46,002][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_11
[2024-05-05 15:44:46,004][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_10
 38%|███████████████████████████████████████████████████████████████                                                                                                         | 12/32 [04:57<06:14, 18.73s/it][2024-05-05 15:44:46,005][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 12
[2024-05-05 15:45:04,337][absl][INFO] - Saving checkpoint at step: 12
[2024-05-05 15:45:04,339][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_12
[2024-05-05 15:45:04,340][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_11
 41%|████████████████████████████████████████████████████████████████████▎                                                                                                   | 13/32 [05:15<05:53, 18.61s/it][2024-05-05 15:45:04,341][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 13
[2024-05-05 15:45:22,587][absl][INFO] - Saving checkpoint at step: 13
[2024-05-05 15:45:22,591][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_13
[2024-05-05 15:45:22,594][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_12
 44%|█████████████████████████████████████████████████████████████████████████▌                                                                                              | 14/32 [05:33<05:33, 18.50s/it][2024-05-05 15:45:22,594][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 14
[2024-05-05 15:45:40,872][absl][INFO] - Saving checkpoint at step: 14
[2024-05-05 15:45:40,880][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_14
[2024-05-05 15:45:40,881][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_13
 47%|██████████████████████████████████████████████████████████████████████████████▊                                                                                         | 15/32 [05:51<05:13, 18.44s/it][2024-05-05 15:45:40,881][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 15
[2024-05-05 15:45:59,246][absl][INFO] - Saving checkpoint at step: 15
[2024-05-05 15:45:59,251][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_15
[2024-05-05 15:45:59,252][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_14
 50%|████████████████████████████████████████████████████████████████████████████████████                                                                                    | 16/32 [06:10<04:54, 18.42s/it][2024-05-05 15:45:59,252][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 16
[2024-05-05 15:46:17,534][absl][INFO] - Saving checkpoint at step: 16
[2024-05-05 15:46:17,538][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_16
[2024-05-05 15:46:17,538][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_15
 53%|█████████████████████████████████████████████████████████████████████████████████████████▎                                                                              | 17/32 [06:28<04:35, 18.38s/it][2024-05-05 15:46:17,539][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 17
[2024-05-05 15:46:35,893][absl][INFO] - Saving checkpoint at step: 17
[2024-05-05 15:46:35,899][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_17
[2024-05-05 15:46:35,900][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_16
 56%|██████████████████████████████████████████████████████████████████████████████████████████████▌                                                                         | 18/32 [06:46<04:17, 18.37s/it][2024-05-05 15:46:35,900][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 18
[2024-05-05 15:46:54,148][absl][INFO] - Saving checkpoint at step: 18
[2024-05-05 15:46:54,150][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_18
[2024-05-05 15:46:54,150][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_17
 59%|███████████████████████████████████████████████████████████████████████████████████████████████████▊                                                                    | 19/32 [07:05<03:58, 18.34s/it][2024-05-05 15:46:54,151][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 19
[2024-05-05 15:47:12,467][absl][INFO] - Saving checkpoint at step: 19
[2024-05-05 15:47:12,468][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_19
[2024-05-05 15:47:12,469][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_18
 62%|█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                               | 20/32 [07:23<03:39, 18.33s/it][2024-05-05 15:47:12,469][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 20
[2024-05-05 15:47:30,717][absl][INFO] - Saving checkpoint at step: 20
[2024-05-05 15:47:30,718][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_20
[2024-05-05 15:47:30,719][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_19
 66%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                                         | 21/32 [07:41<03:21, 18.31s/it][2024-05-05 15:47:30,719][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 21
[2024-05-05 15:47:49,020][absl][INFO] - Saving checkpoint at step: 21
[2024-05-05 15:47:49,026][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_21
[2024-05-05 15:47:49,026][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_20
 69%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                                                    | 22/32 [08:00<03:03, 18.31s/it][2024-05-05 15:47:49,027][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 22
[2024-05-05 15:48:07,384][absl][INFO] - Saving checkpoint at step: 22
[2024-05-05 15:48:07,386][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_22
[2024-05-05 15:48:07,386][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_21
 72%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                               | 23/32 [08:18<02:44, 18.32s/it][2024-05-05 15:48:07,387][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 23
[2024-05-05 15:48:25,666][absl][INFO] - Saving checkpoint at step: 23
[2024-05-05 15:48:25,671][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_23
[2024-05-05 15:48:25,672][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_22
 75%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                          | 24/32 [08:36<02:26, 18.31s/it][2024-05-05 15:48:25,672][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 24
[2024-05-05 15:48:43,856][absl][INFO] - Saving checkpoint at step: 24
[2024-05-05 15:48:43,861][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_24
[2024-05-05 15:48:43,862][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_23
 78%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                    | 25/32 [08:54<02:07, 18.28s/it][2024-05-05 15:48:43,862][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 25
[2024-05-05 15:49:02,126][absl][INFO] - Saving checkpoint at step: 25
[2024-05-05 15:49:02,128][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_25
[2024-05-05 15:49:02,129][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_24
 81%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                               | 26/32 [09:13<01:49, 18.27s/it][2024-05-05 15:49:02,129][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 26
[2024-05-05 15:49:20,398][absl][INFO] - Saving checkpoint at step: 26
[2024-05-05 15:49:20,402][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_26
[2024-05-05 15:49:20,403][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_25
 84%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                          | 27/32 [09:31<01:31, 18.27s/it][2024-05-05 15:49:20,403][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 27
[2024-05-05 15:49:38,634][absl][INFO] - Saving checkpoint at step: 27
[2024-05-05 15:49:38,636][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_27
[2024-05-05 15:49:38,637][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_26
 88%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                     | 28/32 [09:49<01:13, 18.26s/it][2024-05-05 15:49:38,637][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 28
[2024-05-05 15:49:56,883][absl][INFO] - Saving checkpoint at step: 28
[2024-05-05 15:49:56,885][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_28
[2024-05-05 15:49:56,885][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_27
 91%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎               | 29/32 [10:07<00:54, 18.26s/it][2024-05-05 15:49:56,886][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 29
[2024-05-05 15:50:15,084][absl][INFO] - Saving checkpoint at step: 29
[2024-05-05 15:50:15,089][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_29
[2024-05-05 15:50:15,089][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_28
 94%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌          | 30/32 [10:26<00:36, 18.24s/it][2024-05-05 15:50:15,090][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 30
[2024-05-05 15:50:33,486][absl][INFO] - Saving checkpoint at step: 30
[2024-05-05 15:50:33,487][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_30
[2024-05-05 15:50:33,488][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_29
 97%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊     | 31/32 [10:44<00:18, 18.29s/it][2024-05-05 15:50:33,488][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 31
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [11:02<00:00, 18.31s/it][2024-05-05 15:50:51,953][functional_diffusion_processes.trainers.trainer][INFO] - FID: 1.288000e+00
[2024-05-05 15:50:51,953][functional_diffusion_processes.trainers.trainer][INFO] - Inception score -1.000000e+00
wandb: Waiting for W&B process to finish... (success).
wandb: 
wandb: Run history:
wandb:             FID ▁
wandb: inception score ▁
wandb: 
wandb: Run summary:
wandb:             FID 1.288
wandb: inception score -1.0
wandb: 
wandb: 🚀 View run inr_mnist at: https://wandb.ai/eurecom-ds/fpd/runs/lk5wmrth
wandb: Synced 6 W&B file(s), 32 media file(s), 2 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20240505_153938-lk5wmrth/logs
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [11:12<00:00, 21.01s/it]

Please do the following. First, synchronize your local repository with my changes by running git pull.

Then carefully check your .env file. Mine, for example, is:

export WANDB_API_KEY=<my secret wandb key>
export HOME=/home/corallo/PycharmProjects
export CUDA_HOME=/usr/local/cuda
export PROJECT_ROOT=/home/corallo/PycharmProjects/functional-diffusion-processes
export DATA_ROOT=${PROJECT_ROOT}/data
export LOGS_ROOT=${PROJECT_ROOT}/logs
export TFDS_DATA_DIR=${DATA_ROOT}/tensorflow_datasets
export PYTHONPATH=${PROJECT_ROOT}
export PYTHONUNBUFFERED=1
export HYDRA_FULL_ERROR=1
export WANDB_DISABLE_SERVICE=true
export CUDA_VISIBLE_DEVICES=2

Please note that HOME for me is /home/corallo/PycharmProjects because that is the directory containing the functional-diffusion-processes project. You have to check where yours is located, since we can't know that in advance. Likewise, PROJECT_ROOT for me is /home/corallo/PycharmProjects/functional-diffusion-processes, which corresponds to my HOME + functional-diffusion-processes.

Leave the other environment variables as they are, except that you have to fill in WANDB_API_KEY with your own key and CUDA_VISIBLE_DEVICES with the comma-separated IDs of the GPUs you would like to use.
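
As a quick sanity check before rerunning the script, something like the following could be used to confirm that the variables are set and consistent with each other. This is only a minimal sketch: `check_env` and `REQUIRED` are hypothetical names that are not part of the repository, and the rule that DATA_ROOT and LOGS_ROOT sit directly under PROJECT_ROOT is taken from the example .env above rather than from the code.

```python
import os
from pathlib import Path

# Variables from the example .env that must be filled in (assumed list).
REQUIRED = ("WANDB_API_KEY", "PROJECT_ROOT", "DATA_ROOT", "LOGS_ROOT", "CUDA_VISIBLE_DEVICES")


def check_env(env):
    """Return a list of human-readable problems found in an environment mapping."""
    problems = []
    for name in REQUIRED:
        if not env.get(name):
            problems.append(f"{name} is not set")

    root = env.get("PROJECT_ROOT")
    if root:
        if not Path(root).is_dir():
            problems.append(f"PROJECT_ROOT does not exist: {root}")
        # The example .env derives DATA_ROOT and LOGS_ROOT from PROJECT_ROOT.
        for name, sub in (("DATA_ROOT", "data"), ("LOGS_ROOT", "logs")):
            value = env.get(name)
            if value and Path(value) != Path(root) / sub:
                problems.append(f"{name} is {value}, expected {Path(root) / sub}")

    # CUDA_VISIBLE_DEVICES is expected to be comma-separated integer ids.
    gpus = env.get("CUDA_VISIBLE_DEVICES", "")
    if gpus and not all(part.strip().isdigit() for part in gpus.split(",")):
        problems.append(f"CUDA_VISIBLE_DEVICES should be comma-separated ids, got {gpus!r}")
    return problems


if __name__ == "__main__":
    # Run after sourcing the .env file in the same shell.
    for line in check_env(os.environ) or ["environment looks consistent"]:
        print(line)
```

Running it in the same shell where the .env file was sourced prints one line per problem found, or a single confirmation line if nothing looks off.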

Let me know if, after following these steps, you are able to get the FID score.

ghost commented 4 months ago

Hi Giulio,

I hope this email finds you well. I wanted to express my gratitude for your assistance – I've successfully run the FID score! Life can be quite challenging at times, but moments like these make it worthwhile. 😊 However, during the evaluation process in W&B, I noticed that the sampled images contain noise. Could you please advise on the configurations necessary to replicate the experimental results outlined in your paper? [image: image.png] Thank you once again for your help.

Best regards, Xinxin

giulio98 commented 4 months ago

Hello,

I can't see the images you shared with me. This is what I see from your message:

image

Anyway, this is a batch I get from the provided checkpoint:

image

ghost commented 4 months ago

Sorry about that. How about this time?

giulio98 commented 4 months ago

No I can't see it

Can you try directly on GitHub?

ghost commented 4 months ago

image

ghost commented 4 months ago

hahahahaha

giulio98 commented 4 months ago

Hey,

This is not supposed to happen. Please run git pull and let me know whether you get the correct image. You should get something similar to the batch in my previous comment.

ghost commented 4 months ago

Hi,

Do you mean I should git push my changes?

Best, Xinxin

giulio98 commented 4 months ago

No, I mean pull my changes: git pull

ghost commented 4 months ago

I already ran git pull to get your changes last time.

giulio98 commented 4 months ago

I'm unable to reproduce what you are seeing. If I run sh scripts/maml/eval_mnist.sh

with this .env file:

export WANDB_API_KEY=<my secret wandb key>
export HOME=/home/corallo/PycharmProjects
export CUDA_HOME=/usr/local/cuda
export PROJECT_ROOT=/home/corallo/PycharmProjects/functional-diffusion-processes
export DATA_ROOT=${PROJECT_ROOT}/data
export LOGS_ROOT=${PROJECT_ROOT}/logs
export TFDS_DATA_DIR=${DATA_ROOT}/tensorflow_datasets
export PYTHONPATH=${PROJECT_ROOT}
export PYTHONUNBUFFERED=1
export HYDRA_FULL_ERROR=1
export WANDB_DISABLE_SERVICE=true
export CUDA_VISIBLE_DEVICES=2

I get these logs:

(fdp) corallo@atlas1:~/PycharmProjects/functional-diffusion-processes$ sh scripts/maml/eval_mnist.sh
2024-05-05 15:39:17.027317: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[2024-05-05 15:39:19,697][HYDRA] Launching 1 jobs locally
[2024-05-05 15:39:19,697][HYDRA]        #0 : +experiments_maml=eval_mnist
[2024-05-05 15:39:19,880][__main__][INFO] - Instantiating <functional_diffusion_processes.datasets.mnist_dataset.MNISTDataset>
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
  warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
[2024-05-05 15:39:20,082][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,082][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,087][absl][INFO] - Load dataset info from /home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
[2024-05-05 15:39:20,089][__main__][INFO] - Instantiating <functional_diffusion_processes.datasets.mnist_dataset.MNISTDataset>
[2024-05-05 15:39:20,092][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,092][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,092][absl][INFO] - Load dataset info from /home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
[2024-05-05 15:39:20,093][__main__][INFO] - Instantiating <functional_diffusion_processes.models.mlp_modulation.MLPModulationLR>
[2024-05-05 15:39:20,127][__main__][INFO] - Instantiating <functional_diffusion_processes.sdetools.heat_subvp_sde.HeatSubVPSDE>
[2024-05-05 15:39:20,209][absl][INFO] - Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker: 
[2024-05-05 15:39:20,776][absl][INFO] - Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". Available platform names are: Interpreter Host CUDA
[2024-05-05 15:39:20,777][absl][INFO] - Unable to initialize backend 'tpu': module 'jaxlib.xla_extension' has no attribute 'get_tpu_client'
[2024-05-05 15:39:20,777][absl][INFO] - Unable to initialize backend 'plugin': xla_extension has no attributes named get_plugin_device_client. Compile TensorFlow with //tensorflow/compiler/xla/python:enable_plugin_device set to true (defaults to false) to enable this.
[2024-05-05 15:39:21,560][__main__][INFO] - Instantiating <functional_diffusion_processes.samplers.correctors.langevin_corrector.LangevinCorrector>
[2024-05-05 15:39:21,564][__main__][INFO] - Instantiating <functional_diffusion_processes.samplers.predictors.euler_predictor.EulerMaruyamaPredictor>
[2024-05-05 15:39:21,568][__main__][INFO] - Instantiating <functional_diffusion_processes.samplers.pc_sampler.PCSampler>
[2024-05-05 15:39:21,571][__main__][INFO] - Instantiating <functional_diffusion_processes.losses.mse_loss.MSELoss>
[2024-05-05 15:39:21,573][__main__][INFO] - Instantiating <functional_diffusion_processes.trainers.trainer.Trainer>
[2024-05-05 15:39:21,591][absl][WARNING] - GlobalAsyncCheckpointManager is not imported correctly. Checkpointing of GlobalDeviceArrays will not be available.To use the feature, install tensorstore.
WARNING:tensorflow:From /home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

[2024-05-05 15:39:23,457][tensorflow][WARNING] - From /home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

[2024-05-05 15:39:23,465][__main__][INFO] - Instantiating <functional_diffusion_processes.metrics.fid_metric.FIDMetric>
[2024-05-05 15:39:23,604][functional_diffusion_processes.metrics.feature_extractor][INFO] - Extracting features from dataset...
[2024-05-05 15:39:23,605][absl][INFO] - Reusing dataset mnist (/home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1)
[2024-05-05 15:39:23,640][absl][INFO] - Constructing tf.data.Dataset mnist for split test, from /home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
WARNING:tensorflow:AutoGraph could not transform <bound method PercentStyle._format of <logging.PercentStyle object at 0x7fb89752cb80>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: 'NoneType' object has no attribute '_fields'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
[2024-05-05 15:39:25,504][tensorflow][WARNING] - AutoGraph could not transform <bound method PercentStyle._format of <logging.PercentStyle object at 0x7fb89752cb80>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: 'NoneType' object has no attribute '_fields'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
[2024-05-05 15:39:24,513][functional_diffusion_processes.datasets.mnist_dataset][INFO] - Converting image to range [0,1]...
[2024-05-05 15:39:25,637][functional_diffusion_processes.datasets.mnist_dataset][INFO] - Resizing image to size 32...
[2024-05-05 15:39:25,661][functional_diffusion_processes.datasets.image_dataset][INFO] - Preprocessing images for split test...
[2024-05-05 15:39:25,682][functional_diffusion_processes.datasets.image_dataset][INFO] - Image reshaped to shape (1024, 1)...
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/jax/_src/lib/xla_bridge.py:544: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
  warnings.warn(
[2024-05-05 15:39:25,912][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 0
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/tensorflow/python/util/nest.py:917: UserWarning: `tf.layers.flatten` is deprecated and will be removed in a future version. Please use `tf.keras.layers.Flatten` instead.
  structure[0], [func(*x) for x in entries],
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/keras/legacy_tf_layers/base.py:627: UserWarning: `layer.updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
  self.updates, tf.compat.v1.GraphKeys.UPDATE_OPS
[2024-05-05 15:39:27,005][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 1
[2024-05-05 15:39:27,542][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 2
[2024-05-05 15:39:28,074][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 3
[2024-05-05 15:39:28,655][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 4
[2024-05-05 15:39:29,643][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 5
[2024-05-05 15:39:30,365][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 6
[2024-05-05 15:39:30,861][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 7
[2024-05-05 15:39:31,510][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 8
[2024-05-05 15:39:32,186][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 9
[2024-05-05 15:39:32,710][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 10
[2024-05-05 15:39:33,367][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 11
[2024-05-05 15:39:34,191][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 12
[2024-05-05 15:39:34,736][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 13
[2024-05-05 15:39:35,238][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 14
[2024-05-05 15:39:35,738][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 15
[2024-05-05 15:39:36,249][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 16
[2024-05-05 15:39:36,761][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 17
[2024-05-05 15:39:37,254][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 18
[2024-05-05 15:39:37,758][functional_diffusion_processes.metrics.fid_metric][INFO] - Saving real dataset stats to: /home/corallo/PycharmProjects/functional-diffusion-processes/data/stats/mnist_test_stats.npz
[2024-05-05 15:39:37,980][__main__][INFO] - Starting testing!
wandb: Currently logged in as: giulio-corallo (eurecom-ds). Use `wandb login --relogin` to force relogin
wandb: wandb version 0.16.6 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.14.0
wandb: Run data is saved locally in /home/corallo/PycharmProjects/functional-diffusion-processes/wandb/run-20240505_153938-lk5wmrth
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run inr_mnist
wandb: ⭐️ View project at https://wandb.ai/eurecom-ds/fpd
wandb: 🚀 View run at https://wandb.ai/eurecom-ds/fpd/runs/lk5wmrth
[2024-05-05 15:39:48,607][functional_diffusion_processes.trainers.trainer][INFO] - Total number of parameters: 0.12M
[2024-05-05 15:39:48,904][functional_diffusion_processes.trainers.trainer][WARNING] - Resuming training from the latest checkpoint.
[2024-05-05 15:39:48,905][absl][INFO] - Restoring checkpoint from /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/checkpoints/checkpoint_27
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/jax/_src/lib/xla_bridge.py:544: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
  warnings.warn(
[2024-05-05 15:39:48,989][absl][INFO] - Found no checkpoint files in /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist with prefix meta_0_
[2024-05-05 15:39:48,989][functional_diffusion_processes.trainers.trainer][INFO] - Starting sampling loop at step 0.
  0%|                                                                                                                                                                                 | 0/32 [00:00<?, ?it/s][2024-05-05 15:39:48,990][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 0
[2024-05-05 15:41:24,948][absl][INFO] - Saving checkpoint at step: 0
[2024-05-05 15:41:24,952][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_0
  3%|█████▎                                                                                                                                                                   | 1/32 [01:35<49:34, 95.96s/it][2024-05-05 15:41:24,953][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 1
[2024-05-05 15:41:43,396][absl][INFO] - Saving checkpoint at step: 1
[2024-05-05 15:41:43,397][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_1
[2024-05-05 15:41:43,397][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_0
  6%|██████████▌                                                                                                                                                              | 2/32 [01:54<25:10, 50.36s/it][2024-05-05 15:41:43,397][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 2
[2024-05-05 15:42:01,615][absl][INFO] - Saving checkpoint at step: 2
[2024-05-05 15:42:01,618][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_2
[2024-05-05 15:42:01,619][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_1
  9%|███████████████▊                                                                                                                                                         | 3/32 [02:12<17:14, 35.69s/it][2024-05-05 15:42:01,619][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 3
[2024-05-05 15:42:19,896][absl][INFO] - Saving checkpoint at step: 3
[2024-05-05 15:42:19,897][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_3
[2024-05-05 15:42:19,897][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_2
 12%|█████████████████████▏                                                                                                                                                   | 4/32 [02:30<13:26, 28.81s/it][2024-05-05 15:42:19,898][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 4
[2024-05-05 15:42:38,094][absl][INFO] - Saving checkpoint at step: 4
[2024-05-05 15:42:38,096][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_4
[2024-05-05 15:42:38,097][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_3
 16%|██████████████████████████▍                                                                                                                                              | 5/32 [02:49<11:14, 24.99s/it][2024-05-05 15:42:38,097][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 5
[2024-05-05 15:42:56,403][absl][INFO] - Saving checkpoint at step: 5
[2024-05-05 15:42:56,404][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_5
[2024-05-05 15:42:56,404][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_4
 19%|███████████████████████████████▋                                                                                                                                         | 6/32 [03:07<09:50, 22.72s/it][2024-05-05 15:42:56,405][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 6
[2024-05-05 15:43:14,647][absl][INFO] - Saving checkpoint at step: 6
[2024-05-05 15:43:14,648][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_6
[2024-05-05 15:43:14,648][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_5
 22%|████████████████████████████████████▉                                                                                                                                    | 7/32 [03:25<08:51, 21.25s/it][2024-05-05 15:43:14,651][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 7
[2024-05-05 15:43:32,949][absl][INFO] - Saving checkpoint at step: 7
[2024-05-05 15:43:32,956][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_7
[2024-05-05 15:43:32,956][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_6
 25%|██████████████████████████████████████████▎                                                                                                                              | 8/32 [03:43<08:07, 20.32s/it][2024-05-05 15:43:32,957][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 8
[2024-05-05 15:43:51,240][absl][INFO] - Saving checkpoint at step: 8
[2024-05-05 15:43:51,242][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_8
[2024-05-05 15:43:51,244][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_7
 28%|███████████████████████████████████████████████▌                                                                                                                         | 9/32 [04:02<07:32, 19.68s/it][2024-05-05 15:43:51,244][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 9
[2024-05-05 15:44:09,502][absl][INFO] - Saving checkpoint at step: 9
[2024-05-05 15:44:09,505][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_9
[2024-05-05 15:44:09,506][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_8
 31%|████████████████████████████████████████████████████▌                                                                                                                   | 10/32 [04:20<07:03, 19.24s/it][2024-05-05 15:44:09,507][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 10
[2024-05-05 15:44:27,768][absl][INFO] - Saving checkpoint at step: 10
[2024-05-05 15:44:27,776][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_10
[2024-05-05 15:44:27,776][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_9
 34%|█████████████████████████████████████████████████████████▊                                                                                                              | 11/32 [04:38<06:37, 18.95s/it][2024-05-05 15:44:27,777][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 11
[2024-05-05 15:44:45,992][absl][INFO] - Saving checkpoint at step: 11
[2024-05-05 15:44:46,002][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_11
[2024-05-05 15:44:46,004][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_10
 38%|███████████████████████████████████████████████████████████████                                                                                                         | 12/32 [04:57<06:14, 18.73s/it][2024-05-05 15:44:46,005][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 12
[2024-05-05 15:45:04,337][absl][INFO] - Saving checkpoint at step: 12
[2024-05-05 15:45:04,339][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_12
[2024-05-05 15:45:04,340][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_11
 41%|████████████████████████████████████████████████████████████████████▎                                                                                                   | 13/32 [05:15<05:53, 18.61s/it][2024-05-05 15:45:04,341][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 13
[2024-05-05 15:45:22,587][absl][INFO] - Saving checkpoint at step: 13
[2024-05-05 15:45:22,591][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_13
[2024-05-05 15:45:22,594][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_12
 44%|█████████████████████████████████████████████████████████████████████████▌                                                                                              | 14/32 [05:33<05:33, 18.50s/it][2024-05-05 15:45:22,594][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 14
[2024-05-05 15:45:40,872][absl][INFO] - Saving checkpoint at step: 14
[2024-05-05 15:45:40,880][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_14
[2024-05-05 15:45:40,881][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_13
 47%|██████████████████████████████████████████████████████████████████████████████▊                                                                                         | 15/32 [05:51<05:13, 18.44s/it][2024-05-05 15:45:40,881][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 15
[2024-05-05 15:45:59,246][absl][INFO] - Saving checkpoint at step: 15
[2024-05-05 15:45:59,251][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_15
[2024-05-05 15:45:59,252][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_14
 50%|████████████████████████████████████████████████████████████████████████████████████                                                                                    | 16/32 [06:10<04:54, 18.42s/it][2024-05-05 15:45:59,252][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 16
[2024-05-05 15:46:17,534][absl][INFO] - Saving checkpoint at step: 16
[2024-05-05 15:46:17,538][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_16
[2024-05-05 15:46:17,538][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_15
 53%|█████████████████████████████████████████████████████████████████████████████████████████▎                                                                              | 17/32 [06:28<04:35, 18.38s/it][2024-05-05 15:46:17,539][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 17
[2024-05-05 15:46:35,893][absl][INFO] - Saving checkpoint at step: 17
[2024-05-05 15:46:35,899][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_17
[2024-05-05 15:46:35,900][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_16
 56%|██████████████████████████████████████████████████████████████████████████████████████████████▌                                                                         | 18/32 [06:46<04:17, 18.37s/it][2024-05-05 15:46:35,900][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 18
[2024-05-05 15:46:54,148][absl][INFO] - Saving checkpoint at step: 18
[2024-05-05 15:46:54,150][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_18
[2024-05-05 15:46:54,150][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_17
 59%|███████████████████████████████████████████████████████████████████████████████████████████████████▊                                                                    | 19/32 [07:05<03:58, 18.34s/it][2024-05-05 15:46:54,151][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 19
[2024-05-05 15:47:12,467][absl][INFO] - Saving checkpoint at step: 19
[2024-05-05 15:47:12,468][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_19
[2024-05-05 15:47:12,469][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_18
 62%|█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                               | 20/32 [07:23<03:39, 18.33s/it][2024-05-05 15:47:12,469][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 20
[2024-05-05 15:47:30,717][absl][INFO] - Saving checkpoint at step: 20
[2024-05-05 15:47:30,718][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_20
[2024-05-05 15:47:30,719][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_19
 66%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                                         | 21/32 [07:41<03:21, 18.31s/it][2024-05-05 15:47:30,719][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 21
[2024-05-05 15:47:49,020][absl][INFO] - Saving checkpoint at step: 21
[2024-05-05 15:47:49,026][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_21
[2024-05-05 15:47:49,026][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_20
 69%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                                                    | 22/32 [08:00<03:03, 18.31s/it][2024-05-05 15:47:49,027][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 22
[2024-05-05 15:48:07,384][absl][INFO] - Saving checkpoint at step: 22
[2024-05-05 15:48:07,386][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_22
[2024-05-05 15:48:07,386][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_21
 72%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                               | 23/32 [08:18<02:44, 18.32s/it][2024-05-05 15:48:07,387][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 23
[2024-05-05 15:48:25,666][absl][INFO] - Saving checkpoint at step: 23
[2024-05-05 15:48:25,671][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_23
[2024-05-05 15:48:25,672][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_22
 75%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                          | 24/32 [08:36<02:26, 18.31s/it][2024-05-05 15:48:25,672][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 24
[2024-05-05 15:48:43,856][absl][INFO] - Saving checkpoint at step: 24
[2024-05-05 15:48:43,861][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_24
[2024-05-05 15:48:43,862][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_23
 78%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                    | 25/32 [08:54<02:07, 18.28s/it][2024-05-05 15:48:43,862][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 25
[2024-05-05 15:49:02,126][absl][INFO] - Saving checkpoint at step: 25
[2024-05-05 15:49:02,128][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_25
[2024-05-05 15:49:02,129][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_24
 81%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                               | 26/32 [09:13<01:49, 18.27s/it][2024-05-05 15:49:02,129][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 26
[2024-05-05 15:49:20,398][absl][INFO] - Saving checkpoint at step: 26
[2024-05-05 15:49:20,402][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_26
[2024-05-05 15:49:20,403][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_25
 84%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                          | 27/32 [09:31<01:31, 18.27s/it][2024-05-05 15:49:20,403][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 27
[2024-05-05 15:49:38,634][absl][INFO] - Saving checkpoint at step: 27
[2024-05-05 15:49:38,636][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_27
[2024-05-05 15:49:38,637][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_26
 88%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                     | 28/32 [09:49<01:13, 18.26s/it][2024-05-05 15:49:38,637][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 28
[2024-05-05 15:49:56,883][absl][INFO] - Saving checkpoint at step: 28
[2024-05-05 15:49:56,885][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_28
[2024-05-05 15:49:56,885][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_27
 91%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎               | 29/32 [10:07<00:54, 18.26s/it][2024-05-05 15:49:56,886][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 29
[2024-05-05 15:50:15,084][absl][INFO] - Saving checkpoint at step: 29
[2024-05-05 15:50:15,089][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_29
[2024-05-05 15:50:15,089][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_28
 94%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌          | 30/32 [10:26<00:36, 18.24s/it][2024-05-05 15:50:15,090][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 30
[2024-05-05 15:50:33,486][absl][INFO] - Saving checkpoint at step: 30
[2024-05-05 15:50:33,487][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_30
[2024-05-05 15:50:33,488][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_29
 97%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊     | 31/32 [10:44<00:18, 18.29s/it][2024-05-05 15:50:33,488][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 31
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [11:02<00:00, 18.31s/it][2024-05-05 15:50:51,953][functional_diffusion_processes.trainers.trainer][INFO] - FID: 1.288000e+00
[2024-05-05 15:50:51,953][functional_diffusion_processes.trainers.trainer][INFO] - Inception score -1.000000e+00
wandb: Waiting for W&B process to finish... (success).
wandb: 
wandb: Run history:
wandb:             FID ▁
wandb: inception score ▁
wandb: 
wandb: Run summary:
wandb:             FID 1.288
wandb: inception score -1.0
wandb: 
wandb: 🚀 View run inr_mnist at: https://wandb.ai/eurecom-ds/fpd/runs/lk5wmrth
wandb: Synced 6 W&B file(s), 32 media file(s), 2 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20240505_153938-lk5wmrth/logs
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [11:12<00:00, 21.01s/it]

and these are the sampled images on wandb: image

Did you follow the exact same steps? Do you get the same logs as mine? Please share the content of your .env file (cat .env, obscuring your wandb API key) and your full logs.

ghost commented 4 months ago

export WANDB_API_KEY=
export HOME=/cis/net/io93c/data/shuan124/
export CUDA_HOME=/usr/local/cuda
export PROJECT_ROOT=/cis/net/io93c/data/shuan124/functional-diffusion-processes # /home/username/functional_diffusion_processes
export DATA_ROOT=${PROJECT_ROOT}/data
export LOGS_ROOT=${PROJECT_ROOT}/logs
export TFDS_DATA_DIR= ${DATA_ROOT}/tensorflow_datasets
export PYTHONPATH=${PROJECT_ROOT}
export PYTHONUNBUFFERED=1
export HYDRA_FULL_ERROR=1
export WANDB_DISABLE_SERVICE=true
export CUDA_VISIBLE_DEVICES=6
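A quick sanity check on a pasted .env can catch values that a POSIX shell would read as empty. In `export NAME= value`, the space after `=` makes the shell assign an empty string to `NAME`. Whether the space after `TFDS_DATA_DIR=` above is in the real file or only an artifact of pasting is not known, so the following is a minimal sketch under that assumption. The function name `suspicious_exports` and the sample text are illustrative and not part of the repository.

```python
import re

# Matches lines of the form: export NAME=value
EXPORT_RE = re.compile(r"^export\s+([A-Za-z_][A-Za-z0-9_]*)=(.*)$")

# Illustrative sample, shortened from the .env posted above.
ENV_SAMPLE = """\
export WANDB_API_KEY=
export PROJECT_ROOT=/cis/net/io93c/data/shuan124/functional-diffusion-processes # /home/username/functional_diffusion_processes
export LOGS_ROOT=${PROJECT_ROOT}/logs
export TFDS_DATA_DIR= ${DATA_ROOT}/tensorflow_datasets
"""


def suspicious_exports(env_text):
    """Return variable names whose value is empty or starts with whitespace."""
    flagged = []
    for line in env_text.splitlines():
        match = EXPORT_RE.match(line.strip())
        if not match:
            continue
        name, rest = match.groups()
        value = rest.split(" #", 1)[0]  # drop a trailing comment, if any
        if value.strip() == "" or value[:1].isspace():
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    print(suspicious_exports(ENV_SAMPLE))
```

An empty `WANDB_API_KEY` is expected here because the key was deliberately removed before posting. Any other flagged name is worth checking by hand.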

ghost commented 4 months ago

[2024-05-07 18:05:18,543][absl][INFO] - Saving checkpoint at step: 20
[2024-05-07 18:05:18,563][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_20
[2024-05-07 18:05:18,564][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_19
[2024-05-07 18:05:18,568][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 21
[2024-05-07 18:05:56,161][absl][INFO] - Saving checkpoint at step: 21
[2024-05-07 18:05:56,183][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_21
[2024-05-07 18:05:56,185][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_20
[2024-05-07 18:05:56,189][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 22
[2024-05-07 18:06:33,751][absl][INFO] - Saving checkpoint at step: 22
[2024-05-07 18:06:33,769][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_22
[2024-05-07 18:06:33,771][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_21
[2024-05-07 18:06:33,775][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 23
[2024-05-07 18:07:11,030][absl][INFO] - Saving checkpoint at step: 23
[2024-05-07 18:07:11,049][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_23
[2024-05-07 18:07:11,051][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_22
[2024-05-07 18:07:11,055][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 24
[2024-05-07 18:07:48,711][absl][INFO] - Saving checkpoint at step: 24
[2024-05-07 18:07:48,730][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_24
[2024-05-07 18:07:48,732][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_23
[2024-05-07 18:07:48,737][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 25
[2024-05-07 18:08:26,313][absl][INFO] - Saving checkpoint at step: 25
[2024-05-07 18:08:26,331][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_25
[2024-05-07 18:08:26,333][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_24
[2024-05-07 18:08:26,337][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 26
[2024-05-07 18:09:03,872][absl][INFO] - Saving checkpoint at step: 26
[2024-05-07 18:09:03,889][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_26
[2024-05-07 18:09:03,891][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_25
[2024-05-07 18:09:03,895][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 27
[2024-05-07 18:09:41,522][absl][INFO] - Saving checkpoint at step: 27
[2024-05-07 18:09:41,543][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_27
[2024-05-07 18:09:41,545][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_26
[2024-05-07 18:09:41,549][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 28
[2024-05-07 18:10:19,344][absl][INFO] - Saving checkpoint at step: 28
[2024-05-07 18:10:19,390][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_28
[2024-05-07 18:10:19,393][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_27
[2024-05-07 18:10:19,397][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 29
[2024-05-07 18:10:56,875][absl][INFO] - Saving checkpoint at step: 29
[2024-05-07 18:10:56,894][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_29
[2024-05-07 18:10:56,896][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_28
[2024-05-07 18:10:56,901][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 30
[2024-05-07 18:11:34,635][absl][INFO] - Saving checkpoint at step: 30
[2024-05-07 18:11:34,654][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_30
[2024-05-07 18:11:34,656][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_29
[2024-05-07 18:11:34,661][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 31
[2024-05-07 18:12:12,547][functional_diffusion_processes.trainers.trainer][INFO] - FID: 1.020612e+02
[2024-05-07 18:12:12,547][functional_diffusion_processes.trainers.trainer][INFO] - Inception score -1.000000e+00

ghost commented 4 months ago

Thank you so much for the help !!

ghost commented 4 months ago

Hi, I hope we can have a meeting for efficiency, since the results of the experiments I run are still incorrect: with a larger N the FID should be smaller, but so far my results still show a larger FID for a larger N, and I cannot include these results in my work. /cry /cry

best,

giulio98 commented 4 months ago

Hi, from your logs it appears that sampling is skipped because checkpoints from a previous run were found. Please clean your logs directory (rm it) and run again.
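Before rerunning, it can help to see what a previous run left behind. The sketch below lists entries in a run directory whose names start with the sampling-checkpoint prefix. The prefix `meta_0_` is taken from the checkpoint paths in the logs above (for example `logs/inr_mnist/meta_0_20`). The function name `find_stale_sampling_state` is hypothetical and not part of the repository, and how the trainer itself decides to resume is not shown in this thread.

```python
from pathlib import Path


def find_stale_sampling_state(run_dir, prefix="meta_0_"):
    """List entries in run_dir whose names start with the given prefix.

    Returns an empty list when the directory does not exist, so the
    function is safe to call before the first run.
    """
    run_dir = Path(run_dir)
    if not run_dir.is_dir():
        return []
    return sorted(p for p in run_dir.iterdir() if p.name.startswith(prefix))


if __name__ == "__main__":
    # Assumed location; adjust to the LOGS_ROOT set in your .env.
    for entry in find_stale_sampling_state("logs/inr_mnist"):
        print("left over from a previous run:", entry)
```

The sketch only reports what it finds. Deleting is left as a manual step so that nothing is removed by accident.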

ghost commented 4 months ago

I think it should be here?

image

Should it be logs_test, not logs?

ghost commented 4 months ago

(lxxfdp) shuan124@r34:/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes$ pip install gdown
Collecting gdown
Using cached gdown-5.1.0-py3-none-any.whl.metadata (5.7 kB)
Collecting beautifulsoup4 (from gdown)
Using cached beautifulsoup4-4.12.3-py3-none-any.whl.metadata (3.8 kB)
Requirement already satisfied: filelock in /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages (from gdown) (3.14.0)
Requirement already satisfied: requests[socks] in /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages (from gdown) (2.31.0)
Requirement already satisfied: tqdm in /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages (from gdown) (4.66.4)
Collecting soupsieve>1.2 (from beautifulsoup4->gdown)
Using cached soupsieve-2.5-py3-none-any.whl.metadata (4.7 kB)
Requirement already satisfied: charset-normalizer<4,>=2 in /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages (from requests[socks]->gdown) (2.0.4)
Requirement already satisfied: idna<4,>=2.5 in /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages (from requests[socks]->gdown) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages (from requests[socks]->gdown) (2.1.0)
Requirement already satisfied: certifi>=2017.4.17 in /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages (from requests[socks]->gdown) (2024.2.2)
Requirement already satisfied: PySocks!=1.5.7,>=1.5.6 in /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages (from requests[socks]->gdown) (1.7.1)
Using cached gdown-5.1.0-py3-none-any.whl (17 kB)
Using cached beautifulsoup4-4.12.3-py3-none-any.whl (147 kB)
Using cached soupsieve-2.5-py3-none-any.whl (36 kB)
DEPRECATION: pytorch-lightning 1.7.7 has a non-standard dependency specifier torch>=1.9.*. pip 24.1 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of pytorch-lightning or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
Installing collected packages: soupsieve, beautifulsoup4, gdown
Successfully installed beautifulsoup4-4.12.3 gdown-5.1.0 soupsieve-2.5
(lxxfdp) shuan124@r34:/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes$ gdown --id 1R9aRsV7q4yU0ey47tR7hFvKttEilUv0i
/cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/gdown/main.py:132: FutureWarning: Option --id was deprecated in version 4.3.1 and will be removed in 5.0. You don't need to pass it anymore to use a file ID.
warnings.warn(
Downloading...
From (original): https://drive.google.com/uc?id=1R9aRsV7q4yU0ey47tR7hFvKttEilUv0i
From (redirected): https://drive.google.com/uc?id=1R9aRsV7q4yU0ey47tR7hFvKttEilUv0i&confirm=t&uuid=638d3d3c-489c-4553-9baa-507f43f3be6f
To: /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test.zip
100%|████████████████████████████████████████████████████████████████████████████████████████████| 794M/794M [00:15<00:00, 50.6MB/s]
(lxxfdp) shuan124@r34:/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes$ unzip logs.zip
unzip: cannot find or open logs.zip, logs.zip.zip or logs.zip.ZIP.
(lxxfdp) shuan124@r34:/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes$ rm logs.zip
rm: cannot remove 'logs.zip': No such file or directory
(lxxfdp) shuan124@r34:/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes$ ls
LICENSE assets data env.yaml mkdocs.yml pyproject.toml setup.cfg src README.md conf docs logs_test.zip models scripts setup.py
(lxxfdp) shuan124@r34:/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes$ unzip logs_test.zip
Archive: logs_test.zip
inflating: logs_test/uvit_celeba/checkpoints/checkpoint_50
inflating: logs_test/uvit_celeba/checkpoints/checkpoint_37
inflating: logs_test/inr_celeba/checkpoints/checkpoint_9
inflating: logs_test/inr_mnist/checkpoints/checkpoint_27
inflating: logs_test/uvit_celeba/readme.docx
inflating: logs_test/inr_celeba/readme.docx
inflating: logs_test/inr_mnist/readme.docx
(lxxfdp) shuan124@r34:/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes$

ghost commented 4 months ago

So I need to replace every logs in .env with logs_test, right?

ghost commented 4 months ago
image image
ghost commented 4 months ago

And could you please tell me how you found out from my log that I skipped the sampling process?

ghost commented 4 months ago

i want to cry again. finally,

image

life is so hard

giulio98 commented 4 months ago

I found out because I saw from the logs that the time for sampling a batch was too low, which means you already had samples in that folder.

Happy to see you managed to get the samples.
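The timing check described above can be approximated by reading the timestamps of consecutive `sampling -- round` lines and taking their differences. The following is a minimal sketch that assumes the log format shown in this thread. The sample lines are copied from the reference run posted earlier, and the function name `round_durations` is illustrative, not part of the repository. What counts as "too low" depends on the hardware and the number of sampling steps, so no threshold is suggested here.

```python
import re
from datetime import datetime

# Matches the trainer's "sampling -- round: K" lines and captures the timestamp.
ROUND_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]"
    r"\[functional_diffusion_processes\.trainers\.trainer\]\[INFO\]"
    r" - sampling -- round: (\d+)"
)

# Three lines taken from the reference log earlier in the thread.
SAMPLE_LOG = """\
[2024-05-05 15:42:19,898][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 4
[2024-05-05 15:42:38,097][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 5
[2024-05-05 15:42:56,405][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 6
"""


def round_durations(log_text):
    """Map each round number to the seconds elapsed until the next round starts."""
    starts = []
    for match in ROUND_RE.finditer(log_text):
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        starts.append((int(match.group(2)), stamp))
    return {
        rnd: (nxt - cur).total_seconds()
        for (rnd, cur), (_, nxt) in zip(starts, starts[1:])
    }


if __name__ == "__main__":
    for rnd, seconds in round_durations(SAMPLE_LOG).items():
        print(f"round {rnd}: {seconds:.3f} s")
```

Because the pattern is not anchored to the start of a line, it also matches log lines that have a tqdm progress bar printed in front of them.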

ghost commented 4 months ago

N= 50

image

image

image

Thank you so much~ However, a new problem has come up. The first image is from running eval_mnist with N = 50, where the FID score is 1.022488e+02. When I set N = 3, the FID score is 7.572990e+01, lower than with N = 50, which would mean 3 steps are better than 50 steps. This is counterintuitive.

giulio98 commented 4 months ago

Please note that you will get a higher FID score because the checkpoint we provided doesn't have the y-corrupted at the input of the INR.

giulio98 commented 4 months ago

I get an FID score of 1.28 using our config.

giulio98 commented 4 months ago

Please rm your mnist_stats, since it could have been broken before the changes I made.

ghost commented 4 months ago

Emmm, I am a little confused about this: "Please note that you will get a higher FID score because the checkpoint we provided does not include y-corrupted data at the input of the INR."

Also, since I've set up a new environment and performed a new git clone, I did all operations from the beginning. Could this issue be because I did not run the training script 'sh scripts/maml/train_mnist.sh'?

ghost commented 4 months ago

I see, but this time I created everything from the beginning, both the environment and the git repo. Regarding "Please rm your mnist_stats since could be broken before the changes i made":

image

(lxxfdp) shuan124@r34:/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes$

ghost commented 4 months ago

I still think the problem is in how I calculate the FID, not in the sampling.

image
giulio98 commented 4 months ago

Please recalculate mnist_stats.npz and reload the dataset: rm -rf ./data

You should get FID 1.28

ghost commented 4 months ago

hi, it is still \cry

image

(lxxfdp) shuan124@r34:/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes$ sh scripts/maml/eval_mnist.sh
2024-05-08 12:21:25.670212: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[2024-05-08 12:21:31,422][HYDRA] Launching 1 jobs locally
[2024-05-08 12:21:31,422][HYDRA] #0 : +experiments_maml=eval_mnist
[2024-05-08 12:21:31,574][main][INFO] - Instantiating
[2024-05-08 12:21:32,447][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-08 12:21:32,448][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-08 12:21:32,462][absl][INFO] - Load dataset info from /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
[2024-05-08 12:21:32,464][main][INFO] - Instantiating
[2024-05-08 12:21:32,471][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-08 12:21:32,475][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-08 12:21:32,476][absl][INFO] - Load dataset info from /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
[2024-05-08 12:21:32,483][main][INFO] - Instantiating
[2024-05-08 12:21:32,594][main][INFO] - Instantiating
[2024-05-08 12:21:32,612][absl][INFO] - Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker:
[2024-05-08 12:21:32,729][absl][INFO] - Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". Available platform names are: Interpreter CUDA Host
[2024-05-08 12:21:32,730][absl][INFO] - Unable to initialize backend 'tpu': module 'jaxlib.xla_extension' has no attribute 'get_tpu_client'
[2024-05-08 12:21:32,736][absl][INFO] - Unable to initialize backend 'plugin': xla_extension has no attributes named get_plugin_device_client. Compile TensorFlow with //tensorflow/compiler/xla/python:enable_plugin_device set to true (defaults to false) to enable this.
[2024-05-08 12:21:33,412][main][INFO] - Instantiating
[2024-05-08 12:21:33,432][main][INFO] - Instantiating
[2024-05-08 12:21:33,436][main][INFO] - Instantiating
[2024-05-08 12:21:33,440][main][INFO] - Instantiating
[2024-05-08 12:21:33,450][main][INFO] - Instantiating
[2024-05-08 12:21:33,508][absl][WARNING] - GlobalAsyncCheckpointManager is not imported correctly. Checkpointing of GlobalDeviceArrays will not be available.To use the feature, install tensorstore.
WARNING:tensorflow:From /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

[2024-05-08 12:21:35,639][tensorflow][WARNING] - From /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

[2024-05-08 12:21:35,664][main][INFO] - Instantiating [2024-05-08 12:21:35,759][main][INFO] - Starting testing! wandb: Currently logged in as: zxcvzxcv980234. Use wandb login --relogin to force relogin wandb: wandb version 0.17.0 is available! To upgrade, please run: wandb: $ pip install wandb --upgrade wandb: Tracking run with wandb version 0.14.0 wandb: Run data is saved locally in /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/wandb/run-20240508_122135-lk25x14t wandb: Run wandb offline to turn off syncing. wandb: Syncing run inr_mnist wandb: ⭐️ View project at https://wandb.ai/zxcvzxcv980234/final wandb: 🚀 View run at https://wandb.ai/zxcvzxcv980234/final/runs/lk25x14t [2024-05-08 12:21:47,063][functional_diffusion_processes.trainers.trainer][INFO] - Total number of parameters: 0.12M [2024-05-08 12:21:47,314][functional_diffusion_processes.trainers.trainer][WARNING] - Resuming training from the latest checkpoint. [2024-05-08 12:21:47,315][absl][INFO] - Restoring checkpoint from /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/checkpoints/checkpoint_27 /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/jax/_src/lib/xla_bridge.py:544: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code. warnings.warn( [2024-05-08 12:21:47,399][absl][INFO] - Found no checkpoint files in /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist with prefix meta0 [2024-05-08 12:21:47,400][functional_diffusion_processes.trainers.trainer][INFO] - Starting sampling loop at step 0. 0%| | 0/32 [00:00<?, ?it/s][2024-05-08 12:21:47,407][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 0 /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/tensorflow/python/util/nest.py:917: UserWarning: tf.layers.flatten is deprecated and will be removed in a future version. 
Please use tf.keras.layers.Flatten instead. structure[0], [func(*x) for x in entries], /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/keras/legacy_tf_layers/base.py:627: UserWarning: layer.updates will be removed in a future version. This property should not be used in TensorFlow 2.0, as updates are applied automatically. self.updates, tf.compat.v1.GraphKeys.UPDATE_OPS [2024-05-08 12:23:06,841][absl][INFO] - Saving checkpoint at step: 0 [2024-05-08 12:23:06,869][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_0 3%|███ | 1/32 [01:19<41:03, 79.46s/it][2024-05-08 12:23:06,871][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 1 [2024-05-08 12:23:19,436][absl][INFO] - Saving checkpoint at step: 1 [2024-05-08 12:23:19,461][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_1 [2024-05-08 12:23:19,465][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_0 6%|██████ | 2/32 [01:32<20:04, 40.13s/it][2024-05-08 12:23:19,475][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 2 [2024-05-08 12:23:32,672][absl][INFO] - Saving checkpoint at step: 2 [2024-05-08 12:23:32,695][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_2 [2024-05-08 12:23:32,698][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_1 9%|█████████ | 3/32 [01:45<13:27, 27.85s/it][2024-05-08 12:23:32,707][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 3 [2024-05-08 12:23:44,812][absl][INFO] - Saving checkpoint at step: 3 [2024-05-08 12:23:44,842][absl][INFO] - Saved checkpoint at 
/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_3 [2024-05-08 12:23:44,844][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_2 12%|████████████ | 4/32 [01:57<10:06, 21.65s/it][2024-05-08 12:23:44,854][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 4 [2024-05-08 12:23:57,303][absl][INFO] - Saving checkpoint at step: 4 [2024-05-08 12:23:57,325][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_4 [2024-05-08 12:23:57,326][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_3 16%|███████████████ | 5/32 [02:09<08:15, 18.34s/it][2024-05-08 12:23:57,336][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 5 [2024-05-08 12:24:09,414][absl][INFO] - Saving checkpoint at step: 5 [2024-05-08 12:24:09,437][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_5 [2024-05-08 12:24:09,439][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_4 19%|██████████████████ | 6/32 [02:22<07:01, 16.23s/it][2024-05-08 12:24:09,449][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 6 [2024-05-08 12:24:22,034][absl][INFO] - Saving checkpoint at step: 6 [2024-05-08 12:24:22,057][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_6 [2024-05-08 12:24:22,058][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_5 22%|█████████████████████ | 7/32 [02:34<06:16, 15.05s/it][2024-05-08 
12:24:22,069][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 7 [2024-05-08 12:24:34,772][absl][INFO] - Saving checkpoint at step: 7 [2024-05-08 12:24:34,802][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_7 [2024-05-08 12:24:34,803][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_6 25%|████████████████████████ | 8/32 [02:47<05:43, 14.31s/it][2024-05-08 12:24:34,813][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 8 [2024-05-08 12:24:47,208][absl][INFO] - Saving checkpoint at step: 8 [2024-05-08 12:24:47,230][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_8 [2024-05-08 12:24:47,231][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_7 28%|███████████████████████████ | 9/32 [02:59<05:15, 13.73s/it][2024-05-08 12:24:47,244][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 9 [2024-05-08 12:24:59,672][absl][INFO] - Saving checkpoint at step: 9 [2024-05-08 12:24:59,699][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_9 [2024-05-08 12:24:59,700][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_8 31%|█████████████████████████████▋ | 10/32 [03:12<04:53, 13.34s/it][2024-05-08 12:24:59,712][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 10 [2024-05-08 12:25:11,951][absl][INFO] - Saving checkpoint at step: 10 [2024-05-08 12:25:11,976][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_10 [2024-05-08 
12:25:11,978][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_9 34%|████████████████████████████████▋ | 11/32 [03:24<04:33, 13.01s/it][2024-05-08 12:25:11,986][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 11 [2024-05-08 12:25:24,033][absl][INFO] - Saving checkpoint at step: 11 [2024-05-08 12:25:24,055][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_11 [2024-05-08 12:25:24,056][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_10 38%|███████████████████████████████████▋ | 12/32 [03:36<04:14, 12.73s/it][2024-05-08 12:25:24,068][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 12 [2024-05-08 12:25:36,608][absl][INFO] - Saving checkpoint at step: 12 [2024-05-08 12:25:36,641][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_12 [2024-05-08 12:25:36,643][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_11 41%|██████████████████████████████████████▌ | 13/32 [03:49<04:01, 12.69s/it][2024-05-08 12:25:36,653][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 13 [2024-05-08 12:25:48,740][absl][INFO] - Saving checkpoint at step: 13 [2024-05-08 12:25:48,763][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_13 [2024-05-08 12:25:48,764][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_12 44%|█████████████████████████████████████████▌ | 14/32 [04:01<03:45, 12.51s/it][2024-05-08 
12:25:48,774][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 14 [2024-05-08 12:26:00,879][absl][INFO] - Saving checkpoint at step: 14 [2024-05-08 12:26:00,908][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_14 [2024-05-08 12:26:00,909][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_13 47%|████████████████████████████████████████████▌ | 15/32 [04:13<03:30, 12.40s/it][2024-05-08 12:26:00,919][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 15 [2024-05-08 12:26:13,237][absl][INFO] - Saving checkpoint at step: 15 [2024-05-08 12:26:13,259][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_15 [2024-05-08 12:26:13,261][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_14 50%|███████████████████████████████████████████████▌ | 16/32 [04:25<03:18, 12.39s/it][2024-05-08 12:26:13,271][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 16 [2024-05-08 12:26:25,451][absl][INFO] - Saving checkpoint at step: 16 [2024-05-08 12:26:25,472][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_16 [2024-05-08 12:26:25,474][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_15 53%|██████████████████████████████████████████████████▍ | 17/32 [04:38<03:05, 12.33s/it][2024-05-08 12:26:25,483][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 17 [2024-05-08 12:26:37,780][absl][INFO] - Saving checkpoint at step: 17 [2024-05-08 12:26:37,806][absl][INFO] - Saved checkpoint at 
/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_17 [2024-05-08 12:26:37,808][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_16 56%|█████████████████████████████████████████████████████▍ | 18/32 [04:50<02:52, 12.34s/it][2024-05-08 12:26:37,818][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 18 [2024-05-08 12:26:50,356][absl][INFO] - Saving checkpoint at step: 18 [2024-05-08 12:26:50,377][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_18 [2024-05-08 12:26:50,379][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_17 59%|████████████████████████████████████████████████████████▍ | 19/32 [05:02<02:41, 12.41s/it][2024-05-08 12:26:50,389][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 19 [2024-05-08 12:27:02,621][absl][INFO] - Saving checkpoint at step: 19 [2024-05-08 12:27:02,643][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_19 [2024-05-08 12:27:02,645][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_18 62%|███████████████████████████████████████████████████████████▍ | 20/32 [05:15<02:28, 12.36s/it][2024-05-08 12:27:02,654][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 20 [2024-05-08 12:27:14,775][absl][INFO] - Saving checkpoint at step: 20 [2024-05-08 12:27:14,809][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_20 [2024-05-08 12:27:14,816][absl][INFO] - Removing checkpoint at 
/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_19 66%|██████████████████████████████████████████████████████████████▎ | 21/32 [05:27<02:15, 12.30s/it][2024-05-08 12:27:14,820][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 21 [2024-05-08 12:27:26,709][absl][INFO] - Saving checkpoint at step: 21 [2024-05-08 12:27:26,738][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_21 [2024-05-08 12:27:26,740][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_20 69%|█████████████████████████████████████████████████████████████████▎ | 22/32 [05:39<02:01, 12.19s/it][2024-05-08 12:27:26,748][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 22 [2024-05-08 12:27:38,806][absl][INFO] - Saving checkpoint at step: 22 [2024-05-08 12:27:38,829][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_22 [2024-05-08 12:27:38,831][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_21 72%|████████████████████████████████████████████████████████████████████▎ | 23/32 [05:51<01:49, 12.16s/it][2024-05-08 12:27:38,840][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 23 [2024-05-08 12:27:51,517][absl][INFO] - Saving checkpoint at step: 23 [2024-05-08 12:27:51,539][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_23 [2024-05-08 12:27:51,540][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_22 75%|███████████████████████████████████████████████████████████████████████▎ | 24/32 [06:04<01:38, 
12.33s/it][2024-05-08 12:27:51,552][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 24 [2024-05-08 12:28:03,336][absl][INFO] - Saving checkpoint at step: 24 [2024-05-08 12:28:03,382][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_24 [2024-05-08 12:28:03,384][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_23 78%|██████████████████████████████████████████████████████████████████████████▏ | 25/32 [06:15<01:25, 12.18s/it][2024-05-08 12:28:03,394][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 25 [2024-05-08 12:28:15,250][absl][INFO] - Saving checkpoint at step: 25 [2024-05-08 12:28:15,272][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_25 [2024-05-08 12:28:15,274][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_24 81%|█████████████████████████████████████████████████████████████████████████████▏ | 26/32 [06:27<01:12, 12.09s/it][2024-05-08 12:28:15,283][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 26 [2024-05-08 12:28:27,419][absl][INFO] - Saving checkpoint at step: 26 [2024-05-08 12:28:27,440][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_26 [2024-05-08 12:28:27,442][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_25 84%|████████████████████████████████████████████████████████████████████████████████▏ | 27/32 [06:40<01:00, 12.12s/it][2024-05-08 12:28:27,451][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 27 [2024-05-08 12:28:39,309][absl][INFO] - Saving 
checkpoint at step: 27 [2024-05-08 12:28:39,340][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_27 [2024-05-08 12:28:39,342][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_26 88%|███████████████████████████████████████████████████████████████████████████████████▏ | 28/32 [06:51<00:48, 12.05s/it][2024-05-08 12:28:39,351][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 28 [2024-05-08 12:28:51,955][absl][INFO] - Saving checkpoint at step: 28 [2024-05-08 12:28:51,977][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_28 [2024-05-08 12:28:51,980][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_27 91%|██████████████████████████████████████████████████████████████████████████████████████ | 29/32 [07:04<00:36, 12.23s/it][2024-05-08 12:28:51,989][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 29 [2024-05-08 12:29:04,399][absl][INFO] - Saving checkpoint at step: 29 [2024-05-08 12:29:04,421][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_29 [2024-05-08 12:29:04,423][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_28 94%|█████████████████████████████████████████████████████████████████████████████████████████ | 30/32 [07:17<00:24, 12.30s/it][2024-05-08 12:29:04,458][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 30 [2024-05-08 12:29:16,601][absl][INFO] - Saving checkpoint at step: 30 [2024-05-08 12:29:16,623][absl][INFO] - Saved checkpoint at 
/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_30 [2024-05-08 12:29:16,625][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_29 97%|████████████████████████████████████████████████████████████████████████████████████████████ | 31/32 [07:29<00:12, 12.26s/it][2024-05-08 12:29:16,636][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 31 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [07:41<00:00, 12.33s/it][2024-05-08 12:29:29,178][functional_diffusion_processes.trainers.trainer][INFO] - FID: 7.572990e+01 [2024-05-08 12:29:29,179][functional_diffusion_processes.trainers.trainer][INFO] - Inception score -1.000000e+00 wandb: Waiting for W&B process to finish... (success). wandb: \ 39.926 MB of 39.966 MB uploaded (0.000 MB deduped) wandb: Run history: wandb: FID ▁ wandb: inception score ▁ wandb: wandb: Run summary: wandb: FID 75.7299 wandb: inception score -1.0 wandb: wandb: 🚀 View run inr_mnist at: https://wandb.ai/zxcvzxcv980234/final/runs/lk25x14t wandb: Synced 6 W&B file(s), 32 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20240508_122135-lk25x14t/logs 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [07:45<00:00, 14.56s/it] (lxxfdp) shuan124@r34:/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes$

giulio98 commented 4 months ago

Your logs don't show the step that computes mnist_stats.npz, which means the run is reusing a precomputed file. Did you remove it?
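One way to check which stats file is being reused is to open it and look at what it contains. The sketch below is only an illustration: the helper name summarize_stats is hypothetical, the pool_3 key is assumed to be the one the repository uses when it saves the real-dataset features, and the demo file with its 8 x 16 array is synthetic.

```python
import os
import tempfile

import numpy as np


def summarize_stats(path: str) -> dict:
    """Return basic facts about the pooled features stored in an .npz stats file."""
    with np.load(path) as stats:
        pools = stats["pool_3"]  # assumed key for the real-dataset features
        return {
            "shape": pools.shape,
            "all_finite": bool(np.isfinite(pools).all()),
            "modified": os.path.getmtime(path),  # last-modified time, seconds since epoch
        }


# Demo on a synthetic file; point demo_path at the real stats file instead.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "mnist_stats.npz")
    np.savez_compressed(demo_path, pool_3=np.zeros((8, 16), dtype=np.float32))
    summary = summarize_stats(demo_path)

print(summary["shape"], summary["all_finite"])
```

A modification time older than the current run suggests the file was loaded from disk rather than recomputed.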

giulio98 commented 4 months ago

Please run this:

rm -rf ./data

It will delete the MNIST dataset and the stats. Then rerun the script.

ghost commented 4 months ago

Yes, I removed all the .npz files before running a new FID evaluation. I think the problem may be in this part of the code, since I remember that last time you fixed something about jax .... Could that affect it here?

class FIDMetric:
    """Class for computing the Frechet Inception Distance (FID) metric.

    This class facilitates the computation of the FID metric, which measures the similarity between two distributions of images.
    It precomputes features for the real dataset using a specified Inception feature extractor and provides methods to compute
    and store features for generated images, and to compute the FID and Inception Score (IS).

    Attributes:
        metric_config (DictConfig): Configuration parameters for the FID metric.
        feature_extractor (InceptionFeatureExtractor): Inception feature extractor for computing the FID metric.
        dataset (BaseDataset): Dataset object providing real samples for FID computation.
        generated_pools (list): List to store features of generated images.
        generated_logits (list): List to store logits of generated images.
        real_features (dict): Dictionary to store precomputed features of real dataset.
    """

    def __init__(
        self,
        metric_config: DictConfig,
        feature_extractor: InceptionFeatureExtractor,
        dataset: BaseDataset,
    ) -> None:
        """Initializes the FIDMetric class with specified configurations, feature extractor, and dataset.

        Args:
            metric_config (DictConfig): Configuration parameters for the FID metric.
            feature_extractor (InceptionFeatureExtractor): Inception feature extractor for computing the FID metric.
            dataset (BaseDataset): Dataset object providing real samples for FID computation.
        """
        self.metric_config = metric_config
        self.feature_extractor = feature_extractor
        self.dataset = dataset
        self.generated_pools = []
        self.generated_logits = []
        try:
            self.real_features = load_dataset_stats(
                save_path=metric_config.real_features_path,
                dataset_name=metric_config.dataset_name,
            )
        except FileNotFoundError:
            self._precompute_features(
                dataset_name=metric_config.dataset_name,
                save_path=metric_config.real_features_path,
            )
            self.real_features = load_dataset_stats(
                save_path=metric_config.real_features_path,
                dataset_name=metric_config.dataset_name,
            )

    def _precompute_features(self, dataset_name: str, save_path: str) -> None:
        """Precomputes and saves features for the real dataset.

        Args:
            dataset_name (str): Name of the dataset.
            save_path (str): Path where the computed features will be saved.
        """
        tf.io.gfile.makedirs(path=save_path)

        tf.io.gfile.makedirs(os.path.join(save_path, f"{dataset_name.lower()}_clean"))

        # Use the feature extractor to compute features for the real dataset
        all_pools = self.feature_extractor.extract_features(
            dataset=self.dataset, save_path=save_path, dataset_name=dataset_name
        )

        # Save latent represents of the Inception network to disk or Google Cloud Storage
        filename = f"{dataset_name.lower()}_stats.npz"

        if jax.host_id() == 0:
            pylogger.info("Saving real dataset stats to: %s" % os.path.join(save_path, filename))

        with tf.io.gfile.GFile(os.path.join(save_path, filename), "wb") as f_out:
            io_buffer = io.BytesIO()
            np.savez_compressed(io_buffer, pool_3=all_pools)
            f_out.write(io_buffer.getvalue())

    def compute_fid(self, eval_dir, num_sampling_round) -> Tuple[float, float]:
        """Computes the FID and Inception Score (IS) for the generated and real images.

        Args:
            eval_dir (str): Directory path for evaluation.
            num_sampling_round (int): Number of sampling rounds.

        Returns:
            Tuple[float, float]: A tuple containing the FID and Inception Score.
        """
        real_pools = self.real_features["pool_3"]
        if not self.feature_extractor.inception_v3 and not self.feature_extractor.inception_v3 == "lenet":
            if len(self.generated_logits) == 0 or len(self.generated_pools) == 0:
                if jax.host_id() == 0:
                    # Load all statistics that have been previously computed and saved for each host
                    for host in range(jax.host_count()):
                        stats = tf.io.gfile.glob(os.path.join(eval_dir, "statistics_*.npz"))
                        wait_message = False
                        while len(stats) < num_sampling_round:
                            if not wait_message:
                                print("Waiting for statistics on host %d" % (host,))
                                wait_message = True
                            stats = tf.io.gfile.glob(os.path.join(eval_dir, "statistics_*.npz"))
                            time.sleep(10)

                        for stat_file in stats:
                            with tf.io.gfile.GFile(stat_file, "rb") as fin:
                                stat = np.load(fin)

                                self.generated_pools.append(stat["pool_3"])
                                self.generated_logits.append(stat["logits"])

            all_logits = np.concatenate(self.generated_logits, axis=0)[: self.metric_config.num_samples]
            inception_score = tfgan.eval.classifier_score_from_logits(logits=all_logits)
        else:
            inception_score = -1

        all_pools = np.concatenate(self.generated_pools, axis=0)[: self.metric_config.num_samples]

        fid = tfgan.eval.frechet_classifier_distance_from_activations(activations1=real_pools, activations2=all_pools)

        return fid, inception_score
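
For reference, the FID computed in the last step is the Fréchet distance between two Gaussians fitted to the real and generated activations: the squared distance between the means plus Tr(C_r + C_g - 2 (C_r C_g)^(1/2)). The following is a minimal NumPy sketch of that formula, not the TF-GAN implementation; the function name frechet_distance and the synthetic activations are assumptions made for illustration.

```python
import numpy as np


def frechet_distance(real_acts: np.ndarray, gen_acts: np.ndarray) -> float:
    """Frechet distance between Gaussians fitted to two (n_samples, n_features) arrays."""
    mu_r, mu_g = real_acts.mean(axis=0), gen_acts.mean(axis=0)
    cov_r = np.cov(real_acts, rowvar=False)
    cov_g = np.cov(gen_acts, rowvar=False)

    # Tr(sqrt(cov_r @ cov_g)) equals the sum of the square roots of the
    # eigenvalues of the product; clip tiny negatives caused by round-off.
    eigvals = np.linalg.eigvals(cov_r @ cov_g).real
    trace_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()

    mean_term = float(np.sum((mu_r - mu_g) ** 2))
    cov_term = float(np.trace(cov_r) + np.trace(cov_g) - 2.0 * trace_sqrt)
    return mean_term + cov_term


# Synthetic activations standing in for the real and generated pool_3 features.
rng = np.random.default_rng(0)
real = rng.normal(size=(200, 4))
generated = real + np.array([1.0, 0.0, 0.0, 0.0])  # same spread, shifted mean

print(frechet_distance(real, real), frechet_distance(real, generated))
```

Because the distance depends only on the means and covariances of the two sets, comparing generated samples against stale or mismatched real-dataset statistics inflates the score even when the samples themselves look fine.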

ghost commented 4 months ago

image image

ghost commented 4 months ago

I will try what you suggested ("Please run this rm -rf ./data It will delete the mnist dataset and the stats then rerun the script"). Thank you!

ghost commented 4 months ago

I think this time it may be on the right track! Thank you!