google-research / xmcgan_image_generation


Library version #24

Open lcxsnow opened 1 year ago

lcxsnow commented 1 year ago

When I tried to install tensorflow==2.5.0rc0, I got the message below:

ERROR: Could not find a version that satisfies the requirement tensorflow==2.5.0rc0 (from versions: 2.2.0, 2.2.1, 2.2.2, 2.2.3, 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4, 2.5.0, 2.5.1, 2.5.2, 2.5.3, 2.6.0rc0, 2.6.0rc1, 2.6.0rc2, 2.6.0, 2.6.1, 2.6.2, 2.6.3, 2.6.4, 2.6.5, 2.7.0rc0, 2.7.0rc1, 2.7.0, 2.7.1, 2.7.2, 2.7.3, 2.7.4, 2.8.0rc0, 2.8.0rc1, 2.8.0, 2.8.1, 2.8.2, 2.8.3, 2.8.4, 2.9.0rc0, 2.9.0rc1, 2.9.0rc2, 2.9.0, 2.9.1, 2.9.2, 2.9.3, 2.10.0rc0, 2.10.0rc1, 2.10.0rc2, 2.10.0rc3, 2.10.0, 2.10.1, 2.11.0rc0, 2.11.0rc1, 2.11.0rc2, 2.11.0)
ERROR: No matching distribution found for tensorflow==2.5.0rc0

What should I do?

lcxsnow commented 1 year ago

I installed tensorflow 2.5.0 instead of 2.5.0rc0. There are some errors when I run "bash train.sh exp_name &> train.txt". Which Python version and typing-extensions version should I use?

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
optax 0.1.4 requires typing-extensions>=3.10.0, but you have typing-extensions 3.7.4.3 which is incompatible.
chex 0.1.6 requires typing-extensions>=4.2.0; python_version < "3.11", but you have typing-extensions 3.7.4.3 which is incompatible.

Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/anaconda3/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/xmcgan/main.py", line 71, in <module>
    app.run(main)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/xmcgan/main.py", line 63, in main
    train_utils.train(FLAGS.config, FLAGS.workdir)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/xmcgan/train_utils.py", line 383, in train
    state = flax_utils.replicate(state)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/flax/jax_utils.py", line 75, in replicate
    return jax.tree_map(lambda x: _replicate(x, devices), tree)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/jax/_src/tree_util.py", line 181, in tree_map
    return treedef.unflatten(f(*xs) for xs in zip(*all_leaves))
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/jax/_src/tree_util.py", line 181, in <genexpr>
    return treedef.unflatten(f(*xs) for xs in zip(*all_leaves))
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/flax/jax_utils.py", line 75, in <lambda>
    return jax.tree_map(lambda x: _replicate(x, devices), tree)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/flax/jax_utils.py", line 57, in _replicate
    if hasattr(jax.api, "device_put_sharded"):  # jax >= 0.2.0
AttributeError: module 'jax' has no attribute 'api'

I tried jax versions 0.4.3, 0.3.4, and 0.3.5. None of them worked.
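Since several packages are involved here, it may help to print what is actually installed in the virtual environment before changing versions again. Below is a minimal sketch using only the standard library. The package list is taken from the messages above, and the helper name `installed_versions` is made up for illustration.

```python
from importlib import metadata

# Packages mentioned in the error messages in this thread.
PACKAGES = ["tensorflow", "jax", "jaxlib", "flax", "optax", "chex", "typing-extensions"]


def installed_versions(names):
    """Return {package name: version string, or None if it is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Run it with the same interpreter that runs train.sh, so that it reports the versions inside the venv rather than the system-wide ones.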

woctezuma commented 1 year ago

ERROR: No matching distribution found for tensorflow==2.5.0rc0

Indeed, there is no version 2.5.0rc0 at https://pypi.org/project/tensorflow/#history.
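The version list in pip's own message already shows which releases can be installed. As a rough sketch (the helper names `available_versions` and `final_releases` are hypothetical, and the message is a shortened copy of the one above), the list can be filtered for final 2.5.x releases like this:

```python
import re


def available_versions(pip_error):
    """Extract the version list from pip's '(from versions: ...)' message."""
    match = re.search(r"\(from versions: ([^)]*)\)", pip_error)
    if match is None:
        return []
    return [v.strip() for v in match.group(1).split(",") if v.strip()]


def final_releases(versions, prefix):
    """Keep versions that start with `prefix` and are not release candidates."""
    return [v for v in versions if v.startswith(prefix) and "rc" not in v]


# Shortened copy of the message from this issue.
message = (
    "ERROR: Could not find a version that satisfies the requirement "
    "tensorflow==2.5.0rc0 (from versions: 2.4.4, 2.5.0, 2.5.1, 2.5.2, 2.5.3, "
    "2.6.0rc0, 2.6.0)"
)
print(final_releases(available_versions(message), "2.5."))
# prints ['2.5.0', '2.5.1', '2.5.2', '2.5.3']
```

Whether one of these final releases behaves the same as the 2.5.0rc0 pin in the repository's requirements is not something this check can confirm.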

AttributeError: module 'jax' has no attribute 'api'

Maybe try jaxlib:

lcxsnow commented 1 year ago

woctezuma

Thanks for your reply. I fixed the problem by installing the CUDA version of jax together with tensorflow 2.5.0. You need to confirm that "jax.devices()" lists the GPU devices.
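A small sketch of that check, assuming jax is importable in the active environment (the wrapper name `list_jax_devices` is made up here; `jax.devices()` is the call mentioned above):

```python
def list_jax_devices():
    """Return JAX's visible devices as strings, or None if jax is not installed."""
    try:
        import jax
    except ImportError:
        return None
    return [str(device) for device in jax.devices()]


if __name__ == "__main__":
    devices = list_jax_devices()
    if devices is None:
        print("jax is not installed in this environment")
    else:
        # With a working CUDA build this should list GPU devices;
        # a CPU-only install lists CPU devices instead.
        print(devices)
```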

But now I get the new error below.

I0214 00:19:29.416610 140332804085568 main.py:52] JAX host: 0 / 1
I0214 00:19:29.416661 140332804085568 main.py:53] JAX devices: [StreamExecutorGpuDevice(id=0, process_index=0, slice_index=0), StreamExecutorGpuDevice(id=1, process_index=0, slice_index=0)]
I0214 00:19:29.416717 140332804085568 local.py:45] Setting task status: host_id: 0, host_count: 1
I0214 00:19:29.416844 140332804085568 local.py:50] Created artifact workdir of type ArtifactType.DIRECTORY and value data/exp/exp_name.
2023-02-14 00:19:29.882610: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:417] Loaded runtime CuDNN library: 8.2.1 but source was compiled with: 8.6.0. CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2023-02-14 00:19:29.883696: E external/org_tensorflow/tensorflow/compiler/xla/status_macros.cc:57] INTERNAL: RET_CHECK failure (external/org_tensorflow/tensorflow/compiler/xla/service/gpu/gpu_compiler.cc:626) dnn != nullptr
Begin stack trace

Here is my nvidia-smi display:
NVIDIA-SMI 470.161.03  Driver Version: 470.161.03  CUDA Version: 11.4

and nvcc -V:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Thu_Jun_11_22:26:38_PDT_2020
Cuda compilation tools, release 11.0, V11.0.194
Build cuda_11.0_bu.TC445_37.28540450_0

woctezuma commented 1 year ago

Loaded runtime CuDNN library: 8.2.1 but source was compiled with: 8.6.0. CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.

It seems to be a version mismatch.
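The rule quoted in the error message (matching major version, equal or higher minor version) can be written out as a short check. This is only an illustration of that sentence, not code from JAX or XLA, and the function names are made up:

```python
def parse_version(text):
    """Turn a dotted version string such as '8.2.1' into (8, 2, 1)."""
    return tuple(int(part) for part in text.split("."))


def cudnn_runtime_ok(runtime, compiled):
    """Same major version, and runtime minor >= the minor it was compiled with."""
    run_major, run_minor = parse_version(runtime)[:2]
    comp_major, comp_minor = parse_version(compiled)[:2]
    return run_major == comp_major and run_minor >= comp_minor


print(cudnn_runtime_ok("8.2.1", "8.6.0"))  # False: the case from the log
print(cudnn_runtime_ok("8.6.0", "8.6.0"))  # True
```

With the versions from the log, the loaded cuDNN 8.2.1 has a lower minor version than the 8.6.0 the binary was compiled with, so the check fails.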

lcxsnow commented 1 year ago

Loaded runtime CuDNN library: 8.2.1 but source was compiled with: 8.6.0. CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.

It seems to be a version mismatch.

I upgraded CUDA to 11.4 and cuDNN to 8.6, and that fixed it. But now a new error has come up.

/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/flax/optim/base.py:49: DeprecationWarning: Use optax instead of flax.optim. Refer to the update guide https://flax.readthedocs.io/en/latest/howtos/optax_update_guide.html for detailed instructions.
  warnings.warn(
/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/flax/optim/base.py:49: DeprecationWarning: Use optax instead of flax.optim. Refer to the update guide https://flax.readthedocs.io/en/latest/howtos/optax_update_guide.html for detailed instructions.
  warnings.warn(
I0214 10:54:33.493715 139689696290624 utils.py:31] Checkpoint.restore_or_initialize() ...
I0214 10:54:33.493807 139689696290624 checkpoint.py:301] No checkpoint specified. Restore the latest checkpoint.
I0214 10:54:33.493842 139689696290624 utils.py:31] MultihostCheckpoint.get_latest_checkpoint_to_restore_from() ...
I0214 10:54:33.494217 139689696290624 checkpoint.py:430] Checked checkpoint base_directories: ['data/exp/checkpoints-0'] - common_numbers=set() - exclusive_numbers=set()
I0214 10:54:33.494263 139689696290624 utils.py:41] MultihostCheckpoint.get_latest_checkpoint_to_restore_from() finished after 0.00s.
I0214 10:54:33.494293 139689696290624 checkpoint.py:304] Checkpoint None does not exist.
I0214 10:54:33.494324 139689696290624 utils.py:31] Checkpoint.save() ...
E0214 10:54:33.497232 139689696290624 utils.py:38] Checkpoint.save() FAILED after 0.00s with TypeError.
Traceback (most recent call last):
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/clu/internal/utils.py", line 33, in log_activity
    yield
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/clu/internal/utils.py", line 51, in decorator
    return wrapped(*args, **kwargs)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/clu/checkpoint.py", line 265, in save
    f.write(flax.serialization.to_bytes(state))
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/flax/serialization.py", line 383, in to_bytes
    return msgpack_serialize(state_dict, in_place=True)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/flax/serialization.py", line 334, in msgpack_serialize
    return msgpack.packb(pytree, default=_msgpack_ext_pack, strict_types=True)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/msgpack/__init__.py", line 35, in packb
    return Packer(**kwargs).pack(o)
  File "msgpack/_packer.pyx", line 292, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 298, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 295, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  [Previous line repeated 1 more time]
  File "msgpack/_packer.pyx", line 289, in msgpack._cmsgpack.Packer._pack
TypeError: can not serialize 'Array' object
E0214 10:54:33.497600 139689696290624 utils.py:38] Checkpoint.restore_or_initialize() FAILED after 0.00s with TypeError.
Traceback (most recent call last):
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/clu/internal/utils.py", line 33, in log_activity
    yield
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/clu/internal/utils.py", line 51, in decorator
    return wrapped(*args, **kwargs)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/clu/checkpoint.py", line 305, in restore_or_initialize
    self.save(state)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/clu/internal/utils.py", line 51, in decorator
    return wrapped(*args, **kwargs)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/clu/checkpoint.py", line 265, in save
    f.write(flax.serialization.to_bytes(state))
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/flax/serialization.py", line 383, in to_bytes
    return msgpack_serialize(state_dict, in_place=True)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/flax/serialization.py", line 334, in msgpack_serialize
    return msgpack.packb(pytree, default=_msgpack_ext_pack, strict_types=True)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/msgpack/__init__.py", line 35, in packb
    return Packer(**kwargs).pack(o)
  File "msgpack/_packer.pyx", line 292, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 298, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 295, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  [Previous line repeated 1 more time]
  File "msgpack/_packer.pyx", line 289, in msgpack._cmsgpack.Packer._pack
TypeError: can not serialize 'Array' object
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/anaconda3/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/xmcgan/main.py", line 71, in <module>
    app.run(main)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/xmcgan/main.py", line 63, in main
    train_utils.train(FLAGS.config, FLAGS.workdir)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/xmcgan/train_utils.py", line 375, in train
    state = ckpt.restore_or_initialize(state)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/clu/internal/utils.py", line 51, in decorator
    return wrapped(*args, **kwargs)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/clu/checkpoint.py", line 305, in restore_or_initialize
    self.save(state)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/clu/internal/utils.py", line 51, in decorator
    return wrapped(*args, **kwargs)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/clu/checkpoint.py", line 265, in save
    f.write(flax.serialization.to_bytes(state))
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/flax/serialization.py", line 383, in to_bytes
    return msgpack_serialize(state_dict, in_place=True)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/flax/serialization.py", line 334, in msgpack_serialize
    return msgpack.packb(pytree, default=_msgpack_ext_pack, strict_types=True)
  File "/home/m11113013/ProjectCode/practice model/xmcgan_image_generation-main/venv/lib/python3.8/site-packages/msgpack/__init__.py", line 35, in packb
    return Packer(**kwargs).pack(o)
  File "msgpack/_packer.pyx", line 292, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 298, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 295, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  [Previous line repeated 1 more time]
  File "msgpack/_packer.pyx", line 289, in msgpack._cmsgpack.Packer._pack
TypeError: can not serialize 'Array' object

I tried updating flax from 0.5.1 to 0.6.1, but then I got this message: AttributeError: module 'flax' has no attribute 'optim'

The way to fix that is to downgrade flax back to 0.5.1.

This leaves me confused, since 0.5.1 is the version that produces the serialization error above.