tsc2017 / Frechet-Inception-Distance

CPU/GPU/TPU implementation of the Fréchet Inception Distance

Error in running fid.py #7

Open wcy-cs opened 4 years ago

wcy-cs commented 4 years ago

I ran fid.py, but the following error occurs:

ssh://wangchy@192.168.113.66:22/home/wangchy/anaconda3/bin/python3 -u /home/wangchy/.pycharm_helpers/pydev/pydevd.py --multiproc --qt-support=auto --client 0.0.0.0 --port 38755 --file /home/wangchy/wcy/FAGFSR/metrics/fid.py
pydev debugger: process 65872 is connecting

Connected to pydev debugger (build 193.5662.61)
pydev debugger: warning: trying to add breakpoint to file that does not exist: /home/wangchy/wcy/FAGFSR/model/fishsrnet.py (will have no effect)
2020-10-20 20:23:46.970816: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-10-20 20:23:46.970864: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
WARNING:tensorflow:From /home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:tensorflow:From /home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

2020-10-20 20:23:55.176133: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2394330000 Hz
2020-10-20 20:23:55.180586: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55dcd0e64260 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-10-20 20:23:55.180654: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-10-20 20:23:55.184841: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-10-20 20:23:55.910091: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-10-20 20:23:56.005441: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-10-20 20:23:56.007801: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55dcd0f38cd0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-10-20 20:23:56.007864: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 1080 Ti, Compute Capability 6.1
2020-10-20 20:23:56.007880: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (1): GeForce GTX 1080 Ti, Compute Capability 6.1
2020-10-20 20:23:56.007899: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (2): GeForce GTX 1080 Ti, Compute Capability 6.1
2020-10-20 20:23:56.007927: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (3): GeForce GTX 1080 Ti, Compute Capability 6.1
2020-10-20 20:23:56.014048: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:02:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.62GHz coreCount: 28 deviceMemorySize: 10.91GiB deviceMemoryBandwidth: 451.17GiB/s
2020-10-20 20:23:56.016255: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 1 with properties: 
pciBusID: 0000:03:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.62GHz coreCount: 28 deviceMemorySize: 10.92GiB deviceMemoryBandwidth: 451.17GiB/s
2020-10-20 20:23:56.016421: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-10-20 20:23:56.018552: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 2 with properties: 
pciBusID: 0000:83:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.62GHz coreCount: 28 deviceMemorySize: 10.92GiB deviceMemoryBandwidth: 451.17GiB/s
2020-10-20 20:23:56.018714: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-10-20 20:23:56.020875: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 3 with properties: 
pciBusID: 0000:84:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.62GHz coreCount: 28 deviceMemorySize: 10.92GiB deviceMemoryBandwidth: 451.17GiB/s
2020-10-20 20:23:56.021192: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-10-20 20:23:56.021402: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory
2020-10-20 20:23:56.021620: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: cannot open shared object file: No such file or directory
2020-10-20 20:23:56.021832: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcurand.so.10'; dlerror: libcurand.so.10: cannot open shared object file: No such file or directory
2020-10-20 20:23:56.022043: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory
2020-10-20 20:23:56.022256: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcusparse.so.10'; dlerror: libcusparse.so.10: cannot open shared object file: No such file or directory
2020-10-20 20:23:56.028004: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-10-20 20:23:56.028032: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-10-20 20:23:56.028223: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-10-20 20:23:56.028244: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]      0 1 2 3 
2020-10-20 20:23:56.028255: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0:   N Y N N 
2020-10-20 20:23:56.028262: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 1:   Y N N N 
2020-10-20 20:23:56.028277: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 2:   N N N Y 
2020-10-20 20:23:56.028295: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 3:   N N Y N 
############################
over
1
2
Traceback (most recent call last):
  File "/home/wangchy/anaconda3/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 4200, in name_scope
    yield "" if new_stack is None else new_stack + "/"
  File "/home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow/python/ops/map_fn.py", line 499, in map_fn
    maximum_iterations=n)
  File "/home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2774, in while_loop
    return_same_structure)
  File "/home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2256, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "/home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2181, in _BuildLoop
    body_result = body(*packed_vars_for_body)
  File "/home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2726, in <lambda>
    body = lambda i, lv: (i + 1, orig_body(*lv))
  File "/home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow/python/ops/map_fn.py", line 483, in compute
    result_value = autographed_fn(elems_value)
  File "/home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow/python/autograph/impl/api.py", line 258, in wrapper
    raise e.ag_error_metadata.to_exception(e)
tensorflow.python.autograph.impl.api.StagingError: in user code:

    /home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow_gan/python/eval/inception_metrics.py:94 _classifier_fn  *
        output = tfhub.load(tfhub_module)(images)
    /home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow_hub/module_v2.py:101 load  *
        module_path = resolve(handle)
    /home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow_hub/module_v2.py:53 resolve  *
        return registry.resolver(handle)
    /home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow_hub/registry.py:44 __call__  *
        return impl(*args, **kwargs)
    /home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow_hub/compressed_module_resolver.py:83 download  *
        response = self._call_urlopen(request)
    /home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow_hub/resolver.py:418 atomic_download  *
        download_fn(handle, tmp_dir)
    /home/wangchy/anaconda3/lib/python3.7/site-packages/tensorflow_hub/compressed_module_resolver.py:96 _call_urlopen  *
        return url.urlopen(request)
    /home/wangchy/anaconda3/lib/python3.7/urllib/request.py:222 urlopen  **
        return opener.open(url, data, timeout)
    /home/wangchy/anaconda3/lib/python3.7/urllib/request.py:525 open
        response = self._open(req, data)
    /home/wangchy/anaconda3/lib/python3.7/urllib/request.py:543 _open
        '_open', req)
    /home/wangchy/anaconda3/lib/python3.7/urllib/request.py:503 _call_chain
        result = func(*args)
    /home/wangchy/anaconda3/lib/python3.7/urllib/request.py:1360 https_open
        context=self._context, check_hostname=self._check_hostname)
    /home/wangchy/anaconda3/lib/python3.7/urllib/request.py:1319 do_open
        raise URLError(err)

    URLError: <urlopen error [Errno 110] Connection timed out>

Process finished with exit code 1

What can I do to address this problem? Thank you very much.
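
For reference, the traceback above ends in `URLError: <urlopen error [Errno 110] Connection timed out>`, raised while `tensorflow_hub` was calling `urlopen` to download a module. A minimal sketch of a connectivity check that can be run on the same machine follows. The function name `can_reach` is hypothetical, and the host to test is an assumption, since the log does not show the exact module handle.

```python
import urllib.error
import urllib.request


def can_reach(url: str, timeout: float = 10.0) -> bool:
    """Return True if a request to `url` gets any response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status,
        # so the host itself is reachable.
        return True
    except OSError:
        # Covers URLError (DNS failure, refused connection, missing file)
        # as well as socket timeouts such as the Errno 110 seen in the log.
        return False
```

Calling, for example, `can_reach("https://tfhub.dev")` on the remote machine (assuming that is the host the module is served from) would indicate whether the download can succeed there. If it returns `False`, the machine probably has no outbound access to that host; in that case the `TFHUB_CACHE_DIR` environment variable described in the `tensorflow_hub` caching documentation may allow using a module copy fetched on another machine.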

dipjyoti92 commented 3 years ago

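@Wcy1169589564 It seems that TensorFlow cannot access your GPUs: the log says "Skipping registering GPU devices..." because several CUDA libraries (libcudart.so.10.1, libcublas.so.10, and others) could not be loaded. Make sure those libraries are installed and that TensorFlow can register the GPUs.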
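
One way to confirm which of those libraries the dynamic loader cannot find is sketched below. The library names are copied from the "Could not load dynamic library" warnings in the log; the helper name `missing_libraries` is hypothetical, and the check assumes a Linux system where `ctypes.CDLL` raises `OSError` when a library cannot be opened.

```python
import ctypes

# Library names taken from the "Could not load dynamic library"
# warnings in the log above.
CUDA_LIBS = [
    "libcudart.so.10.1",
    "libcublas.so.10",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.10",
    "libcusparse.so.10",
]


def missing_libraries(names):
    """Return the subset of `names` that the dynamic loader cannot open."""
    missing = []
    for name in names:
        try:
            ctypes.CDLL(name)  # attempts to load the shared library by name
        except OSError:
            missing.append(name)
    return missing
```

Running `missing_libraries(CUDA_LIBS)` with the same interpreter and environment as fid.py should return an empty list once the libraries are visible. A non-empty result suggests that the CUDA 10.1 libraries are not installed, or that their directory is not on the loader's search path (for example `LD_LIBRARY_PATH`) in that environment.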