zhou13 / lcnn

LCNN: End-to-End Wireframe Parsing
MIT License

CUDA is not available; training falls back to the CPU. #65

Closed: molyswu closed this issue 1 year ago

molyswu commented 1 year ago

python ./train.py -d 0 --identifier baseline config/wireframe.yaml

{ 'io': { 'datadir': 'data/wireframe/', 'logdir': 'logs/', 'num_workers': 4, 'resume_from': None, 'tensorboard_port': 0, 'validation_interval': 24000},
  'model': { 'backbone': 'stacked_hourglass', 'batch_size': 6, 'batch_size_eval': 2, 'depth': 4, 'dim_fc': 1024, 'dim_loi': 128, 'eval_junc_thres': 0.008, 'head_size': <BoxList: [[2], [1], [2]]>, 'image': { 'mean': <BoxList: [109.73, 103.832, 98.681]>, 'stddev': <BoxList: [22.275, 22.124, 23.229]>}, 'loss_weight': { 'jmap': 8.0, 'joff': 0.25, 'lmap': 0.5, 'lneg': 1, 'lpos': 1}, 'n_dyn_junc': 300, 'n_dyn_negl': 80, 'n_dyn_othr': 600, 'n_dyn_posl': 300, 'n_out_junc': 250, 'n_out_line': 2500, 'n_pts0': 32, 'n_pts1': 8, 'n_stc_negl': 40, 'n_stc_posl': 300, 'num_blocks': 1, 'num_stacks': 2, 'use_conv': 0, 'use_cood': 0, 'use_slop': 0},
  'optim': { 'amsgrad': True, 'lr': 0.0004, 'lr_decay_epoch': 10, 'max_epoch': 24, 'name': 'Adam', 'weight_decay': 0.0001}}
CUDA is not available
ntrain: 20000
nvalid: 462
outdir: logs/221202-155830-4026c7b-baseline
TensorFlow installation not found - running with reduced feature set.

NOTE: Using experimental fast data loading logic. To disable, pass "--load_fast=false" and report issues on GitHub. More details: https://github.com/tensorflow/tensorboard/issues/4784

E1202 15:58:30.970504 140001995388096 application.py:125] Failed to load plugin WhatIfToolPluginLoader.load; ignoring it.
Traceback (most recent call last):
  File "/root/anaconda3/lib/python3.9/site-packages/tensorboard/backend/application.py", line 123, in TensorBoardWSGIApp
    plugin = loader.load(context)
  File "/root/anaconda3/lib/python3.9/site-packages/tensorboard_plugin_wit/wit_plugin_loader.py", line 57, in load
    from tensorboard_plugin_wit.wit_plugin import WhatIfToolPlugin
  File "/root/anaconda3/lib/python3.9/site-packages/tensorboard_plugin_wit/wit_plugin.py", line 40, in <module>
    from tensorboard_plugin_wit._utils import common_utils
  File "/root/anaconda3/lib/python3.9/site-packages/tensorboard_plugin_wit/_utils/common_utils.py", line 17, in <module>
    from tensorboard_plugin_wit._vendor.tensorflow_serving.apis import classification_pb2
  File "/root/anaconda3/lib/python3.9/site-packages/tensorboard_plugin_wit/_vendor/tensorflow_serving/apis/classification_pb2.py", line 15, in <module>
    from tensorboard_plugin_wit._vendor.tensorflow_serving.apis import input_pb2 as tensorflow__serving_dot_apis_dot_input__pb2
  File "/root/anaconda3/lib/python3.9/site-packages/tensorboard_plugin_wit/_vendor/tensorflow_serving/apis/input_pb2.py", line 15, in <module>
    from tensorflow.core.example import example_pb2 as tensorflow_dot_core_dot_example_dot_example__pb2
ModuleNotFoundError: No module named 'tensorflow.core.example'
Traceback (most recent call last):
  File "/root/anaconda3/bin/tensorboard", line 8, in <module>
    sys.exit(run_main())
  File "/root/anaconda3/lib/python3.9/site-packages/tensorboard/main.py", line 46, in run_main
    app.run(tensorboard.main, flags_parser=tensorboard.configure)
  File "/root/anaconda3/lib/python3.9/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/root/anaconda3/lib/python3.9/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/root/anaconda3/lib/python3.9/site-packages/tensorboard/program.py", line 276, in main
    return runner(self.flags) or 0
  File "/root/anaconda3/lib/python3.9/site-packages/tensorboard/program.py", line 292, in _run_serve_subcommand
    server = self._make_server()
  File "/root/anaconda3/lib/python3.9/site-packages/tensorboard/program.py", line 476, in _make_server
    app = application.TensorBoardWSGIApp(
  File "/root/anaconda3/lib/python3.9/site-packages/tensorboard/backend/application.py", line 139, in TensorBoardWSGIApp
    return TensorBoardWSGI(
  File "/root/anaconda3/lib/python3.9/site-packages/tensorboard/backend/application.py", line 252, in __init__
    raise ValueError(
ValueError: Duplicate plugins for name projector
/home/dell/wts_document/AOI/lcnn/lcnn/models/line_vectorizer.py:174: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  y = (index // 128).float() + torch.gather(joff[:, 0], 1, index) + 0.5
/root/anaconda3/lib/python3.9/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /croot/pytorch_1669252628709/work/aten/src/ATen/native/TensorShape.cpp:2894.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
File "/root/anaconda3/lib/python3.9/site-packages/absl/app.py", line 254, in _run_main sys.exit(main(argv)) File "/root/anaconda3/lib/pyt000/0000k| 4.02047| 2.36945| 0.23688| 0.12584| 0.000/0000k| 2.57562| 1.22070| 0.19944| 0.12632| 0.52799| 0.50118| 00.7

CUDA is reported as not available, so training jumps to the CPU.

zhou13 commented 1 year ago

Did you follow the instructions and install tensorboardX?

molyswu commented 1 year ago

I have installed tensorboardX. The problem is still the same.

molyswu commented 1 year ago

CUDA is not available
ntrain: 20000
nvalid: 462
outdir: logs/221203-095639-4026c7b-baseline
TensorFlow installation not found - running with reduced feature set.

NOTE: Using experimental fast data loading logic. To disable, pass "--load_fast=false" and report issues on GitHub. More details: https://github.com/tensorflow/tensorboard/issues/4784

Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.11.0 at http://localhost:36095/ (Press CTRL+C to quit)
/root/anaconda3/envs/lcnn/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1666642922335/work/aten/src/ATen/native/TensorShape.cpp:3190.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]

molyswu commented 1 year ago

024, 'n_out_junc': 250, 'n_out_line': 2500, 'use_cood': 0, 'use_slop': 0, 'use_conv': 0, 'eval_junc_thres': 0.008}, 'optim': {'name': 'Adam', 'lr': 0.0004, 'amsgrad': True, 'weight_decay': 0.0001, 'max_epoch': 24, 'lr_decay_epoch': 10}}>
CUDA is not available
ntrain: 20000
nvalid: 462
outdir: logs/221203-101724-4026c7b-baseline
TensorFlow installation not found - running with reduced feature set.

NOTE: Using experimental fast data loading logic. To disable, pass "--load_fast=false" and report issues on GitHub. More details: https://github.com/tensorflow/tensorboard/issues/4784

Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.11.0 at http://localhost:33429/ (Press CTRL+C to quit)
/root/anaconda3/envs/lcnn/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1666642922335/work/aten/src/ATen/native/TensorShape.cpp:3190.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]

molyswu commented 1 year ago

git clone https://github.com/zhou13/lcnn
cd lcnn
conda create -y -n lcnn
source activate lcnn
# CUDA==11.3, CUDNN8.2.4
conda install -y pytorch cudatoolkit=11.3 -c pytorch
conda install -y tensorboardx gdown -c conda-forge
conda install -y pyyaml docopt matplotlib scikit-image opencv
mkdir data logs post

zhou13 commented 1 year ago

Since this is likely caused by your machine not having CUDA, or by another configuration issue, I will close this issue. If you have updates that would improve this project, don't hesitate to reopen it.
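For anyone who lands here with the same "CUDA is not available" message, a quick way to narrow down the configuration issue is to ask PyTorch directly what it sees. The sketch below is not part of LCNN; `describe_cuda` is a hypothetical helper name, and it uses only standard PyTorch attributes. It reports whether the installed build was compiled with CUDA at all (`torch.version.cuda` is `None` for CPU-only builds), whether a GPU is usable, and whether `CUDA_VISIBLE_DEVICES` is set in the environment, which can hide GPUs from the process.

```python
import os


def describe_cuda():
    """Collect facts that help explain why PyTorch reports CUDA as unavailable.

    Hypothetical diagnostic helper; not part of the LCNN code base.
    """
    report = {
        # An empty or invalid value here can hide every GPU from PyTorch.
        "cuda_visible_devices": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "torch_installed": False,
    }
    try:
        import torch
    except ImportError:
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # None means a CPU-only build: no driver or GPU can make CUDA available.
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    report["device_names"] = (
        [torch.cuda.get_device_name(i) for i in range(report["device_count"])]
        if report["cuda_available"]
        else []
    )
    return report


if __name__ == "__main__":
    for key, value in describe_cuda().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` comes back as `None`, the environment most likely contains a CPU-only PyTorch package and needs a CUDA-enabled build. If it shows a CUDA version but `cuda_available` is `False`, the NVIDIA driver or the `CUDA_VISIBLE_DEVICES` setting is the more likely place to look.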