Hi,
many thanks for your feedback and welcome to the community!
I see that you are using 4.0.35 and we should have fixed this issue in the latest release. So you would need to update to 4.1. Note that you may have to pip install xxhash in your environment. Unfortunately, there is no automatic way to get new Python packages installed.
Let me know should there be further issues, Hannes.
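Before and after upgrading, it can help to confirm which package versions are actually installed in the active environment. The following is only a minimal sketch: the helper name describe_environment is hypothetical, and the distribution names "reinvent", "numpy", "torch" and "xxhash" are assumptions about how the packages are registered with pip, not something confirmed in this thread.

```python
from importlib import metadata


def describe_environment(packages=("reinvent", "numpy", "torch", "xxhash")):
    """Map each distribution name to its installed version, or None if missing.

    The default package names are assumptions; adjust them to your setup.
    """
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            report[name] = None
    return report


if __name__ == "__main__":
    for name, version in describe_environment().items():
        print(f"{name}: {version if version is not None else 'not installed'}")
```

A None entry for xxhash would suggest that the manual pip install mentioned above is still needed.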
Thank you! It resolved the issue.
Hi, I am trying to run the reinforcement learning example using the provided TOML config file.
Example:
reinvent -l transfer_learning.log transfer_learning.toml
Output:
2024-02-27 16:50:50.292312: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-02-27 16:50:50.322270: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-02-27 16:50:50.322315: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-02-27 16:50:50.323226: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-02-27 16:50:50.328169: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 AVX_VNNI AMX_TILE AMX_INT8 AMX_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
  File "/home/bilal.nizami/anaconda3/envs/reinvent4/bin/reinvent", line 8, in <module>
    sys.exit(main())
  File "/home/bilal.nizami/anaconda3/envs/reinvent4/lib/python3.10/site-packages/reinvent/Reinvent.py", line 292, in main
    runner(input_config, actual_device, tb_logdir, responder_config)
  File "/home/bilal.nizami/anaconda3/envs/reinvent4/lib/python3.10/site-packages/reinvent/runmodes/TL/run_transfer_learning.py", line 144, in run_transfer_learning
    runner = runner_class(adapter, tb_logdir, mode_config, logger_parameters)
  File "/home/bilal.nizami/anaconda3/envs/reinvent4/lib/python3.10/site-packages/reinvent/runmodes/TL/learning.py", line 138, in __init__
    self.tb_reporter.add_histogram("Tanimoto input SMILES", np.array(sim), 0)
  File "/home/bilal.nizami/anaconda3/envs/reinvent4/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py", line 484, in add_histogram
    histogram(tag, values, bins, max_bins=max_bins), global_step, walltime
  File "/home/bilal.nizami/anaconda3/envs/reinvent4/lib/python3.10/site-packages/torch/utils/tensorboard/summary.py", line 352, in histogram
    hist = make_histogram(values.astype(float), bins, max_bins)
  File "/home/bilal.nizami/anaconda3/envs/reinvent4/lib/python3.10/site-packages/torch/utils/tensorboard/summary.py", line 380, in make_histogram
    cum_counts = np.cumsum(np.greater(counts, 0, dtype=np.int32))
TypeError: No loop matching the specified signature and casting was found for ufunc greater
Output of log file:
16:50:41 Started REINVENT 4.0.35 (C) AstraZeneca 2017, 2023 on 2024-02-27
16:50:41 Command line: /home/bilal.nizami/anaconda3/envs/reinvent4/bin/reinvent -l transfer_learning.log transfer_learning.toml
16:50:41 User bilal.nizamimoa-technology.com on host mtdev-bilal
16:50:41 Python version 3.10.13
16:50:41 PyTorch version 1.12.1+cu113, git 664058fa83f1d8eede5d66418abff6e20bd76ca8
16:50:41 PyTorch compiled with CUDA version 11.3
16:50:41 RDKit version 2022.09.5
16:50:41 Platform Linux-5.15.0-97-generic-x86_64-with-glibc2.35
16:50:41 CUDA driver version 550.54.14
16:50:41 Number of PyTorch CUDA devices 1
16:50:41 Using CUDA device:0 NVIDIA L4
16:50:41 GPU memory: 22273 MiB free, 22478 MiB total
16:50:41 Writing TensorBoard summary to /home/bilal.nizami/GenerativeAI/Campestris_Data/Reinvent4_run/tb_TL
16:50:41 Writing JSON config file to /home/bilal.nizami/GenerativeAI/Campestris_Data/Reinvent4_run/json_transfer_learning.json
16:50:41 Starting Transfer Learning
16:50:43 Using generator Mol2Mol
16:50:43 Reading input SMILES from /home/bilal.nizami/GenerativeAI/Campestris_Data/Reinvent4_run/data/campestris_data_smiles_final.filtered.smi
16:50:46 Reading validation SMILES from /home/bilal.nizami/GenerativeAI/Campestris_Data/Reinvent4_run/data/campestris_data_smiles_final.filtered.smi
16:50:50 randomize_smiles set to false for Mol2Mol
Additional information:
It might be related to an expired NumPy deprecation that affects PyTorch's TensorBoard writer, as mentioned here: https://github.com/pytorch/pytorch/issues/91516
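The failing call from the traceback can be tried in isolation, outside REINVENT and TensorBoard. The sketch below assumes the cause is the one discussed in the linked PyTorch issue: newer NumPy releases (1.24 and later, as far as I can tell) no longer accept an integer dtype on a comparison ufunc. The counts array and both function names are hypothetical and exist only for illustration.

```python
import numpy as np

# Hypothetical bin counts standing in for the histogram data in the traceback.
counts = np.array([0, 3, 0, 5, 1])


def cumulative_nonzero_legacy(values):
    # Same expression as in the traceback: a comparison ufunc with an integer dtype.
    return np.cumsum(np.greater(values, 0, dtype=np.int32))


def cumulative_nonzero(values):
    # Without the dtype argument the comparison returns a boolean array,
    # which np.cumsum then accumulates as integers.
    return np.cumsum(np.greater(values, 0))


try:
    cumulative_nonzero_legacy(counts)
    legacy_failed = False
except TypeError as err:
    legacy_failed = True
    print(f"NumPy {np.__version__} rejects the legacy call: {err}")

portable = cumulative_nonzero(counts)
print(portable.tolist())
```

If the legacy call fails here with the same TypeError, the installed NumPy is too new for the PyTorch version in use, which would be consistent with the explanation above.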