Closed mafarsani closed 9 months ago
Hi,
I have updated the Docker image; the newest tag is 1.0.1. What I did was basically the following three things:
To run the new model, simply go to the root of your cloned MgNet folder and run the following commands:
git pull
./setup
Then proceed to run the example case.
Please let me know if this fixes the error.
Hello, thank you for the hints and the update. I was able to run it successfully and got the results. Best regards.
I wanted to run this package on the example provided in the tutorial, but I am getting the error below:
ERROR: Failed to load OptiX shared library.
Could you please help me address the issue? Here is the system configuration I am using:
====================================
Fri Dec 15 14:38:59 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------|
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3090        Off | 00000000:18:00.0 Off |                  N/A |
| 50%   33C    P2             111W / 350W |    542MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090        Off | 00000000:51:00.0 Off |                  N/A |
| 51%   22C    P8              25W / 350W |      5MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce RTX 3090        Off | 00000000:8A:00.0 Off |                  N/A |
| 48%   21C    P8              18W / 350W |      5MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA GeForce RTX 3090        Off | 00000000:C3:00.0 Off |                  N/A |
| 53%   26C    P8              19W / 350W |      5MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      6450      C   python                                      534MiB |
+---------------------------------------------------------------------------------------+
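A note for reading the log that follows: it reports `Use cudnn ---> 7602` while the GPUs are RTX 3090 cards (SM_8.6 according to the VMD output). The sketch below decodes that number, assuming it is the integer version code that `torch.backends.cudnn.version()` returns (major * 1000 + minor * 100 + patch for cuDNN 7.x and 8.x). The helper names are hypothetical, and the idea that a pre-8.0 cuDNN is too old for these cards is an assumption, not something this thread confirms as the cause of the error.

```python
def decode_cudnn_version(code):
    """Split a cuDNN 7.x/8.x integer version code into (major, minor, patch)."""
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def predates_cudnn_8(code):
    """True if the code denotes a cuDNN release older than 8.0.

    Assumption: cuDNN 7.x builds target CUDA 10.x or older, which came out
    before Ampere cards such as the RTX 3090 (compute capability 8.6).
    """
    return decode_cudnn_version(code)[0] < 8


if __name__ == "__main__":
    code = 7602  # value taken from the "Use cudnn ---> 7602" log line
    print(decode_cudnn_version(code), predates_cudnn_8(code))
```

If the check returns True on an Ampere machine, comparing the container's PyTorch/CUDA/cuDNN build against the host GPU is a reasonable first thing to look at.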
This is the whole output printed after running the package:
in_rna: /home/mfarsani/MgNet/example/example.pdb
out_dir: /home/mfarsani/MgNet/example/
preparing /tmp/example.pdb ...
######## get RNA part ########
Info) VMD for LINUXAMD64, version 1.9.4a57 (April 27, 2022)
Info) http://www.ks.uiuc.edu/Research/vmd/
Info) Email questions and bug reports to vmd@ks.uiuc.edu
Info) Please include this reference in published work using VMD:
Info) Humphrey, W., Dalke, A. and Schulten, K., `VMD - Visual
Info) Molecular Dynamics', J. Molec. Graphics 1996, 14.1, 33-38.
Info) -------------------------------------------------------------
Info) Multithreading available, 32 CPUs.
Info) CPU features: SSE2 SSE4.1 AVX AVX2 FMA F16 AVX512F AVX512CD HT
Info) Free system memory: 752GB (99%)
Info) Creating CUDA device pool and initializing hardware...
Info) Detected 4 available CUDA accelerators:
Info) [0-3] NVIDIA GeForce RTX 3090 82 SM_8.6 1.7 GHz, 24GB RAM SP32 AE2 ZC
OptiXRenderer) ERROR: Failed to load OptiX shared library.
OptiXRenderer) NVIDIA driver may be too old.
OptiXRenderer) Check/update NVIDIA driver
Info) Dynamically loaded 3 plugins in directory:
Info) /usr/local/lib/vmd/plugins/LINUXAMD64/molfile
/tmp/mgnet/example//example.pdb /tmp/mgnet/example//example_rna.pdb
Info) Using plugin pdb for structure file /tmp/mgnet/example//example.pdb
Info) Using plugin pdb for coordinates from file /tmp/mgnet/example//example.pdb
Info) Determining bond structure from distance search ...
Info) Finished with coordinate file /tmp/mgnet/example//example.pdb.
Info) Analyzing structure ...
Info) Atoms: 810
Info) Bonds: 907
Info) Angles: 0 Dihedrals: 0 Impropers: 0 Cross-terms: 0
Info) Bondtypes: 0 Angletypes: 0 Dihedraltypes: 0 Impropertypes: 0
Info) Residues: 38
Info) Waters: 0
Info) Segments: 1
Info) Fragments: 1 Protein: 0 Nucleic: 1
0
atomselect0
Info) Opened coordinate file /tmp/mgnet/example//example_rna.pdb for writing.
Info) Finished with coordinate file /tmp/mgnet/example//example_rna.pdb.
Info) VMD for LINUXAMD64, version 1.9.4a57 (April 27, 2022)
Info) Exiting normally.
vmd >
######## remove altloc ########
######## generate pdbqt ########
setting PYTHONHOME environment
######## voxelization ########
/opt/conda/lib/python3.6/site-packages/htmd/molecule/util.py:666: NumbaPerformanceWarning: np.dot() is faster on contiguous arrays, called on (array(float32, 2d, A), array(float32, 2d, A))
  covariance = np.dot(P.T, Q)
/opt/conda/lib/python3.6/site-packages/htmd/molecule/util.py:704: NumbaPerformanceWarning: np.dot() is faster on contiguous arrays, called on (array(float32, 2d, C), array(float32, 2d, A))
  all1 = np.dot(all1, rot.T)
ffevaluate module is in beta version
2023-12-13 00:00:39,492 - binstar - INFO - Using Anaconda API: https://api.anaconda.org
There is something wrong with your /root/.htmd/.latestversion file. Will not check for new HTMD versions.
usage: python 3-voxelization.py inrnapdb inpdbqt save_folder
example_rna 1 1 G image non zero 13993 Mg non zero 0 occupancies 12300 partial_charges 1693
example_rna 1 2 G image non zero 17894 Mg non zero 0 occupancies 15704 partial_charges 2190
example_rna 1 3 A image non zero 21831 Mg non zero 0 occupancies 19182 partial_charges 2649
example_rna 1 4 U image non zero 25522 Mg non zero 0 occupancies 22414 partial_charges 3108
example_rna 1 5 A image non zero 25482 Mg non zero 0 occupancies 22406 partial_charges 3076
example_rna 1 6 C image non zero 24653 Mg non zero 0 occupancies 21704 partial_charges 2949
example_rna 1 7 A image non zero 24794 Mg non zero 0 occupancies 21819 partial_charges 2975
example_rna 1 8 C image non zero 28194 Mg non zero 0 occupancies 24745 partial_charges 3449
example_rna 1 9 A image non zero 31246 Mg non zero 0 occupancies 27466 partial_charges 3780
example_rna 1 10 A image non zero 27995 Mg non zero 0 occupancies 24567 partial_charges 3428
example_rna 1 11 G image non zero 28724 Mg non zero 0 occupancies 25232 partial_charges 3492
example_rna 1 12 A image non zero 26966 Mg non zero 0 occupancies 23761 partial_charges 3205
example_rna 1 13 G image non zero 26677 Mg non zero 0 occupancies 23400 partial_charges 3277
example_rna 1 14 U image non zero 17950 Mg non zero 0 occupancies 15743 partial_charges 2207
example_rna 1 15 G image non zero 18695 Mg non zero 0 occupancies 16455 partial_charges 2240
example_rna 1 16 A image non zero 16862 Mg non zero 0 occupancies 14994 partial_charges 1868
example_rna 1 17 U image non zero 23760 Mg non zero 0 occupancies 20956 partial_charges 2804
example_rna 1 18 U image non zero 29173 Mg non zero 0 occupancies 25711 partial_charges 3462
example_rna 1 19 G image non zero 30589 Mg non zero 0 occupancies 26948 partial_charges 3641
example_rna 1 20 A image non zero 21475 Mg non zero 0 occupancies 18898 partial_charges 2577
example_rna 1 21 A image non zero 20073 Mg non zero 0 occupancies 17741 partial_charges 2332
example_rna 1 22 A image non zero 38715 Mg non zero 0 occupancies 34095 partial_charges 4620
example_rna 1 23 C image non zero 35548 Mg non zero 0 occupancies 31239 partial_charges 4309
example_rna 1 24 U image non zero 20053 Mg non zero 0 occupancies 17686 partial_charges 2367
example_rna 1 25 A image non zero 20261 Mg non zero 0 occupancies 17797 partial_charges 2464
example_rna 1 26 A image non zero 31248 Mg non zero 0 occupancies 27432 partial_charges 3816
example_rna 1 27 G image non zero 28195 Mg non zero 0 occupancies 24737 partial_charges 3458
example_rna 1 28 U image non zero 29170 Mg non zero 0 occupancies 25629 partial_charges 3541
example_rna 1 29 C image non zero 26772 Mg non zero 0 occupancies 23460 partial_charges 3312
example_rna 1 30 U image non zero 24851 Mg non zero 0 occupancies 21821 partial_charges 3030
example_rna 1 31 G image non zero 30319 Mg non zero 0 occupancies 26612 partial_charges 3707
example_rna 1 32 U image non zero 26658 Mg non zero 0 occupancies 23343 partial_charges 3315
example_rna 1 33 G image non zero 24129 Mg non zero 0 occupancies 21139 partial_charges 2990
example_rna 1 34 U image non zero 23939 Mg non zero 0 occupancies 21013 partial_charges 2926
example_rna 1 35 A image non zero 23318 Mg non zero 0 occupancies 20512 partial_charges 2806
example_rna 1 36 U image non zero 21863 Mg non zero 0 occupancies 19211 partial_charges 2652
example_rna 1 37 C image non zero 18623 Mg non zero 0 occupancies 16322 partial_charges 2301
example_rna 1 38 C image non zero 14214 Mg non zero 0 occupancies 12476 partial_charges 1738
######## predict, density, cluster ########
==> Resuming from checkpoint -> /src/MgNet/script/model/checkpoint/cv1/ckpt.e40
/opt/conda/lib/python3.6/site-packages/torch/serialization.py:453: SourceChangeWarning: source code of class 'dncon2.Net' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
len(testset) ---> 38
len(testloader) ---> 38
GPU ---> range(0, 4)
Use cudnn ---> 7602
cv_index -> 1
image_dir -> /tmp/mgnet/example//image/
result_dir -> /tmp/mgnet/example//result/cv1//raw/
num_worker -> 30
Traceback (most recent call last):
  File "/src/MgNet/script//4-predict.py", line 167, in <module>
    test(start_epoch)
  File "/src/MgNet/script//4-predict.py", line 117, in test
    outputs = net(inputs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/opt/conda/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/src/MgNet/script/dncon2.py", line 93, in forward
    out = self.conv_first(x)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 478, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR
Have 3 arguments: /src/MgNet/script//density/density /tmp/mgnet/example//result/cv1//raw/ 0.5
Traceback (most recent call last):
  File "/src/MgNet/script//5-cluster.py", line 34, in <module>
    assert os.path.exists(density_folder), f'Error: density_folder does not exist -> {density_folder}'
AssertionError: Error: density_folder does not exist -> /tmp/mgnet/example//result/cv1//density/
==> Resuming from checkpoint -> /src/MgNet/script/model/checkpoint/cv2/ckpt.e40
/opt/conda/lib/python3.6/site-packages/torch/serialization.py:453: SourceChangeWarning: source code of class 'dncon2.Net' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
len(testset) ---> 38
len(testloader) ---> 38
GPU ---> range(0, 4)
Use cudnn ---> 7602
cv_index -> 2
image_dir -> /tmp/mgnet/example//image/
result_dir -> /tmp/mgnet/example//result/cv2//raw/
num_worker -> 30
Traceback (most recent call last):
  File "/src/MgNet/script//4-predict.py", line 167, in <module>
    test(start_epoch)
  File "/src/MgNet/script//4-predict.py", line 117, in test
    outputs = net(inputs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/opt/conda/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/src/MgNet/script/dncon2.py", line 93, in forward
    out = self.conv_first(x)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 478, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR
Have 3 arguments: /src/MgNet/script//density/density /tmp/mgnet/example//result/cv2//raw/ 0.5
Traceback (most recent call last):
  File "/src/MgNet/script//5-cluster.py", line 34, in <module>
    assert os.path.exists(density_folder), f'Error: density_folder does not exist -> {density_folder}'
AssertionError: Error: density_folder does not exist -> /tmp/mgnet/example//result/cv2//density/
==> Resuming from checkpoint -> /src/MgNet/script/model/checkpoint/cv3/ckpt.e40
/opt/conda/lib/python3.6/site-packages/torch/serialization.py:453: SourceChangeWarning: source code of class 'dncon2.Net' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
len(testset) ---> 38
len(testloader) ---> 38
GPU ---> range(0, 4)
Use cudnn ---> 7602
cv_index -> 3
image_dir -> /tmp/mgnet/example//image/
result_dir -> /tmp/mgnet/example//result/cv3//raw/
num_worker -> 30
Traceback (most recent call last):
  File "/src/MgNet/script//4-predict.py", line 167, in <module>
    test(start_epoch)
  File "/src/MgNet/script//4-predict.py", line 117, in test
    outputs = net(inputs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/opt/conda/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/src/MgNet/script/dncon2.py", line 93, in forward
    out = self.conv_first(x)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 478, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR
Have 3 arguments: /src/MgNet/script//density/density /tmp/mgnet/example//result/cv3//raw/ 0.5
Traceback (most recent call last):
  File "/src/MgNet/script//5-cluster.py", line 34, in <module>
    assert os.path.exists(density_folder), f'Error: density_folder does not exist -> {density_folder}'
AssertionError: Error: density_folder does not exist -> /tmp/mgnet/example//result/cv3//density/
==> Resuming from checkpoint -> /src/MgNet/script/model/checkpoint/cv4/ckpt.e40
/opt/conda/lib/python3.6/site-packages/torch/serialization.py:453: SourceChangeWarning: source code of class 'dncon2.Net' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
len(testset) ---> 38
len(testloader) ---> 38
GPU ---> range(0, 4)
Use cudnn ---> 7602
cv_index -> 4
image_dir -> /tmp/mgnet/example//image/
result_dir -> /tmp/mgnet/example//result/cv4//raw/
num_worker -> 30
Traceback (most recent call last):
  File "/src/MgNet/script//4-predict.py", line 167, in <module>
    test(start_epoch)
  File "/src/MgNet/script//4-predict.py", line 117, in test
    outputs = net(inputs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/opt/conda/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/src/MgNet/script/dncon2.py", line 93, in forward
    out = self.conv_first(x)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 478, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR
Have 3 arguments: /src/MgNet/script//density/density /tmp/mgnet/example//result/cv4//raw/ 0.5
Traceback (most recent call last):
  File "/src/MgNet/script//5-cluster.py", line 34, in <module>
    assert os.path.exists(density_folder), f'Error: density_folder does not exist -> {density_folder}'
AssertionError: Error: density_folder does not exist -> /tmp/mgnet/example//result/cv4//density/
==> Resuming from checkpoint -> /src/MgNet/script/model/checkpoint/cv5/ckpt.e40
/opt/conda/lib/python3.6/site-packages/torch/serialization.py:453: SourceChangeWarning: source code of class 'dncon2.Net' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
len(testset) ---> 38
len(testloader) ---> 38
GPU ---> range(0, 4)
Use cudnn ---> 7602
cv_index -> 5
image_dir -> /tmp/mgnet/example//image/
result_dir -> /tmp/mgnet/example//result/cv5//raw/
num_worker -> 30
Traceback (most recent call last):
  File "/src/MgNet/script//4-predict.py", line 167, in <module>
    test(start_epoch)
  File "/src/MgNet/script//4-predict.py", line 117, in test
    outputs = net(inputs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/opt/conda/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/src/MgNet/script/dncon2.py", line 93, in forward
    out = self.conv_first(x)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 478, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR
Have 3 arguments: /src/MgNet/script//density/density /tmp/mgnet/example//result/cv5//raw/ 0.5
Traceback (most recent call last):
  File "/src/MgNet/script//5-cluster.py", line 34, in <module>
    assert os.path.exists(density_folder), f'Error: density_folder does not exist -> {density_folder}'
AssertionError: Error: density_folder does not exist -> /tmp/mgnet/example//result/cv5//density/
######## MgNet completed ########
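
In the log above, every cross-validation fold fails twice: first in 4-predict.py with the cuDNN error, then in 5-cluster.py with the `density_folder does not exist` assertion. The second failure looks like a consequence of the first (no predictions, hence no density folder), although the thread does not state this explicitly. Below is a minimal sketch for pulling the exception lines out of a saved copy of such a log and setting the apparent follow-on assertions aside. The function names and the filtering rule are hypothetical and are not part of MgNet.

```python
import re

# Final line of a Python traceback, e.g. "RuntimeError: cuDNN error: ..."
ERROR_LINE = re.compile(r"^(\w+(?:Error|Exception)): (.*)$")


def summarize_errors(log_text):
    """Return (exception_type, message) for each exception line, in log order."""
    found = []
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            found.append((match.group(1), match.group(2)))
    return found


def likely_root_causes(log_text):
    """Drop the density_folder assertions, assumed here to be follow-on failures."""
    return [
        (kind, message)
        for kind, message in summarize_errors(log_text)
        if "density_folder does not exist" not in message
    ]


if __name__ == "__main__":
    sample = (
        "RuntimeError: Caught RuntimeError in replica 0 on device 0.\n"
        "RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR\n"
        "AssertionError: Error: density_folder does not exist -> "
        "/tmp/mgnet/example//result/cv1//density/\n"
    )
    for kind, message in likely_root_causes(sample):
        print(kind, "->", message)
```

Run on the full log, this should leave only the two `RuntimeError` lines per fold, which points at the prediction step as the place to debug.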