Wenhao-Jin / HydRA

A deep-learning model for predicting RNA-binding capacity from protein interaction association context and protein sequence

Missing Annotation track in the Occlusion map #5

Open K-bandana opened 1 week ago

K-bandana commented 1 week ago

Hi,

HydRA is failing to add the annotation track to the occlusion map. I am using the parameters and annotation file as recommended. Could you help me fix this issue?

Many thanks, Bandana

Wenhao-Jin commented 1 week ago

Hi @K-bandana, could you post more details of the error the program reported? Also, just a reminder: the annotation file must be in CSV format, with commas separating the columns (sorry, the example we provided in the README.md page was a bit confusing).

K-bandana commented 1 week ago

Hi, the annotation file I am using (Occlusion_map_annotation_file.csv) has the format: Start,Stop,Type,region_name,Protein

After running the following command, the occlusion map is generated with some warnings:

occlusion_map3 -s sequences \
  --proteinBERT_modelfile ProteinBERT_TrainWithWholeProteinSet_defaultSetting_ModelFile.pkl \
  --annotation_file Occlusion_map_annotation_file.csv \
  --draw_ensemble_only

2024-10-08 16:50:41.509464: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2024-10-08 16:50:41.509490: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Occlusion map is started with the trained HydRa model!

use_Zscore = True
LOOK HERE0. LOOK FORWARD0.
2024-10-08 16:50:46.929093: E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2024-10-08 16:50:46.929130: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host: /proc/driver/nvidia/version does not exist
2024-10-08 16:50:46.929559: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

The maximum length of the sequences is : 1500
/conda/envs/HydRA/lib/python3.8/site-packages/keras/optimizer_v2/optimizer_v2.py:355: UserWarning: The lr argument is deprecated, use learning_rate instead.
  warnings.warn(
2024-10-08 16:50:48.685415: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)

Wenhao-Jin commented 1 week ago

Hi @K-bandana, thank you for the information! I just tested and fixed a bug. You can upgrade your HydRA installation with the following command and try again:

pip3 install hydra-rbp --no-deps --upgrade

I also suggest checking the following points when preparing the annotation file:

  1. Make sure the value in the "Protein" field matches the name of your sequence file (excluding the suffix). For instance, if your filename is XXX.fasta, then the Protein column of every row associated with this sequence should be XXX. The program uses the Protein field to find the correct protein sequence to annotate.
  2. Make sure the column names use the exact upper and lower case shown here, e.g. Start,Stop,Type,Region_name,Protein,Color.

The following is an example of the annotation file for Q9VBK9.fasta:

Start,Stop,Type,Region_name,Protein,Color
26,331,domain,DUF2465,Q9VBK9, pink
127,128,disorder,,Q9VBK9,
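
For anyone preparing their own annotation file, the two points listed above (plus the comma-separation reminder earlier in the thread) can be sketched as a small pre-flight check. This is not part of HydRA: `check_annotation`, `sequence_names_from_dir`, and `EXPECTED_COLUMNS` are hypothetical names, the column list is simply copied from the example header in this thread, and the `*.fasta` glob is an assumption about how the sequence directory is laid out.

```python
import csv
import io
from pathlib import Path

# Assumption: column names copied from the example header posted in this thread.
EXPECTED_COLUMNS = ["Start", "Stop", "Type", "Region_name", "Protein", "Color"]


def sequence_names_from_dir(seq_dir):
    """Collect sequence names (file name without suffix) from a directory.

    Assumption: sequence files end in .fasta, as in the XXX.fasta example.
    """
    return {path.stem for path in Path(seq_dir).glob("*.fasta")}


def check_annotation(csv_text, sequence_names):
    """Return a list of human-readable problems found in the annotation text."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []

    # A single header field usually means tabs or spaces were used instead of commas.
    if len(header) <= 1:
        return ["header has fewer than two columns; is the file comma-separated?"]

    for name in EXPECTED_COLUMNS:
        if name in header:
            continue
        # Report a case-only mismatch (e.g. region_name) separately from a missing column.
        lookalikes = [h for h in header if h.lower() == name.lower()]
        if lookalikes:
            problems.append(f"column {lookalikes[0]!r} should be spelled {name!r}")
        else:
            problems.append(f"missing column {name!r}")

    # Every Protein value should match a sequence file name without its suffix.
    for line_no, row in enumerate(reader, start=2):
        protein = (row.get("Protein") or "").strip()
        if protein and protein not in sequence_names:
            problems.append(
                f"line {line_no}: Protein {protein!r} has no matching sequence file"
            )
    return problems


if __name__ == "__main__":
    # Header as first posted in this thread, checked against one known sequence name.
    first_attempt = "Start,Stop,Type,region_name,Protein\n26,331,domain,DUF2465,Q9VBK9\n"
    for problem in check_annotation(first_attempt, {"Q9VBK9"}):
        print(problem)
```

In practice the text would come from `Path("Occlusion_map_annotation_file.csv").read_text()` and the names from `sequence_names_from_dir("sequences")`; an empty result only means these particular checks passed, not that HydRA will accept the file.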

Please let me know if you still have problems running this function. Thank you!

K-bandana commented 1 week ago

It worked. Just to let you know, the scale indicating amino acid position does not appear with the updated version. Thanks a lot for your help.