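This is a PyTorch implementation of our paper, "Predicting dynamic cellular protein-RNA interactions using deep learning and in vivo RNA structure":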
Lei Sun, Kui Xu, Wenze Huang, Yucheng T. Yang, Pan Li, Lei Tang, Tuanlin Xiong, Qiangfeng Cliff Zhang
*: indicates equal contribution.
Cell Research version: https://www.nature.com/articles/s41422-021-00476-y
bioRxiv preprint: https://www.biorxiv.org/content/10.1101/2020.05.05.078774v1
Clone repository:
git clone https://github.com/kuixu/PrismNet.git
Install packages:
cd PrismNet
pip install -r requirements.txt
pip install -e .
The scripts and pipeline are still in preparation. For now, we provide data for 172 samples in *.tsv format for training and testing PrismNet.
# Download data
cd PrismNet/data
wget https://zhanglabnet.oss-cn-beijing.aliyuncs.com/prismnet/data/clip_data.tgz
tar zxvf clip_data.tgz
# Generate training and validation set for binary classification
cd PrismNet
tools/gdata_bin.sh
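Before training, it can help to confirm that an extracted *.tsv file parses cleanly. The following is a minimal sketch, not part of PrismNet: summarize_tsv is a hypothetical helper that only splits lines on tabs and makes no assumption about what the columns mean or whether the file has a header row.

```python
import csv
from pathlib import Path


def summarize_tsv(path, preview_rows=3):
    """Return (row_count, distinct_column_counts, preview) for a tab-separated file.

    Hypothetical helper: it assumes nothing about column names or meanings
    and treats every line, including a possible header, as a data row.
    """
    row_count = 0
    column_counts = set()
    preview = []
    with Path(path).open(newline="") as handle:
        # QUOTE_NONE keeps quote characters literal, so only tabs split fields.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            row_count += 1
            column_counts.add(len(row))
            if len(preview) < preview_rows:
                preview.append(row)
    return row_count, sorted(column_counts), preview
```

Calling summarize_tsv on one of the extracted files (the exact path depends on how the archive unpacks) returns the number of rows, the distinct column counts found, and the first few rows. More than one distinct column count usually points to a malformed line.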
To train a single protein model from scratch, run
exp/EXP_NAME/train.sh pu PrismNet TIA1_Hela clip_data
where TIA1_Hela is replaced with the name of the data file you want to use and EXP_NAME with a name for this experiment. Hyper-parameters can be tuned in exp/prismnet/train.sh. For the available training options, see tools/train.py.
To monitor the training process, add the option -tfboard in exp/prismnet/train.sh, then view the page at http://localhost:6006 using TensorBoard:
tensorboard --logdir exp/EXP_NAME/out/tfb
To train all the protein models, run
exp/EXP_NAME/train_all.sh
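To train only a chosen subset of proteins rather than all of them, one option is to loop over the single-protein command. The sketch below is an illustration, not a PrismNet tool: train_command and train_many are hypothetical helpers that rebuild the train.sh invocation shown earlier with the same argument order. By default the commands are only printed (dry_run=True).

```python
import subprocess


def train_command(exp_name, protein, dataset="clip_data"):
    """Build the single-protein training command as an argument list.

    Mirrors the form: exp/EXP_NAME/train.sh pu PrismNet TIA1_Hela clip_data
    """
    return [f"exp/{exp_name}/train.sh", "pu", "PrismNet", protein, dataset]


def train_many(exp_name, proteins, dataset="clip_data", dry_run=True):
    """Print (dry run) or execute the training command for each protein name."""
    commands = [train_command(exp_name, protein, dataset) for protein in proteins]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the loop as soon as one training run fails.
            subprocess.run(command, check=True)
    return commands
```

For example, train_many("prismnet", ["TIA1_Hela"]) prints the command for TIA1_Hela. Passing dry_run=False runs it from the repository root.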
To evaluate the models, we provide the script eval.sh. You can run it using
exp/prismnet/eval.sh TIA1_Hela clip_data
To run inference on new data (in the same *.tsv format as the provided datasets) with the trained models, we provide the script infer.sh. You can run it using
exp/prismnet/infer.sh TIA1_Hela /path/to/inference_file.tsv
To compute high attention regions with the trained models, we provide the script har.sh. You can run it using
exp/prismnet/har.sh TIA1_Hela /path/to/inference_file.tsv
To compute saliency with the trained models, we provide the script saliency.sh. You can run it using
exp/prismnet/saliency.sh TIA1_Hela /path/to/inference_file.tsv
To plot saliency images with the trained models, we provide the script saliencyimg.sh. You can run it using
exp/prismnet/saliencyimg.sh TIA1_Hela /path/to/inference_file.tsv
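The evaluation, inference, high-attention-region and saliency scripts all take a protein name as their first argument, so they can be chained for one model. The sketch below is an illustration under stated assumptions: analysis_commands and run_analysis are hypothetical helpers, and the order in which the scripts are listed is our choice, not a requirement documented by PrismNet. Script names and argument order are copied from the commands above.

```python
import subprocess

# eval.sh takes a dataset name; the other scripts take an inference file path.
DATASET_SCRIPTS = ("eval.sh",)
FILE_SCRIPTS = ("infer.sh", "har.sh", "saliency.sh", "saliencyimg.sh")


def analysis_commands(protein, inference_file, dataset="clip_data",
                      exp_dir="exp/prismnet"):
    """Build the post-training commands for one protein as argument lists."""
    commands = [[f"{exp_dir}/{script}", protein, dataset]
                for script in DATASET_SCRIPTS]
    commands += [[f"{exp_dir}/{script}", protein, inference_file]
                 for script in FILE_SCRIPTS]
    return commands


def run_analysis(protein, inference_file, dry_run=True, **kwargs):
    """Print (dry run) or execute each post-training command in turn."""
    commands = analysis_commands(protein, inference_file, **kwargs)
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True aborts the chain if any script exits with an error.
            subprocess.run(command, check=True)
    return commands
```

With the default dry_run=True, run_analysis("TIA1_Hela", "/path/to/inference_file.tsv") only prints the five commands, which makes it easy to check them before running anything.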
For the construction and analysis of integrative motifs, users can use the scripts in motif_construct/:
perl saliency_motif.pl infile.txt sal outfile
Rscript motif_sig.R outfile_motif_summary.txt outfile_motif_sig.txt
The integrative motifs can be downloaded here.
cd PrismNet/data
wget https://zhanglabnet.oss-cn-beijing.aliyuncs.com/prismnet/data/halflife_data.tgz
tar zxvf halflife_data.tgz
pip install xgboost==1.3.0rc1 matplotlib scipy scikit-learn termplotlib
exp/logistic_reg/run.sh
We also provide a website, http://prismnet.zhanglab.net/, to visualize the icSHAPE data and the results.
This project is free to use for non-commercial purposes - see the LICENSE file for details.
@article {Sun2021cr,
title = {Predicting dynamic cellular protein-RNA interactions using deep learning and in vivo RNA structure},
author = {Sun, Lei and Xu, Kui and Huang, Wenze and Yang, Yucheng T. and Li, Pan and Tang, Lei and Xiong, Tuanlin and Zhang, Qiangfeng Cliff},
year = {2021},
doi = {10.1038/s41422-021-00476-y},
journal = {Cell Research}
}
@article {Sun2021cell,
title = {In vivo structural characterization of the whole SARS-CoV-2 RNA genome identifies host cell target proteins vulnerable to re-purposed drugs},
author = {Sun, Lei and Li, Pan and Ju, Xiaohui and Rao, Jian and Huang, Wenze and Zhang, Shaojun and Xiong, Tuanlin and Xu, Kui and Zhou, Xiaolin and Ren, Lili and Ding, Qiang and Wang, Jianwei and Zhang, Qiangfeng Cliff},
year = {2021},
doi = {10.1016/j.cell.2021.02.008},
journal = {Cell}
}