Training code for RibonanzaNet.
If you would rather use pretrained checkpoints than retrain RibonanzaNet from scratch, we have created example notebooks:
finetune: https://www.kaggle.com/code/shujun717/ribonanzanet-2d-structure-finetune
secondary structure inference: https://www.kaggle.com/code/shujun717/ribonanzanet-2d-structure-inference
chemical mapping inference: https://www.kaggle.com/code/shujun717/ribonanzanet-inference
You just need train_data.csv, test_sequences.csv, and sample_submission.csv from https://www.kaggle.com/competitions/stanford-ribonanza-rna-folding/data
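Before training, it can help to confirm that the three files are where the config expects them. The following is a minimal sketch, not part of the repository; the helper name missing_input_files is hypothetical, and the directory "../../input/" is the default input_dir from the configuration file.

```python
from pathlib import Path

# File names as listed above.
REQUIRED_FILES = ("train_data.csv", "test_sequences.csv", "sample_submission.csv")


def missing_input_files(input_dir):
    """Return the required file names that are not present in input_dir."""
    root = Path(input_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # "../../input/" is the default input_dir in the config file.
    missing = missing_input_files("../../input/")
    print("missing files:", missing if missing else "none")
```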
Create the environment from the environment file env.yml
conda env create -f env.yml
Install ranger optimizer
conda activate torch
git clone https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer
cd Ranger-Deep-Learning-Optimizer
pip install -e .
First activate the environment: conda activate torch
Set up accelerate with accelerate config in the terminal or with the --config_path option
For an example of an accelerate config file, see accelerate_config.yaml
accelerate launch run.py --config_path configs/pairwise.yaml
accelerate launch inference.py --config_path configs/pairwise.yaml
python make_submission.py --config_path configs/pairwise.yaml
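The three commands above can be chained from a small driver script. This is only a sketch: it assumes the commands are meant to run in the order listed (run.py, then inference.py, then make_submission.py) with the same config file, and the function names pipeline_commands and run_pipeline are hypothetical.

```python
import subprocess


def pipeline_commands(config_path="configs/pairwise.yaml"):
    """Build the three commands listed above as argument lists."""
    return [
        ["accelerate", "launch", "run.py", "--config_path", config_path],
        ["accelerate", "launch", "inference.py", "--config_path", config_path],
        ["python", "make_submission.py", "--config_path", config_path],
    ]


def run_pipeline(config_path="configs/pairwise.yaml", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in pipeline_commands(config_path):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

With dry_run=True nothing is executed, so the script is safe to try before committing GPU time.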
This section explains the parameters and settings in the configuration file for RibonanzaNet.
learning_rate: 0.001
The learning rate for the optimizer. Determines the step size at each iteration while moving toward a minimum of the loss function.
batch_size: 2
Number of samples processed per GPU per batch.
test_batch_size: 8
Number of samples processed per GPU per batch at test time.
epochs: 40
Total number of training epochs.
dropout: 0.05
The dropout rate used for regularization to prevent overfitting. It is the proportion of neurons randomly dropped out of the network during training.
weight_decay: 0.0001
Weight decay coefficient. Regularizes the model by penalizing large weights to prevent overfitting.
k: 5
Kernel size of the 1D convolution.
ninp: 256
The size of the input dimension.
nlayers: 9
Number of RibonanzaNet blocks.
nclass: 2
Number of classes for classification tasks.
ntoken: 5
Number of tokens (AUGC + padding/N token) used in the model.
nhead: 8
Number of heads in multi-head attention.
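As a quick sanity check on ninp and nhead: in standard multi-head attention the model dimension is split evenly across heads. The sketch below assumes RibonanzaNet follows that convention (the text above does not say so); head_dimension is a hypothetical helper.

```python
def head_dimension(ninp, nhead):
    """Per-head feature size when ninp is split evenly across nhead heads."""
    if ninp % nhead != 0:
        raise ValueError("ninp must be divisible by nhead")
    return ninp // nhead


if __name__ == "__main__":
    # Documented defaults: ninp = 256, nhead = 8.
    print(head_dimension(256, 8))  # 32
```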
use_flip_aug: true
Indicates whether flip augmentation is used during training/inference.
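The text does not define flip augmentation. One common reading, assumed here, is that the sequence and its per-nucleotide labels are reversed together during training, and that at inference the predictions for the original and reversed sequence are averaged. The sketch below illustrates that idea with plain lists; flip_example and predict_with_flip are hypothetical names, not the repository's functions.

```python
def flip_example(sequence, labels):
    """Reverse a sequence and its per-nucleotide labels together."""
    return sequence[::-1], labels[::-1]


def predict_with_flip(predict, sequence):
    """Average predictions on the sequence and on its reversed copy.

    `predict` is any callable mapping a sequence to one float per position.
    """
    forward = predict(sequence)
    # Predict on the reversed sequence, then flip back to the original order.
    backward = predict(sequence[::-1])[::-1]
    return [(f + b) / 2 for f, b in zip(forward, backward)]


if __name__ == "__main__":
    print(flip_example("AUGC", [0.1, 0.2, 0.3, 0.4]))
```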
gradient_accumulation_steps: 2
Number of steps over which gradients are accumulated before each optimizer update.
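Because batch_size is per GPU and gradients are accumulated over several steps, the number of samples contributing to each optimizer update is the product of the two and the GPU count. A small sketch of that arithmetic follows; num_gpus is an assumption here, since the number of processes is set through the accelerate config rather than this file.

```python
def effective_batch_size(batch_size, gradient_accumulation_steps, num_gpus=1):
    """Samples contributing to one optimizer update across all GPUs."""
    return batch_size * gradient_accumulation_steps * num_gpus


if __name__ == "__main__":
    # Documented defaults: batch_size = 2, gradient_accumulation_steps = 2.
    print(effective_batch_size(2, 2))              # 4 on a single GPU
    print(effective_batch_size(2, 2, num_gpus=8))  # 32 on eight GPUs
```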
use_triangular_attention: false
Specifies whether to use triangular attention mechanisms in the model.
pairwise_dimension: 64
Dimension of pairwise interactions in the model.
use_data_percentage: 1
The fraction of data used from the dataset (1 = full data training).
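To illustrate use_data_percentage, here is a minimal sketch that keeps a leading fraction of a list of training items. It assumes the simplest possible policy (take the first items); the actual training code may select or shuffle the subset differently. subset_by_percentage is a hypothetical helper.

```python
def subset_by_percentage(items, use_data_percentage):
    """Keep the first fraction of items; 1 keeps everything."""
    if not 0 < use_data_percentage <= 1:
        raise ValueError("use_data_percentage must be in (0, 1]")
    n_keep = int(len(items) * use_data_percentage)
    return items[:n_keep]


if __name__ == "__main__":
    ids = list(range(10))
    print(subset_by_percentage(ids, 0.5))  # [0, 1, 2, 3, 4]
```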
use_dirty_data: true
Indicates whether to include training data that has only one of the 2A3/DMS profiles with SN>1.
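The rule for "dirty" data can be sketched as a filter on per-sequence signal-to-noise (SN) values. The sketch assumes that "clean" means both the 2A3 and DMS profiles have SN>1, which the text implies but does not state; the input layout (a dict of sequence id to an (SN_2A3, SN_DMS) pair) and both function names are hypothetical.

```python
def split_clean_dirty(sn_by_sequence, threshold=1.0):
    """Split sequence ids by how many of their two profiles pass the SN threshold."""
    clean, dirty = [], []
    for seq_id, (sn_2a3, sn_dms) in sn_by_sequence.items():
        n_pass = int(sn_2a3 > threshold) + int(sn_dms > threshold)
        if n_pass == 2:
            clean.append(seq_id)   # both profiles usable (assumed "clean")
        elif n_pass == 1:
            dirty.append(seq_id)   # only one profile usable ("dirty")
        # sequences with no usable profile are dropped
    return clean, dirty


def select_training_ids(sn_by_sequence, use_dirty_data):
    """Return clean ids, plus dirty ids when use_dirty_data is true."""
    clean, dirty = split_clean_dirty(sn_by_sequence)
    return clean + dirty if use_dirty_data else clean


if __name__ == "__main__":
    sn = {"a": (2.0, 3.0), "b": (0.5, 1.5), "c": (0.2, 0.3)}
    print(select_training_ids(sn, use_dirty_data=True))
```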
fold: 0
The current fold in use if the data is split into folds for cross-validation.
nfolds: 6
Total number of folds for cross-validation.
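The relationship between fold and nfolds can be sketched as follows: the data is divided into nfolds parts, the part with index fold is held out for validation, and the rest is used for training. The round-robin assignment and zero-based fold index used here are assumptions for illustration; the training code may shuffle or group samples differently. kfold_indices is a hypothetical helper.

```python
def kfold_indices(n_samples, nfolds, fold):
    """Return (train_indices, val_indices) for one zero-based fold."""
    if not 0 <= fold < nfolds:
        raise ValueError("fold must satisfy 0 <= fold < nfolds")
    # Round-robin assignment: sample i belongs to fold i % nfolds.
    val = [i for i in range(n_samples) if i % nfolds == fold]
    train = [i for i in range(n_samples) if i % nfolds != fold]
    return train, val


if __name__ == "__main__":
    train, val = kfold_indices(n_samples=12, nfolds=6, fold=0)
    print(val)         # [0, 6]
    print(len(train))  # 10
```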
input_dir: "../../input/"
Directory for input data. Put train_data.csv, test_sequences.csv, and sample_submission.csv here.
gpu_id: "0"
Identifier for the GPU used for training. Useful in single-GPU setup.
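When editing the config by hand, a few of the parameters above constrain each other. The sketch below checks some of those relationships on a plain dict; the checks are illustrative assumptions (for example, that fold is zero-based), not validation performed by the training code, and config_problems is a hypothetical helper.

```python
# Default values copied from the parameter descriptions above.
DEFAULT_CONFIG = {
    "dropout": 0.05,
    "use_data_percentage": 1,
    "fold": 0,
    "nfolds": 6,
}


def config_problems(config):
    """Return a list of human-readable problems; empty means none were found."""
    problems = []
    if not 0 <= config["fold"] < config["nfolds"]:
        problems.append("fold must satisfy 0 <= fold < nfolds")
    if not 0 < config["use_data_percentage"] <= 1:
        problems.append("use_data_percentage must be in (0, 1]")
    if not 0 <= config["dropout"] < 1:
        problems.append("dropout must be in [0, 1)")
    return problems


if __name__ == "__main__":
    print(config_problems(DEFAULT_CONFIG))  # [] for the documented defaults
```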
logs has the csv log file with train/val loss,
models has the model weights and optimizer states,
oofs has the val predictions.