GATSol

GATSol, an enhanced predictor of protein solubility through the synergy of 3D structure graph and large language modeling

0.Conda environmennt create

If you want to run this model in a standalone mode on your own linux system, git-clone ProtSol with the following code:

git clone https://github.com/binbinbinv/GATSol.git

Then install the GATSol environment with the following commands, but first make sure you have conda or miniconda installed on your server.

Then you can install GATSol environment manually by following the instructions:

conda create -n GATSol python=3.9
conda activate GATSol
pip install torch==2.2.2 torchvision torchaudio
pip install pandas bio seaborn matplotlib_inline
pip install scikit-learn transformers Ipython
pip install iFeatureOmegaCLI rdkit
pip install torch_geometric==2.3.0 fair-esm

1.Predict your own protein

Download the best model after training by following the readme.md file in GATSol/check_point/best_model/readme.md, and put it into the best_model folder.
You must prepare your protein as the format fasta and pdb, and then put them in the folder below:

①GATSol/Predict/NEED_to_PREPARE/fasta

②GATSol/Predict/NEED_to_PREPARE/pdb

And you need to prepare a list.csv as the example in the /home/bli/GATSol/Predict/NEED_to_PREPARE.
After preparing all the files, cd to the prediction work folder and execute the following command, you will get the Output.csv file, which contains the prediction results you need.
```
cd GATSol/Predict
bash ./tools/Predict.sh
```
2.Re-train the model
cd to the GAT project directory
```
cd GATSol/
```
Download the datasets follow the description in ./dataset/readme.md

Extract the dataset by the command:

tar -zxvf  ~/GATSol_dataset.tar.gz -C ~/GATSol/dataset/

Retrain the model
```
python re-train.py
```

binbinbinv / GATSol

readme

GATSol

0.Conda environmennt create

1.Predict your own protein

2.Re-train the model