Looking for the web interface? Find it here: https://www.blobulator.branniganlab.org/
This tool identifies contiguous stretches of hydrophobic residues within a protein sequence. Any run of contiguous hydrophobic residues that is at least as long as the minimum blob length is considered a hydrophobic, or "h", blob. Any remaining segment that is at least as long as the minimum blob length is considered a polar, or "p", blob, while segments shorter than the minimum blob length are considered separator, or "s", residues. Separator residues are very short stretches of non-hydrophobic residues that may be found between two h blobs.
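The classification rule described above can be illustrated with a minimal sketch. This is not the package's implementation: the function names are hypothetical, and it assumes a residue counts as hydrophobic when its Kyte-Doolittle value, rescaled to the range 0 to 1, is strictly above the cutoff. Any smoothing or other processing the package applies is not reproduced here, so real output may differ.

```python
from itertools import groupby

# Kyte-Doolittle hydropathy values (range -4.5 to 4.5).
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def hydrophobic_mask(sequence, cutoff):
    """Assumption: hydrophobic means rescaled hydropathy (0..1) > cutoff."""
    return [(KYTE_DOOLITTLE[aa] + 4.5) / 9.0 > cutoff for aa in sequence]


def label_blobs(mask, min_blob):
    """Return one label per residue: 'h', 'p', or 's'."""
    # Pass 1: hydrophobic runs of at least min_blob residues become h blobs;
    # every other residue is marked "-" (remaining).
    first = []
    for flag, run in groupby(mask):
        n = len(list(run))
        first.extend(("h" if flag and n >= min_blob else "-") * n)
    # Pass 2: remaining segments become p blobs if long enough, otherwise s.
    labels = []
    for mark, run in groupby(first):
        n = len(list(run))
        if mark == "h":
            labels.extend("h" * n)
        else:
            labels.extend(("p" if n >= min_blob else "s") * n)
    return "".join(labels)


mask = hydrophobic_mask("RRRRRRRRRIIIIIIIII", cutoff=0.4)
print(label_blobs(mask, min_blob=4))  # ppppppppphhhhhhhhh
```

Under these assumptions, a hydrophobic run shorter than the minimum blob length is folded into the surrounding remaining segment, which is then labelled p or s according to its total length.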
Python 3.9+
[Optional] Create a conda environment:
conda create --name blobulator_env python=3.9
conda activate blobulator_env
[For website and sample scripts] Download the repository:
git clone https://github.com/BranniganLab/blobulator
Install with pip
pip install git+https://github.com/BranniganLab/blobulator
Known issue:
If you get an error installing pycairo, try conda install pycairo and then retry the pip install command above.
Note: this option is identical to the website version, but is hosted on your local machine:
cd [path_to_repository]/website
python3 blobulation.py
If a browser doesn't open automatically, copy the URL from the terminal into a browser.
import blobulator
# A very simple oligopeptide and standard settings
sequence = "RRRRRRRRRIIIIIIIII"
cutoff = 0.4
min_blob = 4
hscale = "kyte_doolittle"
# Do the blobulation
blobDF = blobulator.compute(sequence, cutoff, min_blob, hscale)
# Cleanup the dataframe (make it more human-readable)
blobDF = blobulator.clean_df(blobDF)
# Save it as a csv for later use
oname = "hello_blob.csv"
blobDF.to_csv(oname, index=False)
Additional sample scripts can be found in the repository examples directory.
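To check what a saved file such as hello_blob.csv contains, the header row and the number of data rows can be read with Python's standard csv module. The helper below is a hypothetical sketch, not part of blobulator, and the in-memory demo uses made-up column names rather than the tool's real ones.

```python
import csv
import io


def summarize_csv(handle):
    """Return (column_names, number_of_data_rows) for an open CSV file."""
    reader = csv.reader(handle)
    columns = next(reader, [])  # first row is the header; [] if file is empty
    n_rows = sum(1 for row in reader if row)  # count non-empty data rows
    return columns, n_rows


# In-memory stand-in for a real file; column names here are placeholders.
demo = io.StringIO("col_a,col_b\n1,x\n2,y\n")
print(summarize_csv(demo))
```

With a real file, open it with open("hello_blob.csv", newline="") and pass the handle to the same function.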
The backend can be installed independently using pip install blobulator
Open a terminal in the blobulator directory and run:
python3 -m blobulator --sequence AFRPGAGQPPRRKECTPEVEEGV --oname ./my_blobulation.csv
This will blobulate the sequence "AFRPGAGQPPRRKECTPEVEEGV" and write the result to my_blobulation.csv
You may specify additional parameters using the following options:
-h, --help show help information and exit
--sequence SEQUENCE Takes a single string of EITHER DNA or protein one-letter codes (no spaces).
--cutoff CUTOFF Sets the cutoff hydrophobicity (floating point number between 0.00 and 1.00 inclusive). Defaults to 0.4
--minBlob MINBLOB Minimum blob length (integer greater than 1). Defaults to 4
--oname ONAME Name of output file or path to output directory. Defaults to blobulated_.csv
--fasta FASTA FASTA file with 1 or more sequences
--DNA DNA Flag that says whether the inputs are DNA or protein. Defaults to false (protein)
python3 -m blobulator --fasta ./relative/path/to/my_sequences.fasta --oname ./relative/path/to/outputs/
There is a fasta file in blobulation/example called b_subtilis.fasta that contains the sequences of several proteins from Bacillus subtilis. To blobulate all those proteins with a cutoff of 0.4 and a minimum blob size of 4, we run:
mkdir outputs
python3 -m blobulator --fasta ../example/b_subtilis.fasta --cutoff 0.4 --minBlob 4 --oname outputs/
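For readers who want to work with multi-sequence FASTA input from Python rather than the command line, the file can be split into individual records first. The parser below is a minimal standard-library sketch with a hypothetical function name; it is not how blobulator itself reads FASTA files, and the demo records are made up rather than taken from b_subtilis.fasta.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


# Tiny in-memory example standing in for a real FASTA file.
example = io.StringIO(">demo_1\nAFRPGAGQPP\nRRKECTPEVEEGV\n>demo_2\nRRRRIIII\n")
for name, seq in read_fasta(example):
    print(name, len(seq))
```

Each sequence obtained this way could then be passed to blobulator.compute with the same arguments as in the Python example earlier in this document.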
Whether you have blobulated your proteins of interest using the web utility or the command-line option, you can obtain the blobulation data as a CSV. This file is the only output of the command-line option; on the website, it is available by clicking "Download Data". These CSVs are organized with each residue in its own row and columns as follows:
There is a tcl script in the VMD_scripts directory that will read a csv from the website or the local tool.
To use it:
set protSel [atomselect top "protein"]
get_sequence $protSel
getBlobs my_blobulation.csv $protSel