DeepETPicker is an open-source, deep-learning-based software package with a user-friendly interface for rapid and accurate 3D particle picking in cryo-electron tomograms. By combining weak labels, a lightweight architecture, and GPU-accelerated pooling operations, it significantly reduces both annotation cost and inference time, while a Gaussian-type mask and a customized architecture design greatly improve accuracy.
Note: DeepETPicker is a PyTorch implementation.
The following instructions assume that pip and anaconda or miniconda are available. In case you have an old deepetpicker environment installed, first remove it with:
conda env remove --name deepetpicker
The first step is to create a new conda virtual environment:
conda create -n deepetpicker -c conda-forge python=3.8.3 -y
Activate the environment:
conda activate deepetpicker
To download the code, run:
git clone https://github.com/cbmi-group/DeepETPicker
cd DeepETPicker
Next, install PyTorch and the related packages needed by DeepETPicker:
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch -y
pip install -r requirement.txt
If using the GUI

To use the GUI on Linux, you will need to install the following extended dependencies for Qt.
For CentOS, install the packages with:
sudo yum install -y mesa-libGL libXext libSM libXrender fontconfig xcb-util-wm xcb-util-image xcb-util-keysyms xcb-util-renderutil libxkbcommon-x11
For Ubuntu, install the packages with:
sudo apt-get install -y libgl1-mesa-glx libglib2.0-dev libsm6 libxrender1 libfontconfig1 libxcb-icccm4 libxcb-image0 libxcb-keysyms1 libxcb-render-util0 libxcb-shape0 libxcb-xinerama0 libxcb-xkb1 libxkbcommon-x11-dev libdbus-1-3
To run DeepETPicker:
conda activate deepetpicker
python PATH_TO_DEEPETPICKER/main.py
Note: PATH_TO_DEEPETPICKER is the directory where the code is located.
Non-GUI version

In addition to the GUI version of DeepETPicker, we also provide a non-GUI version for users familiar with Python and deep learning. It consists of four stages: preprocessing, train config generation, training, and testing. A sample tutorial can be found in .bin/bash_command.md.
The following steps are required in order to run DeepETPicker:
Install Docker
Note: the Docker engine version should be >= 19.03. The Docker image of DeepETPicker is 7.21 GB; please ensure that there is enough disk space.
Install NVIDIA Container Toolkit for GPU support.
Download Docker image of DeepETPicker.
docker pull docker.io/lgl603/deepetpicker:latest
Run the Docker image of DeepETPicker.
docker run --gpus=all -itd \
--restart=always \
--shm-size=100G \
-e DISPLAY=unix$DISPLAY \
--name deepetpicker \
-p 50022:22 \
--mount type=bind,source='/host_path/to/data',target='/container_path/to/data' \
lgl603/deepetpicker:latest
--shm-size sets the shared memory size of the Docker container. The --mount option mounts a file or directory on the host machine into the Docker container: source='/host_path/to/data' is the data directory that actually exists on the host machine, and target='/container_path/to/data' is the path where '/host_path/to/data' is mounted inside the container.

Note: '/host_path/to/data' should be made writable, e.g. by running the bash command chmod -R 777 '/host_path/to/data', and should be replaced by the data directory that actually exists on your host machine. For convenience, '/container_path/to/data' can be set to the same path as '/host_path/to/data'.
DeepETPicker can be used directly on this machine, or from another machine on the same LAN.

To open DeepETPicker directly on this machine:
ssh -X test@'ip_address' DeepETPicker
# where the 'ip_address' of DeepETPicker container can be obtained as follows:
docker inspect --format='{{.NetworkSettings.IPAddress}}' deepetpicker
Connect to this server remotely and open DeepETPicker software via a client machine:
ssh -X -p 50022 test@ip DeepETPicker
Here ip is the IP address of the server machine; the password is password.
Installation time: the Docker image of DeepETPicker is 7.21 GB, and the installation time depends on your network speed. With a fast connection, it can be configured within a few minutes.
Detailed tutorials for two sample datasets, SHREC2021 and EMPIAR-10045, are provided. The main steps of DeepETPicker include preprocessing, training, inference, and particle visualization.
Data preparation
Before launching the graphical user interface, we recommend creating a single base folder to save the inputs and outputs of DeepETPicker. Inside this base folder, create a raw_data subfolder to store the raw data.

Here, we provide two sample datasets, EMPIAR-10045 and SHREC_2021, to help you learn the DeepETPicker workflow better and faster. The sample datasets can be downloaded in one of two ways:
Data structure
The data should be organized as follows:
├── /base/path
│ ├── raw_data
│ │ ├── tomo1.coords
│ │ └── tomo1.mrc
│ │ └── tomo2.coords
│ │ └── tomo2.mrc
│ │ └── tomo3.mrc
│ │ └── tomo4.mrc
│ │ └── ...
For the above data, tomo1.mrc and tomo2.mrc can be used as train/val datasets, since they both have coordinate files (manual annotations). If a tomogram has no manual annotation (such as tomo3.mrc), it can only be used as a test dataset.
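This train/val vs. test split follows directly from which tomograms have a matching .coords file. As an illustrative sketch (not part of DeepETPicker itself; the function name is hypothetical), the raw_data folder can be partitioned like this:

```python
from pathlib import Path

def split_tomograms(raw_data):
    """Partition tomograms in raw_data: those with a matching .coords
    file can serve as train/val data; the rest can only be tested on."""
    raw_data = Path(raw_data)
    annotated, unannotated = [], []
    for mrc in sorted(raw_data.glob("*.mrc")):
        # tomoN.mrc is annotated iff tomoN.coords exists alongside it.
        if mrc.with_suffix(".coords").exists():
            annotated.append(mrc.name)
        else:
            unannotated.append(mrc.name)
    return annotated, unannotated
```

For the layout above, this would place tomo1.mrc and tomo2.mrc in the annotated (train/val) group and tomo3.mrc, tomo4.mrc in the test-only group.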
Input & Output
Launch the graphical user interface of DeepETPicker. On the Preprocessing page, set the key input and output parameters.
Note: Before `Training of DeepETPicker`, please do `Preprocessing` first to ensure that the basic parameters required for training are provided.
In practice, default parameters can give you a good enough result.
Training parameter description:

Train dataset ids: click Dataset list to obtain the dataset ids first. One or multiple tomograms can be used as training tomograms, but make sure that the training dataset ids are selected from {0, 1, 2, ..., n-1}, where n is the total number of tomograms shown in Dataset list. Two ways to set the dataset ids are provided.

Val dataset ids: click Dataset list to obtain the dataset ids first. Note: only one tomogram can be selected as the val dataset.

Saved configs can be reloaded via Load configs next time instead of filling in the parameters again. In practice, the default parameters can give you good results.
Inference parameter description: the training configs and the network weights generated in the Training step are required here.

Coord format conversion
The predicted coordinate file, with extension *.coords, has four columns: class_id, x, y, z. To facilitate subsequent subtomogram averaging, format conversion of the coordinate file is provided: *.box for EMAN2, *.star for RELION, and *.coords for RELION.

Showing Tomogram
You can click the Load tomogram button on this page to load the tomogram.
Showing Labels
After loading the coordinates file by clicking Load labels, you can click Show result to visualize the labels. The label diameter and width can also be tuned in the GUI.
Parameter Adjustment
To improve the visibility of particles, Gaussian filtering and histogram equalization are provided.
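As a rough sketch of what these two enhancements do (an illustrative NumPy stand-in, not the GUI's actual implementation; enhance_slice and its parameters are hypothetical names), a 2D slice can be smoothed and contrast-stretched like this:

```python
import numpy as np

def _gaussian_kernel1d(sigma, radius):
    # Normalized 1D Gaussian kernel of width 2*radius + 1.
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def enhance_slice(slice2d, sigma=1.0, bins=256):
    """Smooth a 2D slice with a separable Gaussian filter, then stretch
    contrast with global CDF-based histogram equalization."""
    img = slice2d.astype(np.float64)
    radius = max(1, int(3 * sigma))
    k = _gaussian_kernel1d(sigma, radius)
    # Separable Gaussian blur: filter along rows, then along columns.
    pad = np.pad(img, radius, mode="reflect")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, pad)
    smoothed = np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)
    # Histogram equalization: map each intensity through its CDF value,
    # which spreads the intensities over the full [0, 1] range.
    hist, edges = np.histogram(smoothed, bins=bins)
    cdf = hist.cumsum().astype(np.float64)
    cdf /= cdf[-1]
    idx = np.clip(np.digitize(smoothed, edges[1:-1]), 0, bins - 1)
    return cdf[idx]
```

Gaussian smoothing suppresses high-frequency noise, while histogram equalization spreads intensities over the full range so that low-contrast particles stand out.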
Manual picking

After loading the tomogram and pressing 'enable', you can pick particles manually by double-clicking the left mouse button on the slices. To delete a mislabeled particle, right-click on it. You can specify a different category id per class. Always remember to save the results when you finish.
Position Slider
You can scan through the volume along the x, y, and z directions by changing their values. For scanning along z, the Up/Down arrow keys can be used as shortcuts.
If you encounter any problems during installation or use of DeepETPicker, please contact us by email guole.liu@ia.ac.cn. We will help you as soon as possible.
If you use this code for your research, please cite our paper DeepETPicker: Fast and accurate 3D particle picking for cryo-electron tomography using weakly supervised deep learning.
@article{DeepETPicker,
title={DeepETPicker: Fast and accurate 3D particle picking for cryo-electron tomography using weakly supervised deep learning},
author={Liu, Guole and Niu, Tongxin and Qiu, Mengxuan and Zhu, Yun and Sun, Fei and Yang, Ge},
journal={Nature Communications},
year={2024}
}