This repository contains a PyTorch implementation of the DCEC method (Deep Clustering with Convolutional Autoencoders), with some improvements to the network architectures.
The clustering code was developed for the Master's thesis "Automatic analysis of images from camera-traps" by Michal Nazarczuk from Imperial College London.
The following libraries must be installed for the code to run properly:
The code was written and tested on Python 3.4.1
Just clone the repository to a local folder:
git clone https://github.com/michaal94/torch_DCEC
In order to test the basic version of the clustering, run it with the Python distribution you installed the libraries for (Anaconda, Virtualenv, etc.). In general, type:
cd torch_DCEC
python3 torch_DCEC.py
The example will run sample clustering with MNIST-train dataset.
The algorithm offers plenty of options for adjustment:
Mode choice: full training or pretraining only, use:
--mode train_full
or --mode pretrain
For full training you can specify whether to run the pretraining phase (--pretrain True) or use a saved network (--pretrain False), in the latter case together with
--pretrained_net ("path" or idx)
giving the path or index (see catalog structure) of the pretrained network.
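A rough sketch of how these flags combine (illustrative only; the function and phase names are not the repository's actual API):

```python
# Hypothetical sketch of how --mode, --pretrain and --pretrained_net interact;
# names are illustrative, not taken from torch_DCEC.py.
def plan_phases(mode, pretrain=True, pretrained_net=None):
    """Return the list of training phases implied by the CLI flags."""
    if mode == "pretrain":
        return ["pretrain"]                    # autoencoder pretraining only
    if mode == "train_full":
        if pretrain:
            return ["pretrain", "train_full"]  # pretrain, then full clustering
        if pretrained_net is None:
            raise ValueError("--pretrained_net is required when --pretrain False")
        return ["load:" + str(pretrained_net), "train_full"]
    raise ValueError("unknown mode: " + mode)

print(plan_phases("train_full", pretrain=False, pretrained_net="nets/CAE_3_001.pt"))
```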
A custom dataset should use the following directory structure (the cluster folders need to correspond to the real clustering only for the statistics to be meaningful):
-data_directory
  -cluster_1
    -image_1
    -image_2
    -...
  -cluster_2
    -image_1
    -image_2
    -...
  -...
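A small helper (not part of the repository) can illustrate how such a tree maps clusters to image lists; here it builds a tiny example layout in a temporary folder and scans it:

```python
import os
import tempfile

# Illustrative helper that lists clusters and images in the directory
# layout expected by --dataset custom.
def scan_dataset(root):
    clusters = {}
    for cluster in sorted(os.listdir(root)):
        cdir = os.path.join(root, cluster)
        if os.path.isdir(cdir):
            clusters[cluster] = sorted(os.listdir(cdir))
    return clusters

# Build a tiny example tree matching the structure above, then scan it.
root = tempfile.mkdtemp()
for c in ("cluster_1", "cluster_2"):
    os.makedirs(os.path.join(root, c))
    for i in (1, 2):
        open(os.path.join(root, c, f"image_{i}.png"), "w").close()

print(scan_dataset(root))
# e.g. {'cluster_1': ['image_1.png', 'image_2.png'], 'cluster_2': [...]}
```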
Dataset choice, use one of the following:
--dataset MNIST-train
--dataset MNIST-test
--dataset MNIST-full
or --dataset custom
(use the last one with the path to your data,
--dataset_path 'path to your dataset'
and the transformation you want for the images,
--custom_img_size [height, width, depth]
)
Different network architectures:
--net_architecture CAE_3
--net_architecture CAE_3bn (with batch normalization)
--net_architecture CAE_4
--net_architecture CAE_4bn (with batch normalization)
--net_architecture CAE_5
--net_architecture CAE_5bn (with batch normalization)
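A back-of-the-envelope check of why the deeper variants suit larger inputs, assuming (this is an assumption, not the repository's exact layer spec) that each CAE_n stacks n stride-2 convolutions that roughly halve the spatial size:

```python
# Assumed: CAE_n applies n stride-2 downsampling convolutions,
# each roughly halving the spatial resolution.
def downsampled_size(size, n_layers, stride=2):
    for _ in range(n_layers):
        size = size // stride
    return size

for n in (3, 4, 5):
    print(f"CAE_{n}: 128 -> {downsampled_size(128, n)}")
# CAE_3: 128 -> 16, CAE_4: 128 -> 8, CAE_5: 128 -> 4
```

Under this assumption, only CAE_5 reduces a 128x128 image to a compact 4x4 feature map, which matches the note that the 5-layer variants are used for 128x128 photos.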
(CAE_5 and CAE_5bn are used for 128x128 photos.)
The following options may be used for model changes:
--leaky True/False (True provided better results)
--neg_slope value (values around 0.01 were used)
--activations True/False (False provided better results)
--bias True/False
--rate value (0.001 is a reasonable value for Adam)
--rate_pretrain value (0.001 can be used as well)
--weight value (0 was used)
--weight_pretrain value
--sched_step value
--sched_step_pretrain value
--sched_gamma value
--sched_gamma_pretrain value
--gamma value (a value of 0.1 provided good results)
--update_interval value (choose the value so that the distribution is updated every 1000-2000 photos)
--tol value (depends on the dataset; 0.01 was used for small datasets, 0.001 for bigger ones, e.g. MNIST)
--num_clusters value
--batch_size value (depends on your device, but remember that too large a batch may be bad for convergence)
--epochs value
--epochs_pretrain value (300 epochs were used: 200 with a 0.001 learning rate and 100 with a 10 times smaller one, i.e. --sched_step_pretrain 200, --sched_gamma_pretrain 0.1)
--printing_frequency value
--tensorboard True/False
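The rule of thumb for --update_interval (refresh the target distribution every 1000-2000 photos) can be turned into a quick calculation; this helper is illustrative, not part of the repository:

```python
# Estimate --update_interval (in batches) from the rule of thumb of
# updating the target distribution every 1000-2000 images.
def suggest_update_interval(images_per_update, batch_size):
    return max(1, images_per_update // batch_size)

print(suggest_update_interval(1500, 256))  # -> 5 batches per update
print(suggest_update_interval(2000, 128))  # -> 15 batches per update
```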
The code creates the following catalog structure when reporting the statistics:
-Reports
  -(net_architecture_name)_(index).txt
-Nets (copies of weights)
  -(net_architecture_name)_(index).pt
  -(net_architecture_name)_(index)_pretrained.pt
-Runs
  -(net_architecture_name)_(index) <- directory containing the tensorboard event file
The files are indexed automatically so that they are not accidentally overwritten.
For semi-supervised clustering, visit my other repository.