This repository contains an implementation of a GAN-based method for generating real-valued financial time series. See, for instance, Real-valued (Medical) Time Series Generation with Recurrent Conditional GANs.
Main features:
During conditional training, daily deltas that are given as additional input to the generator are sampled from a Gaussian distribution estimated from real data via maximum likelihood.
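The delta sampling described above can be sketched as follows. This is a minimal illustration, not the repository's code: the function names fit_gaussian_mle and sample_deltas are hypothetical, the input values are toy numbers, and the (batch_size, 1) output shape is an assumption about how the generator expects its conditioning input.

```python
import numpy as np

def fit_gaussian_mle(real_deltas):
    """Return the maximum-likelihood mean and standard deviation of a 1-D sample."""
    real_deltas = np.asarray(real_deltas, dtype=np.float64)
    mu = real_deltas.mean()
    # The MLE of the variance divides by N (ddof=0), not by N - 1.
    sigma = real_deltas.std(ddof=0)
    return mu, sigma

def sample_deltas(mu, sigma, batch_size, rng=None):
    """Draw one delta per generated series, shaped (batch_size, 1) by assumption."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.normal(loc=mu, scale=sigma, size=(batch_size, 1))

# Toy values standing in for deltas computed from the real data.
mu, sigma = fit_gaussian_mle([0.1, -0.2, 0.05, 0.3])
deltas = sample_deltas(mu, sigma, batch_size=8, rng=np.random.default_rng(0))
```

For a Gaussian, the maximum-likelihood estimates are simply the sample mean and the biased sample variance, which is why ddof=0 is used here.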
The original data is provided in CSV format; the values of the time series are taken from the btp_price feature.
Minimal preprocessing, including normalization in the range [-1,1], is done inside btp_dataset.py. The resulting dataset has 173 sequences of length 96, for an overall tensor shape of (173 x 96 x 1).
If you use a dataset that is not compatible with this preprocessing, you can just write your own loader.
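For a custom loader, the preprocessing could look roughly like the sketch below. It is written under assumptions: the source does not say whether btp_dataset.py uses min-max scaling or non-overlapping windows, so both choices here are illustrative, and make_sequences is a hypothetical name.

```python
import numpy as np

def make_sequences(values, seq_len=96):
    """Scale a 1-D series to [-1, 1] and cut it into (n_seq, seq_len, 1) windows."""
    values = np.asarray(values, dtype=np.float32)
    lo, hi = values.min(), values.max()
    if hi == lo:
        raise ValueError("a constant series cannot be scaled to [-1, 1]")
    # Min-max scaling to [-1, 1] (assumed normalization scheme).
    scaled = 2.0 * (values - lo) / (hi - lo) - 1.0
    # Non-overlapping windows (assumed); the trailing remainder is dropped.
    n_seq = len(scaled) // seq_len
    if n_seq == 0:
        raise ValueError("series is shorter than seq_len")
    return scaled[: n_seq * seq_len].reshape(n_seq, seq_len, 1)

# Toy series standing in for the values of a price column.
series = np.sin(np.linspace(0.0, 20.0, 1000))
data = make_sequences(series)
```

The resulting array has the same (sequences x length x 1) layout as the dataset described above, so it can be converted to a tensor and fed to the training code, provided the rest of the pipeline makes no further assumptions about the data.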
The files and directories composing the project are:
main.py: runs the training. It can save model checkpoints and images of generated time series, and provides visualizations (loss, gradients) via tensorboard. Run python main.py -h to see all the options.
generate_dataset.py: generates a fake dataset using a trained generator. The path of the generator checkpoint and the path of the output *.npy file for the dataset must be passed as options. Optionally, the path of a file containing daily deltas (one per line) for conditioning the time series generation can be provided.
finetune_model.py: fine-tunes a trained generator using purely supervised training. This is discouraged: it is generally better to train in a supervised and unsupervised way jointly.
models/: directory containing the model architectures for both the discriminator and the generator.
utils.py: contains some utility functions. It also contains a DatasetGenerator class that is used for fake dataset generation.
main_cgan.py: runs training with standard conditional GANs. It does not produce good results, but it is kept for reference.
By default, during training, model weights are saved into the checkpoints/ directory, snapshots of generated series into images/ and tensorboard logs into log/.
Use:
tensorboard --logdir log
from inside the project directory to run tensorboard on the default port (6006).
Run training with a recurrent generator and a convolutional discriminator, conditioning the generator on deltas and alternating adversarial and supervised optimization:
python main.py --dataset_path some_dataset.csv --delta_condition --gen_type lstm --dis_type cnn --alternate --run_tag cnn_dis_lstm_gen_alternte_my_first_trial
Generate the fake dataset prova.npy using the deltas contained in delta_trial.txt and a model trained for 70 epochs:
python generate_dataset.py --delta_path delta_trial.txt --checkpoint_path checkpoints/cnn_conditioned_alternate1_netG_epoch_70.pth --output_path prova.npy
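A delta file for --delta_path contains one value per line. The snippet below is a minimal sketch of how such a file might be written and read back with NumPy; the toy values and the six-decimal number format are assumptions, since the exact format accepted by generate_dataset.py is not documented here.

```python
import numpy as np

# Toy deltas; real values should be on the same scale as the training data.
deltas = np.array([0.01, -0.02, 0.005])

# np.savetxt writes a 1-D array as one value per line.
np.savetxt("delta_trial.txt", deltas, fmt="%.6f")

# Read the file back to check its contents; ndmin=1 keeps a 1-D array
# even when the file holds a single line.
loaded = np.loadtxt("delta_trial.txt", ndmin=1)
```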
Finetune checkpoint of generator with supervised training:
python finetune_model.py --checkpoint checkpoints/cnn_dis_lstm_gen_noalt_new_netG_epoch_39.pth --output_path finetuned.pth