Closed · JaCoderX closed this 6 years ago
@JacobHanouna,
> will it make sense to change `class bVAENA3C(BaseAAC):` to `class bVAENA3C(PPO):`
Yes; you can see that the `PPO` class is simply a wrapper specifying a particular loss function for `BaseAAC`;
I should warn, however, that my experiments with the b-VAE architecture didn't yield any sensible results; it can either be some implementation fault or something else; using causal convolution encoders turned out to be far more beneficial (see `btgym/research/casual_conv/`) in terms of policy optimisation performance.
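As for the swap itself, a minimal sketch of what it could look like in `b_vae_a3c.py` (untested, and assuming `PPO` is importable from `btgym.algorithms` as in the examples; the `name` default here is a made-up placeholder):

```python
from btgym.algorithms import PPO


class bVAENA3C(PPO):  # was: class bVAENA3C(BaseAAC)
    # PPO is itself a thin wrapper around BaseAAC that swaps in the
    # PPO surrogate loss, so the rest of the class body can stay as is.
    # Taking 'name' as an explicit keyword here is one way to dodge
    # a duplicate-keyword collision on the 'name' parameter:
    def __init__(self, name='bVAE_PPO', **kwargs):
        super(bVAENA3C, self).__init__(name=name, **kwargs)
```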
> my experiments with b-VAE architecture didn't yield any sensible results
That is very disappointing to hear; I really had high hopes for variational autoencoders, as they are state-of-the-art and perform very well in many fields.
> using causal convolution encoders turned out to be far more beneficial (see `btgym/research/casual_conv/`) in terms of policy optimisation performance.
In `btgym/research/casual_conv/networks.py` there are two encoders: from your experience, does the 'attention' version give better results?
If I want to experiment with that, would changing `btgym/research/casual_conv/policy.py` be enough:

```python
class CasualConvPolicy_0_0(AacStackedRL2Policy):
    def __init__(
            self,
            state_encoder_class_ref=conv_1d_casual_attention_encoder,
            ...
```
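(For reference, an equivalent way to experiment without editing `policy.py` would be to pass the encoder reference through the policy config; this assumes the second, plain encoder is exported from `networks.py` as `conv_1d_casual_encoder`:)

```python
from btgym.research.casual_conv.networks import (
    conv_1d_casual_encoder,
    conv_1d_casual_attention_encoder,
)
from btgym.research.casual_conv.policy import CasualConvPolicy_0_0

# Config-level swap: pick either encoder without touching policy.py,
# overriding the default keyword shown in the constructor above:
policy_config = dict(
    class_ref=CasualConvPolicy_0_0,
    kwargs=dict(
        state_encoder_class_ref=conv_1d_casual_encoder,
        # or: conv_1d_casual_attention_encoder
    ),
)
```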
Also, it looks like no special training algo is needed for those encoders, right? Just using the normal PPO?
Sorry if some of my questions are trivial, I'm learning as I go :)
> really had high hopes for variational autoencoders
As I said, it can be my implementation flaw or just a matter of hyperparameter tuning, etc.; maybe you can do better if you dive into it; the other side is that b-VAE, to my best knowledge, does well in domains where an adequate data model can be learnt with a relatively low number of generating factors; it can simply not be the case with financial data; anyway, it is an open direction for research;
> from your experience, does the 'attention' version give better results?
Not much, but there is a high probability of improving it by implementing multi-head attention; an open direction also;
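For what it's worth, a minimal multi-head self-attention sketch in TensorFlow 1.x style (matching btgym's stack); the names, shapes and defaults are assumptions rather than the repo's existing API, and a causal mask would still be needed to keep the encoder causal:

```python
import tensorflow as tf


def multi_head_attention(inputs, num_heads=4, head_dim=16, name='mha'):
    """Hypothetical multi-head self-attention over [batch, time, channels]."""
    with tf.variable_scope(name):
        d_model = num_heads * head_dim
        # Linear projections to queries, keys and values:
        q = tf.layers.dense(inputs, d_model, name='q')
        k = tf.layers.dense(inputs, d_model, name='k')
        v = tf.layers.dense(inputs, d_model, name='v')

        def split_heads(x):
            # [batch, time, d_model] -> [batch, heads, time, head_dim]
            s = tf.shape(x)
            x = tf.reshape(x, [s[0], s[1], num_heads, head_dim])
            return tf.transpose(x, [0, 2, 1, 3])

        q, k, v = split_heads(q), split_heads(k), split_heads(v)

        # Scaled dot-product attention, computed per head in parallel:
        logits = tf.matmul(q, k, transpose_b=True) / head_dim ** 0.5
        weights = tf.nn.softmax(logits)
        heads = tf.matmul(weights, v)  # [batch, heads, time, head_dim]

        # Concatenate heads back into [batch, time, d_model]:
        heads = tf.transpose(heads, [0, 2, 1, 3])
        s = tf.shape(heads)
        return tf.reshape(heads, [s[0], s[1], d_model])
```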
> also, it looks like no special training algo is needed for those encoders, right? Just using the normal PPO?
Exactly; no special algo there :)
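So a hypothetical end-to-end wiring, following the `class_ref`/`kwargs` config pattern used in btgym's examples (all kwargs left as placeholders):

```python
from btgym.algorithms import PPO
from btgym.research.casual_conv.policy import CasualConvPolicy_0_0

# Plain PPO trainer: unlike the auxiliary b-VAE setup, the causal-conv
# encoder adds no extra reconstruction objective to the loss.
trainer_config = dict(
    class_ref=PPO,
    kwargs=dict(),  # usual PPO hyperparameters go here
)

policy_config = dict(
    class_ref=CasualConvPolicy_0_0,
    kwargs=dict(),  # e.g. the encoder choice, as sketched above
)
```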
thanks for the replies :)
This is somewhat a follow-up question to "Using Autoencoders in btgym architecture".
I'm interested in experimenting with your research algo "Stacked LSTM with auxillary b-Variational AutoEncoder" and also combining it with the PPO training module.
Will it make sense to change

```python
class bVAENA3C(BaseAAC):
```

to

```python
class bVAENA3C(PPO):
```

(in `b_vae_a3c.py`) just to experiment with it? Or is it not compatible / does it have collisions (like with the `name` parameter) / does it not make sense at all?