Closed · JaCoderX closed this 6 years ago
@JacobHanouna,
> will it make sense to change `class bVAENA3C(BaseAAC):` to `class bVAENA3C(PPO):`
Yes; you can see that the `PPO` class is simply a wrapper specifying a particular loss function for `BaseAAC`;
I should warn, however, that my experiments with the b-VAE architecture didn't yield any sensible results; it can either be some implementation fault or something else; using causal convolution encoders turned out to be far more beneficial (see `btgym/research/casual_conv/`) in terms of policy optimisation performance.
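As for the swap itself, a minimal sketch of what it could look like in `b_vae_a3c.py` (untested, and assuming `PPO` is importable from `btgym.algorithms` as in the examples; the `name` default here is a made-up placeholder):

```python
from btgym.algorithms import PPO


class bVAENA3C(PPO):  # was: class bVAENA3C(BaseAAC)
    # PPO is itself a thin wrapper around BaseAAC that swaps in the
    # PPO surrogate loss, so the rest of the class body can stay as is.
    # Taking 'name' as an explicit keyword here is one way to dodge
    # a duplicate-keyword collision on the 'name' parameter:
    def __init__(self, name='bVAE_PPO', **kwargs):
        super(bVAENA3C, self).__init__(name=name, **kwargs)
```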
> my experiments with b-VAE architecture didn't yield any sensible results
That is very disappointing to hear; I really had high hopes for variational autoencoders, as they are state-of-the-art and perform very well in many fields.
> using causal convolution encoders turned out to be far more beneficial (see `btgym/research/casual_conv/`) in terms of policy optimisation performance.
In `btgym/research/casual_conv/networks.py` there are two encoders: from your experience, does the 'attention' version give better results?
If I want to experiment with that, would changing `btgym/research/casual_conv/policy.py` be enough:

```python
class CasualConvPolicy_0_0(AacStackedRL2Policy):
    def __init__(
            self,
            state_encoder_class_ref=conv_1d_casual_attention_encoder,
            ...
```
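(For reference, an equivalent way to experiment without editing `policy.py` would be to pass the encoder reference through the policy config; this assumes the second, plain encoder is exported from `networks.py` as `conv_1d_casual_encoder`:)

```python
from btgym.research.casual_conv.networks import (
    conv_1d_casual_encoder,
    conv_1d_casual_attention_encoder,
)
from btgym.research.casual_conv.policy import CasualConvPolicy_0_0

# Config-level swap: pick either encoder without touching policy.py,
# overriding the default keyword shown in the constructor above:
policy_config = dict(
    class_ref=CasualConvPolicy_0_0,
    kwargs=dict(
        state_encoder_class_ref=conv_1d_casual_encoder,
        # or: conv_1d_casual_attention_encoder
    ),
)
```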
Also, it looks like no special training algo is needed for those encoders, right? Just using the normal PPO?
Sorry if some of my questions are trivial, I'm learning as I go :)
> really had high hopes for variational autoencoders
As I said, it can be my implementation flaw or just a matter of hyperparameter tuning, etc.; maybe you can do better if you dive into it; the other side is that b-VAE, to my best knowledge, does well in domains where an adequate data model can be learnt with a relatively low number of generating factors; it can simply not be the case with financial data; anyway, it is an open direction for research;
> from your experience, does the 'attention' version give better results?
Not much, but there is a high probability of improving it by implementing multi-head attention; an open direction also;
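For what it's worth, a minimal multi-head self-attention sketch in TensorFlow 1.x style (matching btgym's stack); the names, shapes and defaults are assumptions rather than the repo's existing API, and a causal mask would still be needed to keep the encoder causal:

```python
import tensorflow as tf


def multi_head_attention(inputs, num_heads=4, head_dim=16, name='mha'):
    """Hypothetical multi-head self-attention over [batch, time, channels]."""
    with tf.variable_scope(name):
        d_model = num_heads * head_dim
        # Linear projections to queries, keys and values:
        q = tf.layers.dense(inputs, d_model, name='q')
        k = tf.layers.dense(inputs, d_model, name='k')
        v = tf.layers.dense(inputs, d_model, name='v')

        def split_heads(x):
            # [batch, time, d_model] -> [batch, heads, time, head_dim]
            s = tf.shape(x)
            x = tf.reshape(x, [s[0], s[1], num_heads, head_dim])
            return tf.transpose(x, [0, 2, 1, 3])

        q, k, v = split_heads(q), split_heads(k), split_heads(v)

        # Scaled dot-product attention, computed per head in parallel:
        logits = tf.matmul(q, k, transpose_b=True) / head_dim ** 0.5
        weights = tf.nn.softmax(logits)
        heads = tf.matmul(weights, v)  # [batch, heads, time, head_dim]

        # Concatenate heads back into [batch, time, d_model]:
        heads = tf.transpose(heads, [0, 2, 1, 3])
        s = tf.shape(heads)
        return tf.reshape(heads, [s[0], s[1], d_model])
```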
> also, it looks like no special training algo is needed for those encoders, right? Just using the normal PPO?
Exactly; no special algo there :)
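So a hypothetical end-to-end wiring, following the `class_ref`/`kwargs` config pattern used in btgym's examples (all kwargs left as placeholders):

```python
from btgym.algorithms import PPO
from btgym.research.casual_conv.policy import CasualConvPolicy_0_0

# Plain PPO trainer: unlike the auxiliary b-VAE setup, the causal-conv
# encoder adds no extra reconstruction objective to the loss.
trainer_config = dict(
    class_ref=PPO,
    kwargs=dict(),  # usual PPO hyperparameters go here
)

policy_config = dict(
    class_ref=CasualConvPolicy_0_0,
    kwargs=dict(),  # e.g. the encoder choice, as sketched above
)
```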
thanks for the replies :)
This is somewhat a follow-up question to "Using Autoencoders in btgym architecture".
I'm interested in experimenting with your research algo "Stacked LSTM with auxillary b-Variational AutoEncoder" and also combining it with the PPO training module.
Will it make sense to change

```python
class bVAENA3C(BaseAAC):
```

to

```python
class bVAENA3C(PPO):
```

(in `b_vae_a3c.py`) just to experiment with it? Or is it not compatible / does it have collisions (like with the `name` parameter) / does it not make sense at all?