AI4Finance-Foundation / FinRL

FinRL: Financial Reinforcement Learning. 🔥
https://ai4finance.org
MIT License

Use LSTM as the feature extractor of financial data #688

Open Zero1366166516 opened 2 years ago

Zero1366166516 commented 2 years ago

Hello, I have a question about using LSTM as the feature extractor for financial data. I want to build an LSTM feature extractor on SB3 for financial time-series data, using the MlpPolicy network. Does FinRL currently have an LSTM feature extractor? Looking into issue #195, I see that FinRL now uses the SB3 framework, while LSTM as a feature extractor was supported in SB1. How can I migrate this smoothly? Thank you for your previous exploration.
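Roughly, what I have in mind is a minimal sketch like this (assuming SB3's BaseFeaturesExtractor API and channels-first observations of shape (n_features, seq_len); the layer sizes are placeholders):

import gym
import torch as th
from torch import nn
from stable_baselines3.common.torch_layers import BaseFeaturesExtractor

class LSTMExtractor(BaseFeaturesExtractor):
    def __init__(self, observation_space: gym.spaces.Box, features_dim: int = 64):
        super().__init__(observation_space, features_dim)
        n_features = observation_space.shape[0]
        # batch_first=True expects input shaped (batch, seq_len, n_features)
        self.lstm = nn.LSTM(input_size=n_features, hidden_size=features_dim,
                            batch_first=True)

    def forward(self, observations: th.Tensor) -> th.Tensor:
        # (batch, n_features, seq_len) -> (batch, seq_len, n_features)
        x = observations.permute(0, 2, 1)
        _, (h_n, _) = self.lstm(x)
        # use the last hidden state as the extracted feature vector
        return h_n[-1]

One caveat I have read about: an SB3 features extractor is applied per observation, so the LSTM only sees the history packed into each observation window; true step-to-step recurrence would need a recurrent policy such as RecurrentPPO in sb3-contrib.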

aidanmclaughlin commented 2 years ago

Great question, I was wondering the same thing.

YangletLiu commented 2 years ago

@Zero1366166516 great question! Your code for an LSTM feature extractor is welcome! I will ask one or two group members to work with you; this would be very valuable to our community members. I have heard users request this feature several times.

Zero1366166516 commented 2 years ago

I previously made a CNN feature extractor for the MlpPolicy network, which I can share with everyone. CNN feature extractor:


import gym
import numpy as np
import torch as th
from torch import nn
from stable_baselines3.common.torch_layers import BaseFeaturesExtractor


class CustomCNN(BaseFeaturesExtractor):
    """1-D CNN feature extractor for financial time-series observations.

    :param observation_space: (gym.Space) a channels-first Box,
        i.e. shape (n_channels, seq_len).
    :param features_dim: (int) Number of features extracted.
        This corresponds to the number of units in the last layer.
    """

    def __init__(self, observation_space: gym.spaces.Box, features_dim: int = 64):
        super().__init__(observation_space, features_dim)
        # We assume channels-first input; any re-ordering is done by
        # preprocessing or a wrapper.
        n_input_channels = observation_space.shape[0]

        self.cnn = nn.Sequential(
            nn.Conv1d(n_input_channels, 32, kernel_size=1, stride=1, padding=0),
            nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=1, stride=1, padding=0),
            nn.ReLU(),
            nn.Flatten(),
        )

        # Compute the flattened size by doing one forward pass on a sample.
        with th.no_grad():
            n_flatten = self.cnn(
                th.as_tensor(observation_space.sample()[None]).float()
            ).shape[1]

        self.linear = nn.Sequential(nn.Linear(n_flatten, features_dim), nn.Tanh())

    def forward(self, observations: th.Tensor) -> th.Tensor:
        # Note: the layers must be built once in __init__. The original draft
        # rebuilt self.cnn and self.linear on every forward call under
        # th.no_grad(), which discards all learned weights and prevents
        # the extractor from ever training.
        return self.linear(self.cnn(observations))

The following is the calling code.

# Module-level default, renamed so the get_model parameter does not shadow it.
POLICY_KWARGS = dict(
    features_extractor_class=CustomCNN,
    net_arch=dict(qf=[128, 128], pi=[256, 256]),
)

def get_model(
    self,
    model_name: str,
    policy: str = "MlpPolicy",
    policy_kwargs: dict = None,
    model_kwargs: dict = None,
    verbose: int = 1,
) -> Any:

    if model_name not in MODELS:
        raise NotImplementedError(f"Model '{model_name}' is not implemented.")

    # Avoid mutable default arguments; fall back to the module-level kwargs.
    if policy_kwargs is None:
        policy_kwargs = POLICY_KWARGS
    if model_kwargs is None:
        model_kwargs = MODEL_KWARGS[model_name]

    if "action_noise" in model_kwargs:
        n_actions = self.env.action_space.shape[-1]
        model_kwargs["action_noise"] = NOISE[model_kwargs["action_noise"]](
            mean=np.zeros(n_actions), sigma=0.1 * np.ones(n_actions)
        )

    model = MODELS[model_name](
        policy=policy,
        env=self.env,
        tensorboard_log="{}/{}".format(config.TENSORBOARD_LOG_DIR, model_name),
        verbose=verbose,
        policy_kwargs=policy_kwargs,
        **model_kwargs,
    )
    return model
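It is then used roughly like this (a sketch only; DRLAgent stands for whatever wrapper class owns self.env, and "sac" for any key in MODELS):

agent = DRLAgent(env=train_env)
model = agent.get_model("sac", policy_kwargs=POLICY_KWARGS)
trained_model = model.learn(total_timesteps=100_000)
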
Zero1366166516 commented 2 years ago

I would like to ask whether this algorithm can be used in real stock trading, and how to carry out the next day's operations in the real world.

YangletLiu commented 2 years ago

I would like to ask whether this algorithm can be used in real stock trading, and how to carry out the next day's operations in the real world.

Can you elaborate more? Are you asking for feedback about its feasibility in a live trading task? Or would you like to plug it into a paper trading demo?

Zero1366166516 commented 2 years ago

My idea is to use the DRL algorithm for portfolio management in the real world, for example to predict whether positions should be adjusted tomorrow and how to act on that in trading.

YangletLiu commented 2 years ago

My idea is to use the DRL algorithm for portfolio management in the real world, for example to predict whether positions should be adjusted tomorrow and how to act on that in trading.

Could I understand this as a general question about applying DRL to real-world portfolio allocation? As far as I know, DRL is a quite powerful tool; tens of hedge funds and investment banks have deployed it, and they achieved around 40% annual return over the past year, though they probably have not invested significant capital yet.

ElegantRL has polished DRL algorithms and is easily adapted to finance applications. If you need algorithmic support, please let me know.

Zero1366166516 commented 2 years ago

Thank you very much for your help. Let me describe my situation first. I am interested in quantitative investment and am also a stockholder. I found that the DRL models currently used to study finance take MLP as the feature extractor, and I want to try CNN and LSTM instead. Following the SB3 examples, I have adapted a CNN feature extractor, and the effect is decent: in my backtests, MLP as the feature extractor achieved at best a 13% annualized yield, while CNN achieved at best 20%. I have two problems to solve:

  1. How can the current algorithm be used for live trading? Right now only historical data is used, for backtesting; I want to use the DRL algorithm in actual transactions. What should I do? (A rough sketch of what I imagine is below.)

  2. I have written the CNN feature extractor, but I don't know whether anyone has tried this before or how well it works. For the LSTM feature extractor, I am still gathering information and getting ready to start. I can write one first; if there are errors, let's study them together.
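What I imagine for problem 1 is a small daily loop like this (a sketch only; build_observation and submit_orders are placeholders for the data pipeline and broker-specific code):

from stable_baselines3 import SAC

model = SAC.load("trained_model")  # an agent trained and backtested offline

def rebalance_daily():
    # placeholder: assemble today's state (prices, indicators, current holdings)
    obs = build_observation()
    # deterministic inference with the trained policy
    action, _ = model.predict(obs, deterministic=True)
    # placeholder: translate the action vector into broker orders
    submit_orders(action)
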
YangletLiu commented 2 years ago

You may be interested in this repo: https://github.com/AI4Finance-Foundation/FinRL-Live-Trading The team is trying to release code for live trading. However, it may take some time.

If you like, please interact with the members there.

Athe-kunal commented 2 years ago

Hi @Zero1366166516, I suggest you try RLlib. They have an LSTMPolicy which you can customise easily.
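For example, with the older ray.rllib.agents API (a sketch; the env name is a placeholder for a registered gym environment):

from ray.rllib.agents.ppo import PPOTrainer

config = {
    "env": "your_finance_env",  # placeholder: register your trading env first
    "framework": "torch",
    "model": {
        "use_lstm": True,       # wrap the default network with an LSTM
        "lstm_cell_size": 64,
        "max_seq_len": 20,
    },
}
trainer = PPOTrainer(config=config)
for _ in range(10):
    trainer.train()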