unkuseni / rs_smm


mid_price_regression #16

Open jpmediadev opened 1 week ago

jpmediadev commented 1 week ago

Hi Olaseni! I’ve had the chance to look through your project, and I must say it’s really interesting! I appreciate the effort that has gone into incorporating such a detailed and structured approach. I’m curious—what resources or authors have inspired you in developing this project? It would be great to hear more about your influences and how you arrived at your design decisions.

That being said, I have a few observations about the current approach that I hope might be helpful:

pub fn mid_price_regression(
    mid_price_array: Array1<f64>,
    features: Array2<f64>, // imbalance_ratio, voi, ofi
    curr_spread: f64,
) -> Result<f64, String> {
    // Normalize features if needed
    let normalized_features = features.map(|&x| x / curr_spread);

    // Create the dataset
    let dataset = Dataset::new(normalized_features, mid_price_array);

    // Create and fit the model
    let model = LinearRegression::default()
        .fit(&dataset)
        .map_err(|e| format!("Failed to fit the model: {}", e))?;

    // Make predictions
    let predictions = model.predict(&dataset);

    // Return the mean of the predictions
    Ok(predictions.mean().unwrap_or(0.0))
}

1. Using features for smoothing seems unnecessary: If the goal is just to smooth mid prices, it might be simpler to work directly with the mid_prices array and apply something like a moving average (SMA, EMA) without involving features such as imbalance_ratio, VOI, or OFI. These features are useful for prediction but don't seem relevant for a smoothing operation.
2. Prediction requires new feature values: If the intent is to use regression for prediction, the model should be trained on historical data and later used to predict future values based on new features. Currently, the regression model is retrained each time on the same historical data, which doesn't provide any meaningful forecast. Ideally, the model should be trained once, and new feature values should be fed in to make predictions for future mid prices.
3. Re-training every time seems redundant: The way the regression model is currently re-trained on the same historical data is more akin to a complex way of smoothing rather than actual prediction. If the goal is smoothing, simpler methods like a moving average would suffice. If the goal is prediction, new feature inputs should be passed into a pre-trained model to forecast future prices (see the sketch after this list).
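To make the distinction concrete, here is a minimal sketch of both alternatives, reusing the same linfa Dataset / LinearRegression setup as your snippet. The function names (ema, fit_mid_price_model, predict_next_mid) are placeholders I made up, and the imports assume a recent linfa / linfa-linear / ndarray setup; treat it as a sketch, not a drop-in implementation:

use linfa::traits::{Fit, Predict};
use linfa::Dataset;
use linfa_linear::{FittedLinearRegression, LinearRegression};
use ndarray::{Array1, Array2};

// Smoothing only: an exponential moving average over the mid prices,
// with no order-book features involved.
fn ema(mid_prices: &Array1<f64>, alpha: f64) -> Array1<f64> {
    let mut out = Array1::zeros(mid_prices.len());
    let mut prev: Option<f64> = None;
    for (i, &p) in mid_prices.iter().enumerate() {
        let smoothed = match prev {
            Some(last) => alpha * p + (1.0 - alpha) * last,
            None => p,
        };
        out[i] = smoothed;
        prev = Some(smoothed);
    }
    out
}

// Prediction: fit once on historical features and mid prices...
fn fit_mid_price_model(
    historical_features: Array2<f64>, // imbalance_ratio, voi, ofi per row
    historical_mids: Array1<f64>,
) -> Result<FittedLinearRegression<f64>, String> {
    let dataset = Dataset::new(historical_features, historical_mids);
    LinearRegression::default()
        .fit(&dataset)
        .map_err(|e| format!("Failed to fit the model: {}", e))
}

// ...then feed fresh feature rows into the already-fitted model to get a
// forecast, instead of re-fitting on the same history every call.
fn predict_next_mid(model: &FittedLinearRegression<f64>, latest_features: &Array2<f64>) -> f64 {
    model.predict(latest_features)[0]
}

The key point is that fit runs once over history, while predict is called repeatedly with fresh imbalance/VOI/OFI rows as they arrive.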

I think a slight adjustment in how the regression is used could bring the code closer to achieving its intended purpose. Thanks again for sharing your project—it’s definitely promising, and I’m excited to see how it evolves!

unkuseni commented 4 days ago

I'm so glad you commented about this, because I've been looking into some papers and found that linear regression seems to be the simplest way of predicting future prices from features, so I had to go learn it. I'm still a noob and testing things out, but if you have any ideas or suggestions on how I can improve, please don't hesitate to let me know. Some of the papers I read were written by Stoikov, Cartea, and Avellaneda; I'll find the DOI of each paper and add them in the next comment.

If you have some free time, I'd also like to pick your brain on how you'd decide on order replacement/amendments depending on fills, time, and order book events/volatility.

unkuseni commented 4 days ago

Some of the books/papers are:

- Algorithmic and High-Frequency Trading by Álvaro Cartea, Sebastian Jaimungal, José Penalva
- Enhancing Trading Strategies with Order Book Signals by Álvaro Cartea, Ryan Donnelly & Sebastian Jaimungal, https://doi.org/10.1080/1350486X.2018.1434009
- Market Making with Alpha Signals by Cartea & Wang, https://doi.org/10.1142/S0219024920500168
- Comparison of Different Market Making Strategies for High Frequency Traders by Yibing Xiong, Takashi Yamada, Takao Terano
- Spoofing and Manipulating Order Books by Álvaro Cartea, Patrick Chang, Gabriel García-Arenas (I was hoping this would increase fill rates with bad inventory)
- The Price Impact of Order Book Events by Rama Cont, Arseniy Kukanov and Sasha Stoikov
- Order Imbalance Based Strategy in High Frequency Trading by Darryl Shen

jpmediadev commented 3 days ago

Thanks for sharing the list. I'm also searching for solutions, and while the papers are full of complex mathematical formulas, seeing your implementation in code makes them much easier to understand.

I’ve tried using more advanced ML models in the past, but it quickly turned into a black hole, with the complexity growing exponentially, and it became unclear what exactly wasn’t working. So, I switched back to simpler models with the idea of gradually making them more sophisticated.

At the moment, I’m using price data from several exchanges as a predictor for the mid-price and placing a grid of orders based on that. However, I’m considering ways to make the algorithm smarter by adding short-term prediction.
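In case it helps make that setup concrete, the grid part can be sketched in a few lines; quote_grid, step, and levels are just illustrative names I'm using here, not anything from either of our codebases:

// Hypothetical sketch: a symmetric grid of quotes around a predicted mid-price.
// `step` (price spacing) and `levels` (orders per side) are illustrative knobs.
fn quote_grid(predicted_mid: f64, step: f64, levels: usize) -> (Vec<f64>, Vec<f64>) {
    let bids: Vec<f64> = (1..=levels).map(|i| predicted_mid - step * i as f64).collect();
    let asks: Vec<f64> = (1..=levels).map(|i| predicted_mid + step * i as f64).collect();
    (bids, asks)
}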

unkuseni commented 3 days ago

I might just consider your idea of aggregating data from multiple exchanges, but how do you weight the exchanges considering liquidity?

jpmediadev commented 3 days ago

I choose the top exchanges by liquidity, take, for example, 10 bid/ask levels on each, and weight the price according to the volume. In fact there are many options for creativity, and everything depends on the properties of the asset.
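For reference, here's a minimal sketch of that weighting, assuming each exchange's book is already sorted best-first; the Level/Book types, the depth parameter, and volume_weighted_mid are illustrative names, not from either project:

// Hypothetical sketch: size-weighted price over the top `depth` bid/ask
// levels of every selected exchange's order book.
struct Level {
    price: f64,
    size: f64,
}

struct Book {
    bids: Vec<Level>, // best bid first
    asks: Vec<Level>, // best ask first
}

fn volume_weighted_mid(books: &[Book], depth: usize) -> Option<f64> {
    let mut weighted_sum = 0.0;
    let mut total_size = 0.0;
    for book in books {
        for level in book.bids.iter().take(depth).chain(book.asks.iter().take(depth)) {
            weighted_sum += level.price * level.size;
            total_size += level.size;
        }
    }
    if total_size > 0.0 {
        Some(weighted_sum / total_size)
    } else {
        None
    }
}

Weighting each level's price by its size across, say, the top 10 levels of every selected exchange gives one aggregate price, so the liquidity weighting across exchanges falls out of the sizes themselves.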

unkuseni commented 22 hours ago

Thanks, I'll let you know how it turns out.