QuantConnect / Lean.DataSource.SDK


ToolBox Addition: IVolatility equity and options converter #19

Open feribg opened 6 years ago

feribg commented 6 years ago

I just want to get the conversation going on an IVolatility data source converter. I have already started some preliminary work, but a few questions have come up. I only have minute equity and options data, so that is the resolution I will be implementing and testing at.

Their data comes as gzipped CSV, one row per minute, with these fields: underlying type (index, stock, etf), price bid, price ask, price last, date_bid, date_ask, date_last, size_bid, size_ask, size_last, exchange_bid, exchange_ask, exch_last, volume. Prices are not split or dividend adjusted, but dividend, corporate action, interest rate, etc. data is provided for each trading day.
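A minimal sketch of reading one such file. The header names below are assumptions derived from the field list above (the real IVolatility headers may be spelled differently), and `read_minute_rows` is a hypothetical helper, not part of LEAN's ToolBox.

```python
import csv
import gzip

# Hypothetical header names, taken from the field list above; real files may differ.
PRICE_FIELDS = ("price_bid", "price_ask", "price_last")


def read_minute_rows(path):
    """Yield one dict per minute row from a gzipped CSV file."""
    # Text mode lets csv.DictReader consume the decompressed stream directly.
    with gzip.open(path, mode="rt", newline="") as handle:
        for row in csv.DictReader(handle):
            for name in PRICE_FIELDS:
                # Empty or missing cells become None instead of raising on float("").
                row[name] = float(row[name]) if row.get(name) else None
            yield row
```

Reading lazily keeps memory flat for large minute files; all other columns are left as strings for the caller to interpret.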

Options are pretty much identical but also come with greeks and IV, type, expiration, etc.

I had a look at the Algo Seek implementations but a couple of questions come up:

  1. How should I handle the stock adjustments? Should I be importing the splits and dividend data somewhere, or use another data source for that?
  2. Since the data is bid/ask, I assume we need to dump to QuoteBar. However, that needs OHLC and we only have 1 value. Should I just replicate that 4 times, or is there another approach to take here?
  3. Is there any way to save the IV and greeks, similar to how a dedicated open interest file is generated per day, so those are not regenerated every time?

jaredbroad commented 6 years ago

Welcome @feribg !

> How should I handle the stock adjustments? Should I be importing the splits and dividend data somewhere, or use another data source for that?

The LEAN philosophy is that the raw data should be as dumb as possible, so LEAN can morph the data as it needs at runtime. Splits and dividends aren't included at this level yet (i.e. just parse the raw exchange output). We're working on how to handle splits/dividends at the moment, as it wasn't as simple as we believed in the current implementation.

> Since the data is bid/ask, I assume we need to dump to QuoteBar. However, that needs OHLC and we only have 1 value. Should I just replicate that 4 times, or is there another approach to take here?

Good question. Unfortunately yes, it'll need to be duplicated 4x. Please test whether you can just set the close: if the parsers, fill models, and regression tests still work, we could try setting just 1 value.
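The 4x duplication can be sketched as follows. `Bar` and `FlatQuoteBar` are illustrative stand-ins invented for this example, not LEAN's `QuoteBar` class, and the argument names simply mirror the IVolatility fields mentioned earlier in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bar:
    open: float
    high: float
    low: float
    close: float


@dataclass(frozen=True)
class FlatQuoteBar:
    """Illustrative stand-in for a quote bar; not LEAN's QuoteBar class."""
    bid: Bar
    ask: Bar
    bid_size: int
    ask_size: int


def flat_bar(price):
    # A single snapshot price fills all four slots: O = H = L = C.
    return Bar(price, price, price, price)


def to_quote_bar(price_bid, price_ask, size_bid, size_ask):
    """Build a quote bar from one bid/ask snapshot per minute."""
    return FlatQuoteBar(flat_bar(price_bid), flat_bar(price_ask), size_bid, size_ask)
```

For example, `to_quote_bar(10.0, 10.2, 3, 5)` gives a bid bar whose open, high, low, and close are all 10.0. Whether setting only the close is enough is exactly the open question above and is not answered by this sketch.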

> Is there any way to save the IV and greeks, similar to how a dedicated open interest file is generated per day, so those are not regenerated every time?

We opted to start with generating these at runtime but will add in an optimization for this later. Relative to the cost of synchronizing all the option data the greek calcs don't slow things down yet.

feribg commented 6 years ago

You can have a look at the equities importer. I'm not sure how to preserve the bid/ask information, which is quite important, but hopefully you can provide some insight. Let me know if you want some sample data to try it out.

https://github.com/QuantConnect/Lean/pull/1527

feribg commented 6 years ago

I'm also planning to add a FactorFileGenerator implementation as part of the importer for IVolatility, since they provide those in a nice CSV format that's already synced with the rest of the data, so it's easier to use theirs than Yahoo's. I will update the ticket and the PR when that's done.

jaredbroad commented 6 years ago

Thank you Feras,

The exchange in this case would be "usa"; it's a market field, which is used to distinguish fungible assets.

On Feb 3, 2018 17:46, "Feras Salim" notifications@github.com wrote:

Also planning to add FactorFileGenerator implementation as part of the importer for Ivol since they provide those in a nice csv format that's already synced with the rest of the data, so it's easier to use theirs than Yahoo's. I will update the ticket and the PR when that's done.


feribg commented 6 years ago

Thanks.

Factor support is pretty much done, but there's one main issue. FactorFileGenerator has a method called ReadDailyEquityData, which assumes you have generated daily data, even though we already have minute data. How would you recommend going about that? I see two options: generate all the minute data, downsample it to daily resolution, and then build the factors; or refactor the FactorFileGenerator to support minute-resolution data.
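The downsampling option could look roughly like this. It is a sketch under stated assumptions: rows arrive sorted by timestamp, each carries a single price, and `volume` is per-minute rather than cumulative. `downsample_to_daily` is a hypothetical helper and does not reflect how FactorFileGenerator reads data.

```python
from datetime import datetime


def downsample_to_daily(minute_rows):
    """Collapse (timestamp, price, volume) minute rows into daily OHLCV dicts.

    Assumes rows are sorted by timestamp and volume is per-minute.
    """
    days = {}
    for timestamp, price, volume in minute_rows:
        day = timestamp.date()
        bar = days.get(day)
        if bar is None:
            # First row of the day seeds every field.
            days[day] = {"open": price, "high": price, "low": price,
                         "close": price, "volume": volume}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last row seen so far for this day
            bar["volume"] += volume
    return days
```

The result is keyed by calendar date in first-seen order, so the previous day's close can be looked up directly when building factors.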

feribg commented 6 years ago

@jaredbroad Just wondering if you had any thoughts on that one?

mchandschuh commented 6 years ago

@feribg -- I believe all we really need is the previous day's closing price. We could abstract this away from the generator via an IClosingPriceProvider or similar. The implementation could decide how to fetch the closing price for a particular date. In your case, it could search through the minute data to find the closing price (the time stamp in the file would be 3:59 PM, since the file stores the start times of bars).

If we end up providing this abstraction, please submit this as a separate, isolated pull request.
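A rough Python sketch of the abstraction described above. The `IClosingPriceProvider` name comes from the comment; everything else (the method name, the minute-data class, the in-memory dict keyed by bar start time) is a hypothetical illustration, and it assumes a regular session whose last bar starts at 3:59 PM, so early-close days are not handled.

```python
from datetime import date, datetime, time
from typing import Dict, Optional, Protocol


class ClosingPriceProvider(Protocol):
    """Hypothetical Python analogue of the IClosingPriceProvider idea."""

    def closing_price(self, day: date) -> Optional[float]:
        ...


class MinuteDataClosingPriceProvider:
    """Finds the close in minute data keyed by bar start time."""

    # Bars are stamped with their start time, so the last regular bar starts at 15:59.
    LAST_BAR_START = time(15, 59)

    def __init__(self, minute_closes: Dict[datetime, float]):
        self._minute_closes = minute_closes

    def closing_price(self, day: date) -> Optional[float]:
        # Returns None when the day has no 15:59 bar (holiday, missing data).
        return self._minute_closes.get(datetime.combine(day, self.LAST_BAR_START))


def previous_close(provider: ClosingPriceProvider, previous_day: date) -> Optional[float]:
    """What a factor generator would call, without knowing the data resolution."""
    return provider.closing_price(previous_day)
```

With this shape, a daily-data implementation and a minute-data implementation are interchangeable from the generator's point of view.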

feribg commented 6 years ago

@mchandschuh Got it, thanks! I finalized the current PR. I will submit this and the factor generator as a separate PR.