polygon-io / issues

Quickly track and report problems with polygon.io

Aggregates (Per Second) Websocket Data (OHLC) doesn't match other Polygon data. #180

Closed digitalml closed 1 year ago

digitalml commented 2 years ago

Client Which WebSocket client are you using? I am using the Aggregates (Per Second) Websocket.

Issue If I collect all the data streamed from the per-second websocket over 1 minute and compare it to the REST and Websocket Aggregate Bar 1-minute endpoints, the OHLC values for that time frame don't match. An example is IMTE today at 8:34am (PST). The REST and Websocket Aggregate Bar 1-minute endpoints (and other platforms: TradingView, Lightspeed, thinkorswim, Webull) all report the high of the 8:34am (PST) bar as $5.965 ($5.97 rounded). The data from the Aggregates (Per Second) Websocket only ever shows a high of $5.955 ($5.96 rounded). This happens consistently throughout the day on all symbols. The mismatch can appear in any of the OHLC fields.

Expected Result The Aggregates (Per Second) Websocket data, accumulated over a minute, should match the 1-minute aggregate bar.
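The comparison described in this report can be sketched as follows. This is a minimal sketch, not Polygon's aggregation logic: the `Bar` type and the `roll_up` and `mismatched_fields` helpers are hypothetical names introduced for illustration, and it assumes each per-second bar carries its start time in epoch milliseconds.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Bar:
    start_ms: int   # bar start time, Unix epoch milliseconds (assumed)
    open: float
    high: float
    low: float
    close: float
    volume: float


def roll_up(second_bars: Iterable[Bar], minute_start_ms: int) -> Bar:
    """Combine the per-second bars that fall inside one minute into one bar."""
    window = sorted(
        (b for b in second_bars
         if minute_start_ms <= b.start_ms < minute_start_ms + 60_000),
        key=lambda b: b.start_ms,
    )
    if not window:
        raise ValueError("no per-second bars in the requested minute")
    return Bar(
        start_ms=minute_start_ms,
        open=window[0].open,                  # first bar's open
        high=max(b.high for b in window),     # highest high
        low=min(b.low for b in window),       # lowest low
        close=window[-1].close,               # last bar's close
        volume=sum(b.volume for b in window),
    )


def mismatched_fields(rolled: Bar, reference: Bar, tol: float = 1e-9) -> List[str]:
    """Names of the OHLC fields that differ by more than tol."""
    return [
        name for name in ("open", "high", "low", "close")
        if abs(getattr(rolled, name) - getattr(reference, name)) > tol
    ]
```

Calling `mismatched_fields(roll_up(bars, minute_start_ms), reference_bar)` on a captured minute lists the fields that disagree with the 1-minute bar from the REST or Websocket endpoint.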

Screenshots Here is an Excel file of the mismatch referenced in this bug: https://www.dropbox.com/scl/fi/8abyb1lp2xhb7sbj0ihmt/polygon-aggregates-error.xlsx?dl=0&rlkey=waiprqhfghjmwiuhv35rc5j5h

digitalml commented 2 years ago

I was asked to provide more info on this issue by Jack so I'll post that here for posterity.

This happens with A.* or with A.SYMBL and can be reproduced at any time.

I've created another data sample for today's runner, NTRB. It shows the A.* data for NTRB for a 1-minute period (6:55am PST) compared to the 1-minute REST / Websocket data. The OPEN, HIGH, and LOW values are all incorrect. The CLOSE matches.

I've noticed in all my data samples, though, that everything is off by about a penny. I'm not doing any rounding on my end; I'm simply outputting what comes through the socket. I would surmise that this is a floating-point rounding issue on the Aggregates (Per Second) Websocket.
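One way to gauge how large a "penny-ish" gap is relative to floating-point precision is to measure it both in exact cents and in units of double-precision spacing. The sketch below uses the IMTE highs quoted earlier in this thread; the helper names are hypothetical, and the result only sizes the gap, it does not identify its cause.

```python
import math
from decimal import Decimal


def gap_in_cents(streamed: str, reference: str) -> Decimal:
    """Exact difference (reference - streamed) in cents, computed on decimal strings."""
    return (Decimal(reference) - Decimal(streamed)) * 100


def gap_in_ulps(streamed: float, reference: float) -> float:
    """Gap measured in units of double-precision spacing at the reference price."""
    return abs(reference - streamed) / math.ulp(reference)


# Highs reported in this thread: per-second stream vs. 1-minute bar.
print(gap_in_cents("5.955", "5.965"))  # 1.000
```

A gap that spans a very large number of ulps is far bigger than the representation error of a single double, which would suggest looking at rounding to a fixed number of decimals or at differing input trades rather than at binary floating-point noise alone.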

https://www.dropbox.com/scl/fi/tzardkyz9k2wcnvzwa0jj/polygon-aggregates-error-NTRB-12-31.xlsx?dl=0&rlkey=agbfuzi90yhexlha1uf9g3s1d

jrbell19 commented 2 years ago

Thanks for posting the additional information. We are still digging into this, and I will circle back once I have some additional insight.

digitalml commented 2 years ago

It's been 2+ weeks on this now, so just wondering: has it been fixed, or is it close?

gyorgyszucs commented 1 year ago

Hello @jrbell19, I've found an issue that I think belongs in this thread, so let me post what I've found, although it concerns the difference between trades and aggregates:

On December 7, Netflix had a trade at 3:29:59 PM EST priced at $307.9288 that is not included in the corresponding 5-minute aggregate (see below). I've encountered this before, but in those cases the price difference was huge, so I counted them as price spikes and filtered them out accordingly. Here, however, the price difference is small, and the trade occurs 1 second before the next candlestick, so it looks like a bug to me.

Here are the details:

You can see it in this trade query (search for 1670444999091000000): https://api.polygon.io/v3/trades/NFLX?timestamp.gte=2022-12-07T20:29:00Z&order=asc&limit=50000&apiKey=b6yhLuQ56Opc4b7CIu8Exp3k1A33lIeC

"conditions":[37],
"exchange":4,
"id":"33341",
"participant_timestamp":1670444999091000000, --> December 7, 2022 3:29:59.091 PM EST
"price":307.9288,
"sequence_number":4356751,
"sip_timestamp":1670444999114454309, --> December 7, 2022 3:29:59.114 PM EST
"size":1,
"tape":3,
"trf_id":202,
"trf_timestamp":1670444999114424018

Aggregates query (search for 1670444700000): https://api.polygon.io/v2/aggs/ticker/NFLX/range/5/minute/2022-12-07/2022-12-08?adjusted=false&sort=asc&limit=50000&apiKey=b6yhLuQ56Opc4b7CIu8Exp3k1A33lIeC

"v":42144,
"vw":307.519,
"o":307.42,
"c":307.89,
"h":307.89,
"l":307.2,
"t":1670444700000, --> December 7, 2022 3:25:00.000 PM EST
"n":915

The "h" should be 307.9288. I hope this helps; I rely heavily on aggregates in my calculations, and having incorrect bars is worrying.
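The mapping from the trade's nanosecond timestamp to the start of its 5-minute bar can be checked with a short sketch. The helper names are hypothetical, and the fixed UTC-5 offset is an assumption that holds for this December date but not during daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset for Eastern Standard Time (assumption: no daylight saving in December).
EST = timezone(timedelta(hours=-5), "EST")


def bucket_start_ms(timestamp_ns: int, bucket_minutes: int = 5) -> int:
    """Floor a nanosecond timestamp to the start of its bar, in epoch milliseconds."""
    bucket_ms = bucket_minutes * 60_000
    ts_ms = timestamp_ns // 1_000_000
    return ts_ms - ts_ms % bucket_ms


def to_est(epoch_ms: int) -> str:
    """Format an epoch-millisecond timestamp as EST wall-clock time."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz=EST).strftime("%Y-%m-%d %H:%M:%S")


sip_timestamp = 1670444999114454309   # from the trade above
start = bucket_start_ms(sip_timestamp)
print(start)          # 1670444700000
print(to_est(start))  # 2022-12-07 15:25:00
```

Both the `participant_timestamp` and the `sip_timestamp` of the trade floor to `1670444700000`, the `t` of the bar quoted above, so the trade does fall inside that bar's time window.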

jcho-polygon commented 1 year ago

Hi @gyorgyszucs,

This is expected behavior. This was an odd lot trade (condition 37), which makes it ineligible to update the High price. You can use our Conditions API to check whether a trade's condition(s) make it eligible to update specific aggregate values: https://api.polygon.io/v3/reference/conditions?asset_class=stocks&id=37&apiKey=*

While there is no de facto ruleset for calculating aggregate values, we use the guidelines provided by the SIPs for our aggregate (OHLCV) values. These "trade eligibility" rules are outlined in our Conditions API, as well as our article here: https://polygon.io/blog/understanding-trade-eligibility/
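As a rough illustration of how eligibility filtering changes a bar's high, here is a minimal sketch. The lookup table and helper names are hypothetical, only condition 37 is taken from this thread, and the combining rule (a trade is eligible only if none of its conditions forbids the update) is an assumption rather than a statement of Polygon's implementation. A real table would be populated from the Conditions API response.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical lookup: condition id -> may a trade with this condition update high/low?
# Only condition 37 (odd lot) comes from this thread.
UPDATES_HIGH_LOW: Dict[int, bool] = {37: False}


def can_update_high_low(conditions: Iterable[int],
                        rules: Dict[int, bool] = UPDATES_HIGH_LOW) -> bool:
    """Assumed rule: eligible only if no condition forbids the update.

    Condition ids missing from the table are treated as permitting it,
    which is also an assumption.
    """
    return all(rules.get(c, True) for c in conditions)


def eligible_high(trades: Iterable[Tuple[float, List[int]]],
                  rules: Dict[int, bool] = UPDATES_HIGH_LOW) -> Optional[float]:
    """Highest price among trades allowed to update the high, or None if there are none."""
    prices = [price for price, conds in trades if can_update_high_low(conds, rules)]
    return max(prices) if prices else None


# Illustrative input: a regular trade at the bar's reported high and the odd lot trade.
trades = [(307.89, []), (307.9288, [37])]
print(eligible_high(trades))  # 307.89
```

With the odd lot trade excluded, the high stays at 307.89, matching the `h` value in the aggregate quoted earlier.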

gyorgyszucs commented 1 year ago

Thank you @jcho-polygon! 🎉🎉🎉