The Docs explain the difference between the daily volume reported by CME (and Yahoo) and the volume in the AlgoSeek data:
During each trading session, CME reports the daily volume for the previous trading session.
This event is present in our datasets Futures TAQ and Futures Trade Only with event type 'TRADE VOLUME'.
This means that the data contains some lines with this information. E.g.:
20230925132650085,629202,710431,"CL","CL","CLX3",42,,0,402087,0,0,0,TRADE VOLUME FINAL
This line is not a tick, since it doesn't represent a trade, quote, or open interest.
However, the value reported by CME takes into account pit trades, which are not present in the feed.
Because pit trades are not included in the dataset, you cannot reproduce CME's daily volume by summing the trade events in the data.
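To make the distinction concrete, here is a minimal sketch of how the volume report lines could be separated from ticks. The column positions are assumptions inferred only from the sample line above (last field as the event type, tenth field as the reported volume), not from a published schema, and the helper names are hypothetical.

```python
import csv
import io

SAMPLE = '20230925132650085,629202,710431,"CL","CL","CLX3",42,,0,402087,0,0,0,TRADE VOLUME FINAL'

# Assumed positions, inferred from the sample line only.
EVENT_TYPE_COL = -1  # last field appears to hold the event type
VOLUME_COL = 9       # assumed to hold the reported daily volume (402087 in the sample)


def parse_line(line):
    """Split one CSV line into its fields, honoring quoted values."""
    return next(csv.reader(io.StringIO(line)))


def is_volume_report(fields):
    """True for 'TRADE VOLUME' events, which are reports rather than ticks."""
    return fields[EVENT_TYPE_COL].strip().startswith("TRADE VOLUME")


def split_ticks(lines):
    """Return (ticks, reports) so that reports never enter a volume sum."""
    ticks, reports = [], []
    for line in lines:
        fields = parse_line(line)
        (reports if is_volume_report(fields) else ticks).append(fields)
    return ticks, reports


fields = parse_line(SAMPLE)
reported_volume = int(fields[VOLUME_COL])
```

Keeping the reports in their own list makes it easy to compare the CME figure against the volume summed from the feed without mixing the two.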
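Also, according to the CME specification, the trading session starts at 5 p.m. CT and ends at 4 p.m. CT on the following day.
So, to calculate the daily volume for the session that ends on the 26th, you should sum the quantities between 10 p.m. UTC on the 25th and 9 p.m. UTC on the 26th. These UTC times hold while CT is UTC-5 (daylight saving time); during standard time the window shifts one hour later.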
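The session window described above can be sketched as follows. This is an illustration under stated assumptions rather than AlgoSeek's or LEAN's implementation: the function names are hypothetical, trades are assumed to arrive as `(utc_timestamp, quantity)` pairs, and weekends and holidays are ignored.

```python
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo

CT = ZoneInfo("America/Chicago")  # CME local time, DST-aware


def session_window_utc(session_end_date):
    """Return (start, end) in UTC for the session ending at 4 p.m. CT on session_end_date."""
    start_local = datetime.combine(
        session_end_date - timedelta(days=1), time(17, 0), tzinfo=CT
    )
    end_local = datetime.combine(session_end_date, time(16, 0), tzinfo=CT)
    return start_local.astimezone(timezone.utc), end_local.astimezone(timezone.utc)


def session_volume(trades, session_end_date):
    """Sum quantities of (utc_timestamp, quantity) trades that fall inside the session."""
    start, end = session_window_utc(session_end_date)
    return sum(qty for ts, qty in trades if start <= ts < end)


start, end = session_window_utc(date(2023, 9, 26))
print(start.isoformat(), end.isoformat())
# 2023-09-25T22:00:00+00:00 2023-09-26T21:00:00+00:00
```

Deriving the window from the `America/Chicago` zone instead of hard-coding UTC hours keeps the calculation correct when daylight saving time changes.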
The focus is the feed volume versus the pit volume. Note that pit volume is over-the-counter, so it shouldn't be considered in backtests, because algorithms cannot trade it.
Actual Behavior
This difference is undocumented, so members think this is a data issue.
Checklist
[x] I have completely filled out this template
[x] I have confirmed that this issue exists on the current master branch
[x] I have confirmed that this is not a duplicate issue by searching issues
Expected Behavior
We can add this information to the US Futures Dataset page, and code gen will then carry it over to the Docs' US Futures Dataset page.