Each timestep in the LOB is one second, counted from market open at 08:00 to close at 16:30 (London Stock Exchange). The maximum value is about 30600 s, which equates to 8.5 hours, the length of the trading session.
The LOB timestamp is not the time an order was placed but the time the book was updated. Each record contains all of the orders present in the book at that point.
Each LOB file covers a different day, so when we clean the data we need to attach the corresponding date to each timestamp and convert the seconds offset into a time of day.
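A minimal sketch of that conversion, assuming the timestep really is seconds since the 08:00 open and that the trading day is known separately (for example from the file name, whose format is not recorded in these notes). The function name `lob_time_to_datetime` and the constants are made up for illustration, and the result is a naive datetime with no time zone attached.

```python
from datetime import date, datetime, time, timedelta

MARKET_OPEN = time(8, 0)   # LSE open, as stated in the notes
SESSION_SECONDS = 30600    # 8.5 hours, 08:00 to 16:30


def lob_time_to_datetime(trading_day: date, seconds_since_open: float) -> datetime:
    """Turn a LOB timestep (seconds since open) into a wall-clock datetime."""
    # Reject values outside the session so bad rows surface during cleaning.
    if not 0 <= seconds_since_open <= SESSION_SECONDS:
        raise ValueError(f"timestep {seconds_since_open} is outside the trading session")
    open_dt = datetime.combine(trading_day, MARKET_OPEN)
    return open_dt + timedelta(seconds=seconds_since_open)
```

For example, with a placeholder date, `lob_time_to_datetime(date(2025, 1, 2), 30600)` lands on 16:30 of that day. Whether out-of-range timesteps should raise or just be flagged is a cleaning decision still to be made.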
Do we need to add a timestamp for each order, as well as the time of the LOB update?
Identify the spread (best ask minus best bid) at each timestamp.
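A rough sketch of the spread calculation for one book snapshot. It assumes each side of the book can be read as a list of (price, quantity) pairs, which is a guess at the layout rather than something confirmed from the files; the name `spread` and the `Level` alias are illustrative only.

```python
from typing import Optional, Sequence, Tuple

Level = Tuple[float, int]  # (price, quantity); assumed layout, not confirmed


def spread(bids: Sequence[Level], asks: Sequence[Level]) -> Optional[float]:
    """Best ask minus best bid, or None if either side of the book is empty."""
    if not bids or not asks:
        return None
    best_bid = max(price for price, _ in bids)  # highest price a buyer offers
    best_ask = min(price for price, _ in asks)  # lowest price a seller accepts
    return best_ask - best_bid
```

Returning `None` for a one-sided book keeps those timestamps visible in the output instead of silently dropping them; early or late in the session this case may well occur.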
Should we separate the orders and rank them at each timestamp, so that each order carries the date, an order book timestamp, an order timestamp, an order level (ranking), bid/ask, quantity (is this in multiples of 100?), price, exchange and bidID?
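One possible shape for that flattened table, as a sketch only. It again assumes a snapshot gives bids and asks as (price, quantity) pairs; `OrderRow` and `flatten_snapshot` are hypothetical names. The order timestamp, exchange and bidID fields are left out here because it is not yet clear from the notes whether the raw data contains them.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Sequence, Tuple

Level = Tuple[float, int]  # (price, quantity); assumed layout, not confirmed


@dataclass
class OrderRow:
    day: date
    book_time: float  # seconds since open, i.e. the LOB update time
    side: str         # "bid" or "ask"
    level: int        # 1 = best price on that side
    price: float
    quantity: int


def flatten_snapshot(day: date, book_time: float,
                     bids: Sequence[Level], asks: Sequence[Level]) -> List[OrderRow]:
    """Turn one book snapshot into one ranked row per order."""
    ranked_sides = (
        ("bid", sorted(bids, key=lambda lv: lv[0], reverse=True)),  # highest bid first
        ("ask", sorted(asks, key=lambda lv: lv[0])),                # lowest ask first
    )
    rows: List[OrderRow] = []
    for side, levels in ranked_sides:
        for rank, (price, qty) in enumerate(levels, start=1):
            rows.append(OrderRow(day, book_time, side, rank, price, qty))
    return rows
```

Ranking by price alone means two orders at the same price get consecutive levels in input order; if levels should instead aggregate quantity per price, the grouping step would need to change.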
My Notes on the Data and Ideas from Notes in Confluence
https://dsmp.atlassian.net/wiki/spaces/DSMP/pages/1638426