Closed Cally99 closed 2 years ago
- Join the betfairlightweight Slack group here
- Ask @liam on the Slack to get your live key activated for free
- Adapt the betfairlightweight example using the instructions below

betfairutil fits into an ecosystem of Python Betfair packages that includes betfairlightweight - a Betfair API implementation written in Python - and flumine - a fully featured trading platform built on top of betfairlightweight that can be used for live trading, back testing of strategies, and recording market data from Betfair. To clarify, I'm not the author of either of these, just a dedicated user of betfairlightweight.
If you're not familiar with these packages I strongly suggest you take a look at those repositories, especially the examples. This one from the betfairlightweight repository is very close to what you are trying to achieve with regard to getting Premier League data into a data frame, except it:
There is also documentation for both packages here and here
Finally, an incredible resource is the betfairlightweight Slack which you can join using this link. Here we discuss all things automated betting, obviously with a heavy focus on betfairlightweight, flumine and Python, but topics of discussion also include other betting platforms, cloud service providers, strategy development, etc. Betfair employees also occasionally participate.
Here I'm referring to data purchased from the Betfair historic data service: https://historicdata.betfair.com. Some data is available for free. It is my strong recommendation that you only work with the PRO data. I realise it's costly (aside from the small number of months that are available for free), so standard practice if you're just getting started is to record your own data using flumine. Bear in mind Betfair will frown upon this if you are not placing any bets.
Assuming you've downloaded the market files - i.e. file type "M" on the historic data website - where each file contains data for a single market, then you can read the entire thing into one (large) data frame very simply:
from betfairutil import prices_file_to_data_frame
df = prices_file_to_data_frame(path_to_prices_file)
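To give a feel for what "one row per update" means, here is a minimal sketch that flattens a couple of raw lines by hand using only the standard library. It assumes the prices file holds one JSON stream message per line, with pt for publish time, mc for market changes and rc for runner changes. The two lines and the flatten helper are made up for illustration, and this is not how betfairutil does it: the real function rebuilds the full market book from the incremental updates.

```python
import json

# Two made-up lines in the style of the Betfair stream format (assumed here):
# "pt" = publish time in ms since the Unix epoch, "mc" = market changes,
# "rc" = runner changes, "ltp" = last traded price.
raw_lines = [
    '{"op": "mcm", "pt": 1609459200000, "mc": [{"id": "1.100000001", "rc": [{"id": 48351, "ltp": 2.5}]}]}',
    '{"op": "mcm", "pt": 1609459201000, "mc": [{"id": "1.100000001", "rc": [{"id": 48351, "ltp": 2.52}]}]}',
]


def flatten(lines):
    """Hypothetical helper: turn each runner change into one flat row."""
    rows = []
    for line in lines:
        message = json.loads(line)
        for market_change in message.get("mc", []):
            for runner_change in market_change.get("rc", []):
                rows.append(
                    {
                        "publish_time": message["pt"],
                        "market_id": market_change["id"],
                        "selection_id": runner_change["id"],
                        "last_price_traded": runner_change.get("ltp"),
                    }
                )
    return rows


rows = flatten(raw_lines)
for row in rows:
    print(row)
```

A list of flat dicts like this is the kind of structure that loads directly into a data frame, which is why the result for a whole market gets large quickly.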
The prices_file_to_data_frame function takes a number of arguments that can be used to configure exactly what columns are included in the data frame:

- should_output_runner_names: by default the data frame won't include the names of the selections, only their IDs. For example, 48351 instead of Man Utd. Set this parameter to True if you want the names
- should_format_publish_time: the publish time is the timestamp associated with the streaming update from Betfair. In the raw data it's an integer number of milliseconds since the Unix epoch. If you set this parameter to True then these timestamps will get converted to a nice human readable format
- max_depth: as noted, the data frame for a market across its entire lifetime can be very large. One way of limiting the size is to impose a restriction on the maximum depth. For example, if you set this parameter to 3 then the data frame will only contain the 3 best back and lay prices at each timestamp

The rest of the parameters will be discussed in due course where relevant.
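As a rough illustration of what the publish time formatting and the depth limit amount to, here is a standard-library sketch. The helper names format_publish_time and limit_depth and the example ladder are hypothetical; they only mimic the idea and are not betfairutil's implementation.

```python
from datetime import datetime, timezone


def format_publish_time(publish_time_ms):
    """Convert milliseconds since the Unix epoch into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(publish_time_ms / 1000, tz=timezone.utc)


def limit_depth(ladder, max_depth):
    """Keep only the first max_depth levels of a ladder that is already sorted best-first."""
    return ladder[:max_depth]


# Made-up back ladder as (price, size) pairs, best (highest) back price first
back_ladder = [(2.5, 120.0), (2.48, 310.5), (2.46, 95.0), (2.44, 40.0), (2.42, 12.0)]

formatted = format_publish_time(1609459200000)
top_three = limit_depth(back_ladder, 3)

print(formatted.isoformat())
print(top_three)
```

Trimming five levels to three here is a small saving, but on a full market lifetime with deep ladders on every update it cuts the number of columns considerably.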
The Betfair historic data can also come in event-level files - i.e. file type "E" - where each file contains data for all markets corresponding to a single event (e.g. a football match, horse race meeting, etc.). You can do exactly the same as above:
from betfairutil import prices_file_to_data_frame
df = prices_file_to_data_frame(path_to_prices_file)
and read the entire event into one data frame. This will be even larger than before, so there are a few other parameters to prices_file_to_data_frame that might be useful here:

- should_output_market_types: you can set this to True to include a column in the data frame that indicates the market type. In this case you'll almost certainly want to do this so you can identify whether a price comes from the match odds, correct score, O/U 2.5 goals etc. markets
- market_type_filter: this parameter lets you filter out market types you aren't interested in. For example, you might only want match odds and correct score data and discard everything else in the file

For convenience, betfairutil includes a function prices_file_to_csv_file which is just a wrapper around prices_file_to_data_frame.
An alternative to reading an entire prices file into one data frame is to convert the market book at each timestamp into a data frame:
from betfairutil import market_book_to_data_frame
from betfairutil import read_prices_file
market_books = read_prices_file(path_to_prices_file)
for market_book in market_books:
    df = market_book_to_data_frame(market_book)
    # Do something with df here
If you've recorded data yourself using flumine then those prices files are essentially interchangeable with the ones provided by the Betfair historic data service, so all of the above code will work almost seamlessly. The caveat is that runner names are not present in data that has been recorded live. prices_file_to_data_frame therefore also takes a parameter market_catalogues which lets you pass one or more saved market catalogues that include the mapping from selection ID to runner name.
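The mapping a market catalogue provides can be sketched with the standard library alone. This assumes the catalogue was saved as JSON using Betfair's usual field names (runners, selectionId, runnerName). The catalogue content and the helper name are made up for illustration, and how betfairutil consumes the market_catalogues argument internally may well differ.

```python
import json

# Hypothetical saved market catalogue, trimmed to the fields used here
market_catalogue_json = """
{
  "marketId": "1.100000001",
  "marketName": "Match Odds",
  "runners": [
    {"selectionId": 48351, "runnerName": "Man Utd"},
    {"selectionId": 58805, "runnerName": "The Draw"}
  ]
}
"""


def selection_id_to_runner_name(market_catalogue):
    """Hypothetical helper: build a {selection ID: runner name} lookup."""
    return {
        runner["selectionId"]: runner["runnerName"]
        for runner in market_catalogue.get("runners", [])
    }


market_catalogue = json.loads(market_catalogue_json)
runner_names = selection_id_to_runner_name(market_catalogue)
print(runner_names)
```

With a lookup like this, a selection ID column from live-recorded data can be translated into names after the fact.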
As above, the best starting point for consuming live data is the betfairlightweight example available here.
To change this from GB horse racing to football, modify the market_filter to:
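from betfairlightweight.filters import streaming_market_filter

# Event type ID "1" is football (soccer) on Betfair
market_filter = streaming_market_filter(
    event_type_ids=["1"], market_types=["MATCH_ODDS"]
)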
Note that it is not possible to filter by competition so you cannot restrict only to Premier League. Best practice is to use a coarse filter like above then locally filter out markets which aren't of interest.
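One way to do the local filtering is to keep a set of event IDs you care about and drop every market book that belongs to anything else. The sketch below uses plain dicts shaped like raw stream data (marketDefinition with an eventId) as stand-ins for real market books. The IDs are made up, the helper is hypothetical, and how you build the set of wanted event IDs for a given competition is left out.

```python
def filter_market_books(market_books, wanted_event_ids):
    """Hypothetical helper: keep only market books whose event ID is in wanted_event_ids."""
    return [
        market_book
        for market_book in market_books
        if market_book["marketDefinition"]["eventId"] in wanted_event_ids
    ]


# Made-up market books standing in for what the coarse football filter returns
market_books = [
    {"marketId": "1.100000001", "marketDefinition": {"eventId": "30000001", "marketType": "MATCH_ODDS"}},
    {"marketId": "1.100000002", "marketDefinition": {"eventId": "30000002", "marketType": "MATCH_ODDS"}},
    {"marketId": "1.100000003", "marketDefinition": {"eventId": "30000003", "marketType": "MATCH_ODDS"}},
]

# Assumed to have been looked up separately, e.g. the events in one competition
wanted_event_ids = {"30000001", "30000003"}

kept = filter_market_books(market_books, wanted_event_ids)
print([market_book["marketId"] for market_book in kept])
```

Using a set keeps the membership check cheap, which matters when the coarse filter returns every football match odds market.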
Now in the loop that handles the market books, simply use the market_book_to_data_frame function as above:
# check for updates in output queue
while True:
    market_books = output_queue.get()
    for market_book in market_books:
        df = market_book_to_data_frame(market_book)
        # Do something with data frame here
Hi there,
I stumbled across this repo and I'm interested in getting betfair pricing data to a pandas dataframe. I'm having limitations using delayed key and would like to use streaming.
Could you upload a simple example for instance how to get premier league match data in a dataframe via streaming? Or any league for that matter?
Thanks