gerrymanoim / exchange_calendars

Calendars for various securities exchanges.
Apache License 2.0

First session is always 20 years in the past #405

Open pjmcdermott opened 1 month ago

pjmcdermott commented 1 month ago

Today:


Python 3.10.12 (main, Mar 22 2024, 16:50:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import exchange_calendars
>>> xlon=exchange_calendars.get_calendar("XLON")
>>> xlon.first_session
Timestamp('2004-07-22 00:00:00')

But when I did this yesterday I got:

>>> xlon.first_session
Timestamp('2004-07-21 00:00:00')

First session appears to always be 20 years in the past.

I can understand (and my assumption was) that there might be a constraint on first_session based on the available data (and that this may vary by exchange). But this response suggests that first_session is always driven by an arbitrary 20-year window.

This is an issue for anyone doing longer-run analysis/research. It also suggests that this package isn't making the most of the historical data that it does have available.

The definition of first_session in exchange_calendar.py:

    @property
    def first_session(self) -> pd.Timestamp:
        """First calendar session."""
        return self.sessions[0]

suggests that self.sessions is a list of timestamps. While there may be an argument for not generating too large a list, I feel that this could be left under the user's control by allowing them to specify an overall session start date (or a start and end date, or a window duration) before the sessions list is built.

Alternatively, first_session could return a computed first session date based on the available data per exchange, rather than returning the value of a pre-computed list.

gnzsnz commented 1 month ago

This is causing problems for Cboe VX futures, to give a concrete example. The data series starts in May 2004, but with this limitation I can only use August 2004 onwards. Every month, another month of historical data stops working.

gnzsnz commented 1 month ago

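I went through the code and I found a solution:

import exchange_calendars as xcals
cal = xcals.get_calendar(EXCHANGE, start=CAL_START)  # EXCHANGE and CAL_START are placeholders

The default start is today minus 20 years and the default end is today plus 1 year. You can tune both using the start and end parameters of get_calendar.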
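
To illustrate why first_session moves forward by one day each day, here is a minimal sketch of a rolling default window. It is not the library's code: the function name `default_window` and its parameters are hypothetical, and it only assumes the behaviour described above (start is today minus 20 years, end is today plus 1 year). The two example dates are chosen so that the results line up with the two outputs in the original report; the actual days on which those were run are an assumption.

```python
from datetime import date


def default_window(today: date, years_back: int = 20, years_forward: int = 1):
    """Return (start, end) for a window that rolls with `today`.

    Hypothetical helper for illustration only; not part of exchange_calendars.
    """

    def shift_years(d: date, years: int) -> date:
        try:
            return d.replace(year=d.year + years)
        except ValueError:
            # 29 February has no counterpart in a non-leap year; use 28 February.
            return d.replace(year=d.year + years, day=28)

    return shift_years(today, -years_back), shift_years(today, years_forward)


# Running on consecutive days gives consecutive start dates.
start_day_one, _ = default_window(date(2024, 7, 21))
start_day_two, _ = default_window(date(2024, 7, 22))
print(start_day_one, start_day_two)
```

Because the window is anchored to the current day, any analysis that relies on the default start will silently lose its earliest session as time passes, which is why passing an explicit start is the safer choice for long-run research.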

maread99 commented 1 month ago

Hi @pjmcdermott

As you've noticed, by default the calendars that are not hard-coded have a start date of twenty years prior to the current day. However, the calendar start date can be set by simply passing the "start" option. For example:

import exchange_calendars as xcals
xlon = xcals.get_calendar("XLON", start="2000-01-01")
xlon.first_session

returns...

Timestamp('2000-01-04 00:00:00')

The end date can be similarly set by passing the "end" option.

It should be noted that the likelihood of inaccuracies will increase the earlier the start date, simply because ad-hoc holidays and changes to holidays that far back may not have been coded in. Some calendars are more complete than others in this respect - to see the earliest date from which a calendar has been actively defined it's a matter of inspecting the calendar's file (for example, exchange_calendar_xlon.py). If you find something's missing for a calendar that you're using then it's simply a matter of offering a PR with any required changes.

Cheers, Marcus

pjmcdermott commented 1 month ago

Hi @maread99,

Thanks for your explanation: that mechanism is similar to what I was suggesting and is quite useful.

I will go with an early date and look out for any discrepancies.

Regards, Paul

pjmcdermott commented 1 month ago

One follow-up observation:

Some calendars seem to have a 'hard' start date, e.g. XTKS below:

>>> import exchange_calendars
>>> xlon=exchange_calendars.get_calendar("XLON", start="1980-01-01")
>>> xtks=exchange_calendars.get_calendar("XTKS", start="1980-01-01")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/paul/dev/python/Trading/.direnv/python-3.10.12/lib/python3.10/site-packages/exchange_calendars/calendar_utils.py", line 300, in get_calendar
    return cached if cached is not None else self._fabricate(name, **kwargs)
  File "/home/paul/dev/python/Trading/.direnv/python-3.10.12/lib/python3.10/site-packages/exchange_calendars/calendar_utils.py", line 193, in _fabricate
    calendar = factory(**kwargs)
  File "/home/paul/dev/python/Trading/.direnv/python-3.10.12/lib/python3.10/site-packages/exchange_calendars/exchange_calendar.py", line 309, in __init__
    raise ValueError(self._bound_min_error_msg(start))
ValueError: The earliest date from which calendar XTKS can be evaluated is 1997-01-01 00:00:00, although received `start` as 1980-01-01 00:00:00.

It is possible to get this 'hard' start date, but one needs to create a calendar object first, so in order to get the longest range possible I have to do:

>>> import exchange_calendars
>>> xtks=exchange_calendars.get_calendar("XTKS")
>>> start_date=xtks.bound_min().date().isoformat()
>>> start_date
'1997-01-01'
>>> xtks=exchange_calendars.get_calendar("XTKS", start=start_date)

Thus creating an object instance twice.

It would be better if the limits to the start dates were available before creating a calendar object instance.
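
As a rough sketch of what that could look like: if the lower bound were readable from the calendar class itself, the start date could be clamped before any calendar is built. Everything below is hypothetical. The two `Fake...` classes are stand-ins, not exchange_calendars classes, and `earliest_start` is an illustrative helper. Whether the real `bound_min` can be called without an instance is an assumption that this thread does not confirm; the 1997-01-01 value is taken from the XTKS error message above.

```python
from datetime import date
from typing import Optional


class FakeBoundedCalendar:
    """Hypothetical stand-in for a calendar class with a hard lower bound."""

    @classmethod
    def bound_min(cls) -> Optional[date]:
        return date(1997, 1, 1)


class FakeUnboundedCalendar:
    """Hypothetical stand-in for a calendar class with no lower bound."""

    @classmethod
    def bound_min(cls) -> Optional[date]:
        return None


def earliest_start(calendar_cls, requested: date) -> date:
    """Return `requested`, or the class's lower bound if that is later."""
    bound = calendar_cls.bound_min()
    return requested if bound is None else max(requested, bound)


# The bounded class clamps the request; the unbounded one passes it through.
print(earliest_start(FakeBoundedCalendar, date(1980, 1, 1)))
print(earliest_start(FakeUnboundedCalendar, date(1980, 1, 1)))
```

With something along these lines, the clamped date could be passed straight to get_calendar as start, so the calendar would only need to be constructed once.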