matplotlib / mplfinance

Financial Markets Data Visualization using Matplotlib
https://pypi.org/project/mplfinance/

Adding ConciseDateFormatter displays dates from 1970 #643

Closed BennyThadikaran closed 10 months ago

BennyThadikaran commented 10 months ago

I am trying to add ConciseDateFormatter to an mplfinance chart. The code below runs, but the dates display incorrectly: it shows dates from 1970.

I tried the same thing in matplotlib and the dates display correctly. (See commented code).

I'm not sure how to get this working.

Python 3.10.12 (Linux Mint 21.2), matplotlib 3.7.2, mplfinance 0.12.10b0

import matplotlib.pyplot as plt
from mplfinance import plot, show
from matplotlib.dates import AutoDateLocator, ConciseDateFormatter
from pandas import DataFrame, to_datetime
from json import loads

data = loads('{"Open":{"2023-07-24T00:00:00.000":1678.5,"2023-07-25T00:00:00.000":1684.65,"2023-07-26T00:00:00.000":1699.6,"2023-07-27T00:00:00.000":1699.9,"2023-07-28T00:00:00.000":1661.5,"2023-07-31T00:00:00.000":1650.05,"2023-08-01T00:00:00.000":1654.45,"2023-08-02T00:00:00.000":1642.0,"2023-08-03T00:00:00.000":1640.0,"2023-08-04T00:00:00.000":1635.15,"2023-08-07T00:00:00.000":1663.1,"2023-08-08T00:00:00.000":1651.7,"2023-08-09T00:00:00.000":1653.0},"High":{"2023-07-24T00:00:00.000":1684.65,"2023-07-25T00:00:00.000":1699.0,"2023-07-26T00:00:00.000":1699.6,"2023-07-27T00:00:00.000":1703.0,"2023-07-28T00:00:00.000":1668.9,"2023-07-31T00:00:00.000":1656.8,"2023-08-01T00:00:00.000":1667.45,"2023-08-02T00:00:00.000":1651.5,"2023-08-03T00:00:00.000":1651.35,"2023-08-04T00:00:00.000":1656.5,"2023-08-07T00:00:00.000":1663.1,"2023-08-08T00:00:00.000":1655.6,"2023-08-09T00:00:00.000":1654.5},"Low":{"2023-07-24T00:00:00.000":1670.1,"2023-07-25T00:00:00.000":1678.4,"2023-07-26T00:00:00.000":1688.0,"2023-07-27T00:00:00.000":1667.45,"2023-07-28T00:00:00.000":1641.1,"2023-07-31T00:00:00.000":1638.7,"2023-08-01T00:00:00.000":1650.0,"2023-08-02T00:00:00.000":1633.15,"2023-08-03T00:00:00.000":1623.0,"2023-08-04T00:00:00.000":1629.25,"2023-08-07T00:00:00.000":1647.55,"2023-08-08T00:00:00.000":1642.05,"2023-08-09T00:00:00.000":1631.1},"Close":{"2023-07-24T00:00:00.000":1678.4,"2023-07-25T00:00:00.000":1696.6,"2023-07-26T00:00:00.000":1690.7,"2023-07-27T00:00:00.000":1673.15,"2023-07-28T00:00:00.000":1643.5,"2023-07-31T00:00:00.000":1651.2,"2023-08-01T00:00:00.000":1662.25,"2023-08-02T00:00:00.000":1640.5,"2023-08-03T00:00:00.000":1628.65,"2023-08-04T00:00:00.000":1652.2,"2023-08-07T00:00:00.000":1651.25,"2023-08-08T00:00:00.000":1649.9,"2023-08-09T00:00:00.000":1650.5},"Volume":{"2023-07-24T00:00:00.000":16089722.0,"2023-07-25T00:00:00.000":27996298.0,"2023-07-26T00:00:00.000":12397179.0,"2023-07-27T00:00:00.000":29870651.0,"2023-07-28T00:00:00.000":20507842.0,"2023-07-31T00:00:00.000":17282503.0,"2023-08-01T00:00:00.000":17697094.0,"2023-08-02T00:00:00.000":14058161.0,"2023-08-03T00:00:00.000":28836973.0,"2023-08-04T00:00:00.000":18694152.0,"2023-08-07T00:00:00.000":14150459.0,"2023-08-08T00:00:00.000":21886914.0,"2023-08-09T00:00:00.000":16680618.0}}')

df = DataFrame(data)
df.index.name = 'Date'
df.index = to_datetime(df.index)

locator = AutoDateLocator(minticks=12, maxticks=30)
formatter = ConciseDateFormatter(locator)

# Code for Matplotlib
# ax = plt.subplot()

# ax.xaxis.set_major_locator(locator)
# ax.xaxis.set_major_formatter(formatter)

# plt.plot(df.index, df['Close'])

# plt.show()

# Code for Mplfinance

fig, ax = plot(df, type='candle', style='tradingview',
               figscale=2, returnfig=True)

ax[1].xaxis.set_major_locator(locator)
ax[1].xaxis.set_major_formatter(formatter)

show()

[images: mplfinance, matplotlib]

DanielGoldfarb commented 10 months ago

Try setting show_nontrading=True when calling mpf.plot().

See Mplfinance Time Axis Concerns for more information.

If you need to leave show_nontrading=False (the default value when unspecified), it is likely that you can get the date format you want even without ConciseDateFormatter. Try using the mpf.plot() kwarg datetime_format=. You can set this kwarg to any valid strftime() style format string. This may be a simple way to accomplish what you are trying to do, and it does not require returnfig=True or any interaction with the Axes object(s).

If you need show_nontrading=False and you still insist on having the ConciseDateFormatter, then you may have to write your own date formatter that first translates from row numbers to actual matplotlib dates, and then passes those dates to the ConciseDateFormatter. If you need guidance on this, let me know.
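As a rough illustration of the strftime() style strings that datetime_format accepts, the sketch below previews a few candidate formats on one date using only the standard library. The list of formats is an arbitrary choice for the example, and the mpf.plot() call is shown only as a comment.

```python
from datetime import datetime

# A few strftime-style format strings to compare (arbitrary examples).
candidates = ["%d %b", "%b %d, %Y", "%Y-%m-%d"]
sample = datetime(2023, 7, 24)

for fmt in candidates:
    # Preview what each tick label would look like for the sample date.
    print(f"{fmt!r:>12} -> {sample.strftime(fmt)}")

# The chosen string is then passed to mplfinance, for example:
# mpf.plot(df, type="candle", datetime_format="%d %b", xrotation=0)
```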

BennyThadikaran commented 10 months ago

show_nontrading is set to False by default. I tried explicitly setting it but it doesn't make any difference. I am already using datetime_format in my repo code; it's not the same as the ConciseDateFormatter.

But your explanation and the link you provided helped me understand the problem. The AutoDateLocator is using the row numbers and treating them as dates counted from the Unix epoch. That explains the dates from 1970.

So should I implement my own AutoDateLocator? Looking at the source code, I have to inherit from the DateLocator class. Am I heading in the right direction?
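A minimal sketch of why row numbers turn into 1970 dates: Matplotlib stores dates as float day counts from an epoch, which defaults to 1970-01-01 since matplotlib 3.3. The thirteen row numbers here stand in for the thirteen candles in the sample data, and the sketch assumes the epoch has not been changed.

```python
import matplotlib.dates as mdates

# With show_nontrading=False the x-axis holds row numbers 0..N-1, not dates.
row_numbers = range(13)

# A date locator or formatter reads each row number as "days since the epoch".
as_dates = [mdates.num2date(n) for n in row_numbers]

print(mdates.get_epoch())
print(as_dates[0].date(), "to", as_dates[-1].date())
```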

Thank you for taking the time to answer my question. :smile:

DanielGoldfarb commented 10 months ago

Setting show_nontrading=True should definitely make a difference.

Try also (in addition to show_nontrading=True) using ax[0], or both ax[0] and ax[1], when setting the locator and formatter:

# try this first:
ax[0].xaxis.set_major_locator(locator)
ax[0].xaxis.set_major_formatter(formatter)

or

# or maybe this:
ax[0].xaxis.set_major_locator(locator)
ax[0].xaxis.set_major_formatter(formatter)
ax[1].xaxis.set_major_locator(locator)
ax[1].xaxis.set_major_formatter(formatter)

Regarding the datetime_format= kwarg: if you are calling .set_major_formatter(), then datetime_format may be ignored. I am not completely sure; I would have to check the code.

When I have a little more time (perhaps later today) I may try playing with the code myself; and can also look into the details of writing your own locator and/or formatter and get back to you on that.

BennyThadikaran commented 10 months ago

show_nontrading=True works in both of those conditions you mentioned. But it makes the chart rather ugly :smile:

You can see the chart image from my project. I have set a datetime_format and rotation on the date. While I'm satisfied with the output, the ConciseDateFormatter would make better use of the real estate on the chart.

I've been going through the source code. The crux of AutoDateLocator is the dunder call method and the get_locator method. I can just play with the outputs. Once I learn enough, I should be able to implement a class to make this work. This isn't urgent, just an aesthetic change.

If it's OK with you, I can close this issue for the time being and post a solution once I have one.

DanielGoldfarb commented 10 months ago

The following changes to your code should work. These changes utilize the fact that the x-axis is row numbers under the hood, and we translate those row numbers to datetimes before passing them to the ConciseDateFormatter. In summary:

  1. AutoDateLocator is no longer needed. Use MaxNLocator instead.
  2. import date2num (to convert python datetimes to matplotlib datetimes)
  3. import MaxNLocator
  4. use MaxNLocator (instead of AutoDateLocator)
  5. Define your own formatter class, derived from ConciseDateFormatter. This new formatter takes the list of datetimes upon construction, to be used later to convert row numbers to datetimes.
  6. Override the format_ticks() method of ConciseDateFormatter to first convert the row numbers to matplotlib dates, and then pass them to the parent ConciseDateFormatter.format_ticks() method.
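The steps above can be sketched roughly as follows. This is a reconstruction from the description only, not the code that was originally posted: the class name RowNumberConciseFormatter, the rounding and clamping of tick positions, and the MaxNLocator settings are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from matplotlib.dates import ConciseDateFormatter, date2num
from matplotlib.ticker import MaxNLocator


class RowNumberConciseFormatter(ConciseDateFormatter):
    """Hypothetical formatter: converts row numbers to dates, then formats."""

    def __init__(self, locator, dates, **kwargs):
        super().__init__(locator, **kwargs)
        # One matplotlib date number per DataFrame row.
        self._row_dates = date2num(pd.DatetimeIndex(dates).to_pydatetime())

    def format_ticks(self, values):
        # Round each tick to the nearest row and clamp it into range
        # (an assumption; ticks outside the data reuse the edge dates).
        rows = np.clip(np.rint(values).astype(int), 0, len(self._row_dates) - 1)
        # Hand real dates to the parent so it can label them concisely.
        return super().format_ticks(self._row_dates[rows])


# Small check on made-up business days, without drawing a chart.
dates = pd.bdate_range("2023-07-24", periods=13)
locator = MaxNLocator(nbins=8, integer=True)
formatter = RowNumberConciseFormatter(locator, dates)
labels = formatter.format_ticks([0, 4, 8, 12])
print(labels, formatter.get_offset())
```

To use it with mplfinance, one would presumably pass the same locator and formatter to set_major_locator() and set_major_formatter() on the axes returned by mpf.plot(..., returnfig=True).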

Hope that helps.

[image: ConciseDateFormatter (1)]

BennyThadikaran commented 10 months ago

Hi Daniel,

I tried your code and it's working perfectly. I was hoping to solve this over the next few weeks, but you spent your precious time to solve this for me.

Thank you so much for your time. I will study the code and implement it.

DanielGoldfarb commented 10 months ago

@BennyThadikaran Benny, You're welcome. It was actually a lot of fun to figure out how to do this. Took me about an hour and a half of experimenting with the code; was totally worth it. I learned some neat stuff. For example, most formatters work via the __call__ method; but ConciseDateFormatter works via the format_ticks method (which I only discovered after about 45 minutes of playing with the code). In retrospect it makes sense: most formatters format each tick independently of all other ticks, but ConciseDateFormatter needs to be aware of all the ticks at the same time, because the formatting of some ticks depends in part on the formatting of others.
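A small sketch of that difference, using four made-up dates that straddle a month boundary. It calls the formatter once for the whole batch through format_ticks() and once per tick through `__call__`, so the two sets of labels can be compared; no chart is needed.

```python
from datetime import datetime

from matplotlib.dates import AutoDateLocator, ConciseDateFormatter, date2num

locator = AutoDateLocator()
formatter = ConciseDateFormatter(locator)

# Four tick positions as matplotlib date numbers (dates chosen arbitrarily).
ticks = date2num([
    datetime(2023, 7, 28),
    datetime(2023, 7, 31),
    datetime(2023, 8, 1),
    datetime(2023, 8, 2),
])

# One call for all ticks: each label is chosen with knowledge of the others.
batch_labels = formatter.format_ticks(ticks)

# One call per tick: no tick can see its neighbours.
single_labels = [formatter(t) for t in ticks]

print(batch_labels)
print(single_labels)
```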

I'm glad it helped. --Daniel

BennyThadikaran commented 10 months ago

@DanielGoldfarb Hi Daniel,

Just wanted to post an update about my final solution. I tried playing with various configurations of the MaxNLocator but wasn't quite satisfied with the end result. I ended up rolling my own custom class, DateTickFormatter, using the FixedLocator and FixedFormatter. It has a single public method, getLabels, which returns a tuple with the initialized locator and formatter. I picked up some inspiration from the AutoDateLocator to work it out.

It is not an efficient solution and currently works only with daily and weekly timeframes. I only plot about 140 to 200 candles, so any performance issues are barely noticeable. I might use rrule in the future to avoid looping over the entire length of the dates.

To use it:

locator, formatter = DateTickFormatter(df.index, tf='weekly').getLabels()

for ax in axs:
    ax.xaxis.set_major_locator(locator)
    ax.xaxis.set_major_formatter(formatter)

show()

I have added the complete DateTickFormatter code at the bottom, if it helps others.

[image: ticker]

That said, I still think mplfinance should work well with the ConciseDateFormatter. I suspect that somewhere in the code, mplfinance is calling date2num on matplotlib date values while passing the dates to the locator classes. I tried backtracking to find the source of the problem, but had to give up after some time. :smile_cat: I'll probably keep trying till I find the issue.

I want to thank you again for the time you spent answering these questions. I learned a lot, and feel more confident delving into matplotlib source code.

DateTickFormatter.py

from matplotlib.ticker import FixedFormatter, FixedLocator

class DateTickFormatter:
    def __init__(self, dates, tf='daily'):
        '''Dates: DatetimeIndex
        tf: daily or weekly'''

        self.dates = dates
        self.len = len(dates)
        self.month = self.year = None
        self.idx = 0
        self.intervals = (2, 4, 7, 14)
        self.tf = tf

    def _formatDate(self, dt):
        '''Returns the formatted date label for the ticker.'''

        if dt.month != self.month:
            self.month = dt.month

            if dt.year != self.year:
                self.year = dt.year
                return f'{dt:%d\n%Y}'

            return f'{dt:%d\n%b}'.upper()

        # Return a string so every label has the same type.
        return str(dt.day)

    def _getInterval(self):
        '''Returns an integer interval at which the ticks will be labelled.'''

        idx = 0
        while True:
            if idx == len(self.intervals) - 1:
                return self.intervals[idx]

            d = self.len / self.intervals[idx]

            if d <= max(self.intervals):
                return self.intervals[idx]
            else:
                idx += 1

    def getLabels(self):
        '''Returns an instance of FixedLocator and FixedFormatter in a tuple.
        Ticker format based on number of candles in Data.
        '''

        if self.year is None:
            self.year = self.dates[0].year
            self.month = self.dates[0].month

        if self.len <= 22:
            return self._daily()

        if self.len < 200:
            return self._atInterval(self._getInterval())

        return self._monthly()

    def _daily(self):
        '''Labels ticks on every candle'''

        labels = []
        ticks = []

        for i, dt in enumerate(self.dates):
            # Skip weekend dates on the daily timeframe, keeping each
            # tick position paired with its label.
            if self.tf == 'daily' and dt.weekday() > 4:
                continue

            ticks.append(i)
            labels.append(self._formatDate(dt))

        return (FixedLocator(ticks), FixedFormatter(labels))

    def _monthly(self):
        '''Labels ticks on 1st Candle of every month and year'''

        labels = []
        ticks = []

        for i, dt in enumerate(self.dates):
            if dt.month != self.month:
                self.month = dt.month

                if dt.year != self.year:
                    self.year = dt.year
                    labels.append(str(dt.year))
                else:
                    labels.append(f'{dt:%b}'.upper())

                ticks.append(i)
            elif i == 0:
                labels.append(f'{dt:%b\n%Y}'.upper())
                ticks.append(i)

        return (FixedLocator(ticks), FixedFormatter(labels))

    def _atInterval(self, interval):
        '''Labels ticks at every interval of candle dates'''

        labels = []
        ticks = []
        nextTick = interval

        for i, dt in enumerate(self.dates):
            if i == 1:
                labels.append(self._formatDate(dt))
                ticks.append(i)
            elif i == self.len - 1:
                break
            elif i == nextTick:
                ticks.append(i)
                labels.append(self._formatDate(dt))
                nextTick += interval

        return (FixedLocator(ticks), FixedFormatter(labels))

DanielGoldfarb commented 10 months ago

@BennyThadikaran Benny, Thanks for sharing. That looks really good! All the best. --Daniel

BennyThadikaran commented 5 months ago

I just wanted to provide an update. I tried to figure out a fix for this issue. The core issue is that the num2date function is being called twice on the existing matplotlib dates: once within the mpf.plot function, and again when mpf.show is called after adding the Locator and Formatter. Since num2date is being called on already converted timestamps, we see the weird dates from the 1970s. I couldn't find the exact source or a solution, but I did manage a workaround for using ConciseDateFormatter.

Posting it here, so others may find it helpful.

I did have to create a custom format_coords function; otherwise it works as expected.
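The workaround in the code that follows depends on turning each tick date into a row position on the chart. A minimal sketch of that lookup in isolation, using a made-up business-day index and a hypothetical helper named row_position:

```python
import pandas as pd

# Ten business days: Mon 24 Jul 2023 through Fri 4 Aug 2023.
index = pd.bdate_range("2023-07-24", periods=10)


def row_position(index, dt):
    """Return the row of dt, or of the next available date if dt is missing."""
    dt = pd.Timestamp(dt)
    if dt in index:
        return index.get_loc(dt)
    # Missing date (for example a weekend): use the next row to the right.
    return int(index.searchsorted(dt, side="right"))


print(row_position(index, "2023-07-26"))  # a trading day in the index
print(row_position(index, "2023-07-29"))  # a Saturday, not in the index
```

A date past the end of the index maps to len(index), one position beyond the last candle.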

import mplfinance as mpf
import matplotlib.dates as mdates
import matplotlib.ticker as ticker
import pandas as pd

# Sample OHLC data
data = {
    "date": pd.date_range(start="2022-01-01", end="2022-01-10", freq="D"),
    "open": [100, 110, 95, 105, 98, 100, 110, 95, 105, 98],
    "high": [120, 115, 100, 110, 105, 120, 115, 100, 110, 105],
    "low": [90, 105, 90, 98, 92, 90, 105, 90, 98, 92],
    "close": [110, 100, 92, 100, 100, 110, 100, 92, 100, 100],
}

df = pd.DataFrame(data)
df.set_index("date", inplace=True)

# Create the mplfinance chart
fig, axs = mpf.plot(
    df,
    type="candle",
    style="tradingview",
    title="OHLC Chart",
    returnfig=True,
    xrotation=0, # no rotation required
)

# Locator sets the major tick locations on xaxis
locator = mdates.AutoDateLocator(minticks=3, maxticks=7)

# Formatter sets the tick labels for the xaxis
concise_formatter = mdates.ConciseDateFormatter(locator=locator)

# Extract the tick values from locator.
# These are matplotlib dates not python datetime
tick_mdates = locator.tick_values(df.index[0], df.index[-1])

# Extract the ticks labels from ConciseDateFormatter
labels = concise_formatter.format_ticks(tick_mdates)

ticks = []

# Convert the matplotlib dates to python datetime and iterate
for dt in mdates.num2date(tick_mdates):
    # remove the timezone info to match the DataFrame index
    dt = dt.replace(tzinfo=None)

    # Get the index position if available
    # else get the next available index position
    if dt in df.index:
        idx = df.index.get_loc(dt)
    else:
        idx = df.index.searchsorted(dt, side="right")

    # store the tick positions to be displayed on chart
    ticks.append(idx)

# Initialise FixedFormatter and FixedLocator
# passing the tick labels and tick positions
fixed_formatter = ticker.FixedFormatter(labels)
fixed_locator = ticker.FixedLocator(ticks)

fixed_formatter.set_offset_string(concise_formatter.get_offset())

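def format_coords(x, y):
    # Stand-in (assumption): the author's own format_coords was not
    # included in the post. Shows the nearest candle's date and the price.
    row = min(max(round(x), 0), len(df) - 1)
    return f"{df.index[row]:%d %b %Y}  price={y:.2f}"


for ax in axs:
    ax.xaxis.set_major_locator(fixed_locator)
    ax.xaxis.set_major_formatter(fixed_formatter)
    ax.format_coord = format_coords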

mpf.show()