Closed jimrok closed 1 week ago
The operation Cal.calendar(freq='day', future=True) yields a List[pd.Timestamp], which matches the output of pd.to_datetime(), which produces a pd.Timestamp. Consequently, there is no discrepancy between the following two code snippets. In my own testing, both methods retrieve identical values:
# Method 1: Utilizing pd.to_datetime for date conversion
data1 = quote.get_data('000610.SZ', pd.to_datetime('2023-01-04'), pd.to_datetime('2023-01-04'), "$close")
# Method 2: Leveraging the calendar list for date specification
data2 = quote.get_data('000610.SZ', _calendar[-181], _calendar[-181], "$close")
numpy.datetime64, which is used within NumpyQuote, is the direct output of invoking pd.Timestamp.to_numpy(). Hence, the current time formats are consistent.
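The consistency claim above can be checked with a small sketch that does not depend on qlib. The calendar list here is a hypothetical stand-in built with pd.Timestamp(x), as the issue describes; it is not the real output of Cal.calendar, and the dates are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a trading calendar built with pd.Timestamp(x);
# the real list would come from qlib's calendar files.
calendar = [pd.Timestamp(d) for d in ["2023-01-03", "2023-01-04", "2023-01-05"]]

ts_from_calendar = calendar[1]
ts_from_to_datetime = pd.to_datetime("2023-01-04")

# Both values are pd.Timestamp objects and compare equal.
same_type = type(ts_from_calendar) is type(ts_from_to_datetime)
same_value = ts_from_calendar == ts_from_to_datetime

# pd.Timestamp.to_numpy() returns a numpy.datetime64 scalar.
as_np = ts_from_calendar.to_numpy()
is_np_dt64 = isinstance(as_np, np.datetime64)

# Converting back to pd.Timestamp gives a value equal to the original.
round_trip = pd.Timestamp(as_np) == ts_from_to_datetime

print(same_type, same_value, is_np_dt64, round_trip)
```

If any of these checks fails in a given environment, the cause is more likely the data stored in the quote index than the conversion between pd.Timestamp and numpy.datetime64.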
Bug Description
I encountered a bug while conducting a backtest with qlib. I'm not sure how to resolve it, so I'll first report where the issue arises: https://github.com/microsoft/qlib/blob/main/qlib/backtest/exchange.py#L389. Here, to query data from the quote object, a start_time parameter is required. The start_time is sourced from the trade_calendar during the backtest, and the data from the trade_calendar is read from a list of calendar files. The time construction in the calendar uses pd.Timestamp(x), which presents a problem: the pd.Timestamp type does not include nanosecond information. If this parameter is used for querying, it will result in incorrect data retrieval. Below, I will provide a simplified code snippet to illustrate this issue:
To Reproduce
Steps to reproduce the behavior: the following code produces the result described in the bug description.
Expected Behavior
Firstly, qlib supports the smallest frequency at the minute level, so there is no need to concern ourselves with whether pd.Timestamp includes nanosecond information. However, when constructing the NumpyQuote, it retains the nanosecond information in the index, which leads to inconsistency with the time generated by the calendar. It is hoped that both sides will use a unified method for time conversion when handling pd.Timestamp.
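One way to picture the "unified method for time conversion" requested here is a single helper that every code path calls before it builds or queries a time index. The sketch below is an assumption for illustration only: normalize_time is a hypothetical name and is not part of qlib, and flooring to the minute relies on the statement above that minute is the smallest supported frequency.

```python
import numpy as np
import pandas as pd


def normalize_time(value) -> np.datetime64:
    """Hypothetical helper (not part of qlib).

    Converts a string, pd.Timestamp, or numpy.datetime64 to a
    numpy.datetime64 with nanosecond unit, floored to the minute,
    so that values from different sources compare equal.
    """
    ts = pd.Timestamp(value).floor("min")
    return ts.to_numpy().astype("datetime64[ns]")


# The same instant expressed three different ways.
a = normalize_time("2023-01-04 09:31:00")
b = normalize_time(pd.Timestamp("2023-01-04 09:31:00.000000123"))
c = normalize_time(np.datetime64("2023-01-04T09:31:00", "s"))

print(a == b == c)  # all three normalize to the same value
print(a.dtype)      # nanosecond-unit datetime64
```

Applying the same helper both when the calendar is loaded and when the quote index is constructed would mean that lookups compare values of one type and one resolution, whatever their origin.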
Environment
Note: User could run cd scripts && python collect_info.py all under project directory to get system information and paste them here directly.
OS (Windows, Linux, MacOS): Windows
Additional Notes