SimFin / simfin

Simple financial data for Python
https://simfin.com/

hub.fin_signals - cannot handle a non-unique multi-index! error #8

Closed freekeys closed 3 years ago

freekeys commented 3 years ago

Bug Report

Description

When I run the hub.fin_signals function I get the error below. This started suddenly today; the same code worked fine yesterday.

System Details

Google Colab

Code Example

!pip install simfin

%matplotlib inline
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.optimize import differential_evolution
from datetime import datetime, timedelta
import os

# Import the main functionality from the SimFin Python API.
import simfin as sf

# Import names used for easy access to SimFin's data-columns.
from simfin.names import *

# CONFIG
sf.set_data_dir('~/simfin_data/')
sf.set_api_key(api_key='XXXXXX')
sns.set_style("whitegrid")

# HUB CREATION
# We are interested in the US stock-market.
market = 'us'
# Add this date-offset to the fundamental data such as
# Income Statements etc., because the REPORT_DATE is not
# when it was actually made available to the public,
# which can be 1, 2 or even 3 months after the Report Date.
offset = pd.DateOffset(days=60)
# Refresh the fundamental datasets (Income Statements etc.)
# every 30 days.
refresh_days = 30
# Refresh the dataset with shareprices every day.
refresh_days_shareprices = 1
hub = sf.StockHub(market=market, offset=offset, refresh_days=refresh_days, refresh_days_shareprices=refresh_days_shareprices)

# LOAD DATASETS
# Fundamental signals: Current ratio, ROA, ROE
df_fin_signals = hub.fin_signals(variant='daily')

Result / Error

Dataset "us-income-ttm" not on disk.
- Downloading ... 100.0%
- Extracting zip-file ... Done!
- Loading from disk ... Done!
Dataset "us-balance-ttm" not on disk.
- Downloading ... 100.0%
- Extracting zip-file ... Done!
- Loading from disk ... Done!
Dataset "us-cashflow-ttm" not on disk.
- Downloading ... 100.0%
- Extracting zip-file ... Done!
- Loading from disk ... Done!
Dataset "us-shareprices-daily" not on disk.
- Downloading ... 100.0%
- Extracting zip-file ... Done!
- Loading from disk ... Done!
Cache-file 'fin_signals-2a38bb7d.pickle' not on disk.
- Running function fin_signals() ... 
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-d3b3a780f45f> in <module>()
     21 # LOAD DATASETS
     22 # Fundamental signals: Current ratio, ROA, ROE
---> 23 df_fin_signals = hub.fin_signals(variant='daily')

5 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/multi.py in reindex(self, target, method, level, limit, tolerance)
   2317                     )
   2318                 else:
-> 2319                     raise ValueError("cannot handle a non-unique multi-index!")
   2320 
   2321         if not isinstance(target, MultiIndex):

ValueError: cannot handle a non-unique multi-index!

thf24 commented 3 years ago

Thanks for reporting this. It looks like there was a problem in the database for a company that was updated yesterday, which led to duplicated values in the TTM files. It's fixed now. The bulk files are being updated, and once that finishes it should work again.
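For anyone who hits the same error, here is a minimal sketch of how duplicated index entries of this kind could be detected and removed with plain pandas. It does not use the simfin API. The DataFrame is synthetic: the tickers, dates, values, and the helper names find_duplicate_rows and drop_duplicate_rows are all made up for illustration, and it assumes the data is indexed by a (Ticker, Report Date) MultiIndex.

```python
import pandas as pd

# Synthetic stand-in for a TTM dataset indexed by (Ticker, Report Date).
# Tickers, dates and values are invented for this example.
index = pd.MultiIndex.from_tuples(
    [
        ("AAA", pd.Timestamp("2020-12-31")),
        ("AAA", pd.Timestamp("2021-03-31")),
        ("BBB", pd.Timestamp("2021-03-31")),
        ("BBB", pd.Timestamp("2021-03-31")),  # duplicated index entry
    ],
    names=["Ticker", "Report Date"],
)
df = pd.DataFrame({"Revenue": [100.0, 110.0, 50.0, 55.0]}, index=index)


def find_duplicate_rows(frame):
    """Return every row whose index entry occurs more than once."""
    return frame[frame.index.duplicated(keep=False)]


def drop_duplicate_rows(frame):
    """Keep only the last row for each duplicated index entry."""
    return frame[~frame.index.duplicated(keep="last")]


duplicates = find_duplicate_rows(df)
deduplicated = drop_duplicate_rows(df)

print("Index unique before:", df.index.is_unique)
print(duplicates)
print("Index unique after:", deduplicated.index.is_unique)
```

Keeping the last row is an arbitrary choice for this sketch. Which duplicate is the correct one depends on the data, so waiting for the corrected bulk files is the safer route.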

freekeys commented 3 years ago

Thank you for the speedy fix!