sharebook-kr / pykrx

KRX stock information scraping

Error in the index function in Jupyter Notebook #22

Closed mr-yoo closed 4 years ago

mr-yoo commented 4 years ago

This works correctly in PyCharm, but Jupyter Notebook prints an error message.

from pykrx import stock 

df = stock.get_index_ohlcv_by_date("20200101", "20200831", "코스피")
print(df.head())
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
   4373             return self._engine.get_value(s, k,
-> 4374                                           tz=getattr(series.dtype, 'tz', None))
   4375         except KeyError as e1:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine._get_loc_duplicates()

TypeError: '<' not supported between instances of 'str' and 'int'

During handling of the above exception, another exception occurred:

IndexError                                Traceback (most recent call last)
<ipython-input-6-62b157b95ab1> in <module>
----> 1 df = stock.get_index_ohlcv_by_date("20200101", "20200831", "코스피")
      2 print(df.head())

~\Anaconda3\lib\site-packages\pykrx\stock\api.py in get_index_ohlcv_by_date(fromdate, todate, ticker, freq)
    318     if isinstance(todate, datetime.datetime):
    319         todate = _datetime2string(todate)
--> 320     return _get_index_ohlcv_by_date(fromdate, todate, ticker, freq)
    321 
    322 

~\Anaconda3\lib\site-packages\pykrx\stock\api.py in _get_index_ohlcv_by_date(fromdate, todate, ticker, freq)
    306     """
    307     id = krx.IndexTicker().get_id(ticker, fromdate)
--> 308     market = krx.IndexTicker().get_market(ticker, fromdate)
    309     df = krx.get_index_ohlcv_by_date(fromdate, todate, id, market)
    310     how = {'시가': 'first', '고가': 'max', '저가': 'min', '종가': 'last', '거래량': 'sum'}

~\Anaconda3\lib\site-packages\pykrx\website\krx\market\ticker.py in get_market(self, ticker, date)
    160         self._download_ticker(date)
    161         cond = self.df.index == ticker
--> 162         return self.df.loc[cond, 'ind_tp_cd'][0]
    163 
    164     @staticmethod

~\Anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
    866         key = com.apply_if_callable(key, self)
    867         try:
--> 868             result = self.index.get_value(self, key)
    869 
    870             if not is_scalar(result):

~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
   4392             # python 3
   4393             if is_scalar(key):  # pragma: no cover
-> 4394                 raise IndexError(key)
   4395             raise InvalidIndexError(key)
   4396 

IndexError: 0

Python 3.7.6

mr-yoo commented 4 years ago

[Image: screenshot of the data frame]

Running the following code on a data frame like the one shown above raises the error:

cond = self.df.index == '코스피'
self.df.loc[cond, 'ind_tp_cd'][0]

The error occurs where the Series object is indexed directly with [0]. Changing it to iloc[0] makes it work correctly.
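The difference between the two lookups can be sketched with a small stand-in for self.df. The frame below is hypothetical (the real table is downloaded from the KRX server); it only assumes what the traceback suggests, namely a string index in which '코스피' appears more than once. On such a Series, [0] is treated as a label lookup first, and what happens when that label is missing depends on the pandas version, whereas iloc[0] is always positional.

```python
import pandas as pd

# Hypothetical stand-in for self.df: string index with a duplicated name.
# The values in 'ind_tp_cd' are made up for illustration.
df = pd.DataFrame(
    {"ind_tp_cd": ["1", "1", "2"]},
    index=["코스피", "코스피", "코스닥"],
)

cond = df.index == "코스피"
matches = df.loc[cond, "ind_tp_cd"]  # Series still labelled '코스피', '코스피'

# matches[0] would look for a label 0, which this index does not contain.
# iloc[0] selects by position and does not depend on the labels at all.
first = matches.iloc[0]
print(len(matches), first)
```

This only works around the symptom: the selection still contains two rows, and iloc[0] silently picks the first one.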

mr-yoo commented 4 years ago

df.loc[cond, 'ind_tp_cd'] should return a single value by itself, but the result returned by the server contains duplicated entries (the request itself may be wrong). It would be better to remove the root cause.
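One way to check for the duplication, and to guard against it on the client side, is sketched below. This is not the fix that was applied in pykrx; the frame and its values are hypothetical, and dropping duplicates with keep="first" assumes the repeated rows carry the same information.

```python
import pandas as pd

# Hypothetical ticker table in which the server returned '코스피' twice.
df = pd.DataFrame(
    {"ind_tp_cd": ["1", "1", "2"]},
    index=["코스피", "코스피", "코스닥"],
)

# Names that occur more than once in the index.
dupes = df.index[df.index.duplicated()].unique()
print(list(dupes))

# Keep only the first row for each name so the index becomes unique.
deduped = df[~df.index.duplicated(keep="first")]

# With a unique index, a label lookup returns a single scalar value.
value = deduped.loc["코스피", "ind_tp_cd"]
print(value)
```

With a unique index no positional indexing is needed, so the [0] versus iloc[0] question does not arise.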

The Index and Ticker code should also be cleaned up. The following issue is related: https://github.com/sharebook-kr/pykrx/issues/3

mr-yoo commented 4 years ago

The bug was fixed and the code refactored in the following issue.

Module update

import pykrx
print(pykrx.__version__)

0.1.37

Verifying the API behavior

from pykrx import stock 

df = stock.get_index_ohlcv_by_date("20200101", "20200831", "코스피")
df.head()