sharebook-kr / pykrx

KRX stock information scraping

Can't get market fundamental and other info. since 18:13, July 5 2024 #187

Open ericseong opened 5 days ago

ericseong commented 5 days ago

This is not limited to market fundamentals: most of the APIs now seem to return an empty DataFrame. The test results are below.

Test environment: pykrx 1.0.45 running Python 3.11.7 on macOS 14.5

Test code:

from pykrx import stock

# 1
print('#1:')
try:
  df = stock.get_market_fundamental("20240705", market="KOSDAQ")
  print(df.head(2))
except Exception as e:
  print(f'Exception occurred: {e}')

# 2
print('#2:')
try:
  df = stock.get_market_ohlcv("20240701", "20240705", "005930", adjusted=True)
  print(df.head(3))
except Exception as e:
  print(f'Exception occurred: {e}')

# 3
print('#3:')
try:
  df = stock.get_market_trading_value_by_date("20240701", "20240705", "005930")
  print(df.head(3))
except Exception as e:
  print(f'Exception occurred: {e}')

# 4
print('#4:')
try:
  df = stock.get_market_trading_volume_by_investor("20240701", "20240705", "005930")
  print(df.head())
except Exception as e:
  print(f'Exception occurred: {e}')

Test result:

#1:
Exception occurred: "None of [Index(['BPS', 'PER', 'PBR', 'EPS', 'DIV', 'DPS'], dtype='object')] are in the [columns]"
#2:
               시가     고가     저가     종가       거래량       등락률
날짜                                                        
2024-07-01  81500  82100  81300  81800  11317202  0.368098
2024-07-02  82500  82600  81500  81800  14471904  0.000000
2024-07-03  82300  82300  81000  81800  11440328  0.000000
#3:
Empty DataFrame
Columns: []
Index: []
#4:
Exception occurred: '거래량'

grandmagoldenaxe commented 4 days ago

Overall, it looks like a new problem has appeared where empty values are being returned.

liante0904 commented 4 days ago

It seems the error occurs while scraping the KRX data information system (data.krx.co.kr). It appears that the site has started checking the Referer in the request header. I can't tell exactly how this differs from the other classes just by looking, but as a temporary workaround you can change line 22 of /pykrx/website/comm/webio.py to:

self.headers = {'User-Agent': 'Mozilla/5.0', 'Referer': 'http://data.krx.co.kr/'}

This should resolve the issue for now. If other classes also lack the Referer in their headers, you may need to modify them in the same way.
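
For anyone who wants to see what the extra header amounts to at the HTTP level, here is a minimal standalone sketch. It is not pykrx code: build_headers and build_request are hypothetical helper names, the URL is just the Referer value quoted above (the actual endpoint paths pykrx calls are not shown in this thread), and the request is only constructed, never sent.

```python
from urllib.request import Request

# Assumption: base URL taken from the Referer value quoted in this thread;
# the real endpoint paths used by pykrx are not shown here.
KRX_BASE_URL = "http://data.krx.co.kr/"


def build_headers(referer=KRX_BASE_URL):
    """Return request headers that include a Referer (hypothetical helper)."""
    return {"User-Agent": "Mozilla/5.0", "Referer": referer}


def build_request(url=KRX_BASE_URL):
    """Build, but do not send, a request carrying the headers above."""
    return Request(url, headers=build_headers())


req = build_request()
# urllib stores header names capitalized, e.g. "User-agent".
print(req.header_items())
```

If the server really does reject requests without a Referer, a request built this way would carry the header it looks for; whether that alone is sufficient is only what the workaround above suggests, not something this sketch verifies.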

ericseong commented 3 days ago

@liante0904, yes, the fix to the POST/GET headers works for all my data gathering needs. Thanks for the timely fix!