Would you consider adding a local data storage feature?

Tony-Wang commented 9 years ago

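Could tushare provide a way to keep data locally?

Users could keep a local copy of all data behind a single data access interface: a request checks the local data first and, on a miss, automatically downloads the data from the web and saves it locally. Almost every application that uses tushare as its data source for quantitative research ends up having to implement this.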
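
The lookup order described above (local copy first, download on a miss, then save) could be sketched roughly as follows. This is only an illustration: `cached_fetch`, `cache_dir`, and the `fetch` callable are hypothetical names, not part of tushare, and the standard-library `pickle` module stands in for whatever storage format is preferred.

```python
import os
import pickle


def cached_fetch(key, fetch, cache_dir="./cache"):
    """Return the data for `key`, downloading it only on a cache miss.

    `fetch` is any callable that takes `key` and returns the data,
    for example a wrapper around a tushare call (hypothetical here).
    """
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, key + ".pkl")

    # Cache hit: read the local copy and skip the network entirely.
    if os.path.isfile(path):
        with open(path, "rb") as f:
            return pickle.load(f)

    # Cache miss: download, then keep a local copy for next time.
    data = fetch(key)
    with open(path, "wb") as f:
        pickle.dump(data, f)
    return data
```

With tushare this might be called as `cached_fetch(code, lambda c: ts.get_h_data(c, start=startDate, end=endDate))`. Note that the key in this sketch ignores the date range, so a real cache key would probably need to include it.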

jimmysoa commented 9 years ago

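@Tony-Wang Strictly speaking, this is something pandas itself should provide, though tushare could certainly wrap it. It's a good suggestion; I'll rework this when I have time. Thanks.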

gclsoft commented 9 years ago

df = ts.get_h_data(stockCode, start=startDate, end=endDate)

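If the data has already been downloaded, can it skip fetching from the web again? Otherwise it is slow, and fetching over and over will also get me blocked by the site.

http://tushare.org/storing.html shows how to store df to Excel, but how do I read it back?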

cedricporter commented 9 years ago

@gclsoft A DataFrame can be saved as a pickle.

import pandas as pd

# Save
df.to_pickle(filename)

# Load
df = pd.read_pickle(filename)
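
On the Excel question above: pandas also has `pd.read_excel` as the counterpart of `DataFrame.to_excel`, though it needs an Excel engine installed. As a sketch of the same save-and-reload round trip without that dependency, here is the CSV version. The column names and dates are made-up sample data, and an in-memory buffer stands in for a file on disk.

```python
import io

import pandas as pd

# Made-up sample data standing in for a downloaded price history.
df = pd.DataFrame(
    {"open": [10.0, 10.5], "close": [10.2, 10.4]},
    index=pd.to_datetime(["2015-01-05", "2015-01-06"]),
)
df.index.name = "date"

# Save: a real script would pass a file path instead of a buffer.
buffer = io.StringIO()
df.to_csv(buffer)

# Load: restore the date column as the index and parse it as dates.
buffer.seek(0)
restored = pd.read_csv(buffer, index_col="date", parse_dates=True)
```

Unlike pickle, CSV does not remember column types, which is why the index has to be named and parsed explicitly on the way back in.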

gclsoft commented 9 years ago

import os.path
from time import sleep

import pandas as pd
import tushare as ts
from dateutil import parser

# startDate and endDate (date strings) are assumed to be defined earlier.
sleepTime = 1.0

endDateTime = parser.parse(endDate)

def getFolder():
    # One cache folder per date range
    return "./stockData/" + startDate + "_" + endDate

if not os.path.exists(getFolder()):
    os.makedirs(getFolder())

def saveFileName(stockCode):
    return getFolder() + "/" + stockCode + ".dat"

def getHistory(stockCode):
    global sleepTime
    fname = saveFileName(stockCode)
    # Use the local copy if it has already been downloaded
    if os.path.isfile(fname):
        df = pd.read_pickle(fname)
        if df is not None:
            return df

    # Retry in a loop rather than recursing, so repeated failures
    # cannot exhaust the call stack
    while True:
        try:
            print(stockCode)
            df = ts.get_h_data(stockCode, start=startDate, end=endDate)
            if df is not None:
                df.to_pickle(fname)
            # else: TODO: if trading is suspended, save an empty file

            # Success: shorten the wait, but not below roughly 0.9s
            if sleepTime > 0.9:
                sleepTime = sleepTime * 0.9

            return df
        except Exception:
            # Failure: wait, lengthen the wait a little, then try again
            print("sleep:" + str(sleepTime))
            sleep(sleepTime)
            sleepTime = sleepTime * 1.1

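Thanks @cedricporter @jimmysoa. If a stock is suspended I need to save an empty file; otherwise that stock will still be fetched from the web every time. How can I make the df that I get back from reading that empty file carry a bool flag saying trading is suspended?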

cedricporter commented 9 years ago

@gclsoft

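If the interval between your startDate and endDate is long, a suspended stock will still have other historical data. As for detecting a suspension, just check whether df has data for that day.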
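
A minimal sketch of that check, assuming the DataFrame is indexed by date with a `DatetimeIndex` (worth verifying against what `ts.get_h_data` actually returns). `has_data_on` is a hypothetical helper name and the frame below is made-up sample data.

```python
import pandas as pd


def has_data_on(df, day):
    """Return True if `df` has a row for `day` (assumes a DatetimeIndex)."""
    return pd.Timestamp(day) in df.index


# Toy frame standing in for downloaded history; 2015-01-06 is missing.
history = pd.DataFrame(
    {"close": [10.2, 10.6]},
    index=pd.to_datetime(["2015-01-05", "2015-01-07"]),
)

traded = has_data_on(history, "2015-01-05")
missing = not has_data_on(history, "2015-01-06")
```

A missing row only suggests a suspension: weekends and market holidays have no rows either, so the day being checked should be a known trading day.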

gclsoft commented 9 years ago

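@cedricporter What I mean is: how do I save a bool value into a df? That way I would know the stock is suspended and would not fetch it from the network again.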

d = {'isStop' : 1}
df = DataFrame(d)

I tried this and it didn't work.

gclsoft commented 9 years ago

For example, with df = DataFrame({'isStop': [1]})
df["isStop"] does not give me that value.
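
Both attempts can be explained by how pandas builds and indexes a DataFrame. A dict that contains only bare scalars raises a `ValueError` unless an index is passed, and `df["isStop"]` returns a whole column (a `Series`), not the single value inside it. A short sketch, reusing the `isStop` name from the comments above:

```python
import pandas as pd

# A dict of bare scalars needs an index, so wrap the value in a list.
marker = pd.DataFrame({"isStop": [True]})

# marker["isStop"] is a one-element Series, not a bool;
# take its first element to get the flag itself.
is_stop = bool(marker["isStop"].iloc[0])
```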

cedricporter commented 9 years ago

@gclsoft Why do it this way?

gclsoft commented 9 years ago

@cedricporter So that I can save it, and next time I don't need to fetch from the network only to get None back; I would already know in advance.
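
One possible way to get this behavior without putting a flag inside the DataFrame is to wrap each download result in a small record before saving it, so that "nothing cached yet" and "cached, but the download returned None" can be told apart. This is a sketch only: `save_result`, `load_result`, and the `is_stop` key are hypothetical names, and treating a `None` result as a suspension is an assumption taken from this thread, not documented tushare behavior.

```python
import os
import pickle


def save_result(path, data):
    """Store the download result, recording explicitly when it was empty."""
    with open(path, "wb") as f:
        pickle.dump({"is_stop": data is None, "data": data}, f)


def load_result(path):
    """Return (cached, is_stop, data); cached is False if nothing was saved."""
    if not os.path.isfile(path):
        return False, False, None
    with open(path, "rb") as f:
        entry = pickle.load(f)
    return True, entry["is_stop"], entry["data"]
```

A caller would download only when `cached` is False, and would skip the network whenever `is_stop` is True. Since a suspension eventually ends, such a marker probably needs an expiry or a manual way to clear it.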