dysonance / Temporal.jl

Time series implementation for the Julia language focused on efficiency and flexibility

Yahoo API Bug #32

Closed dysonance closed 5 years ago

dysonance commented 5 years ago

The following code fails on the latest master under Julia v1.0.0.

using Temporal
prices = yahoo("AAPL", from="2010-01-01")

I am getting the following error message.

ERROR: LoadError: MethodError: no method matching cookies(::HTTP.Messages.Response)
Closest candidates are:
  cookies(::HTTP.Messages.Request) at /Users/jacob.amos/.julia/packages/HTTP/nUK4f/src/cookies.jl:316
Stacktrace:
 [1] yahoo_get_crumb() at /Users/jacob.amos/.julia/dev/Temporal/src/io.jl:279
 [2] (::getfield(Temporal, Symbol("#kw##yahoo")))(::NamedTuple{(:from,),Tuple{String}}, ::typeof(yahoo), ::String) at ./none:0
 [3] top-level scope at none:0
 [4] include at ./boot.jl:317 [inlined]
 [5] include_relative(::Module, ::String) at ./loading.jl:1038
 [6] include(::Module, ::String) at ./sysimg.jl:29
 [7] include(::String) at ./client.jl:388
 [8] top-level scope at none:0

The HTTP package's API appears to have changed: per the error above, cookies has a method for HTTP.Messages.Request but none for HTTP.Messages.Response. Temporal should be updated so that Yahoo data fetches work again.

bisraelsen commented 5 years ago

The following has worked for me. It is pretty hacky though.

As far as I can tell, the way you are "supposed" to read cookies in the latest version of HTTP is HTTP.Cookies.readcookies(request.headers, ""). However, that seems to return an empty list. I haven't opened an issue on HTTP because there is already a related question that has seen little activity (HTTP issue 360).

If/when readcookies from HTTP is fixed, then that will greatly simplify yahoo_get_crumb.
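To make the manual route concrete, here is a minimal sketch of splitting a Set-Cookie style header value by attribute name rather than by position. It is not the HTTP package's API: parse_set_cookie is a hypothetical helper, and the header value is made up for illustration, so the real Yahoo response may differ.

```julia
# Hypothetical helper: split a Set-Cookie style header value into the cookie's
# name, its value, and a dictionary of attributes keyed by lowercase name.
function parse_set_cookie(header::AbstractString)
    parts = strip.(split(header, ';'))
    # The first segment is "name=value"; limit=2 keeps any '=' inside the value.
    name, value = split(parts[1], '='; limit=2)
    attrs = Dict{String,String}()
    for part in parts[2:end]
        isempty(part) && continue
        kv = split(part, '='; limit=2)
        # Flag-style attributes without '=' are stored with an empty value.
        attrs[lowercase(kv[1])] = length(kv) == 2 ? String(kv[2]) : ""
    end
    return (name=String(name), value=String(value), attrs=attrs)
end

# Made-up header value, for illustration only.
example = "B=abc123&b=3; expires=Tue, 01-Jan-2019 00:00:00 GMT; path=/; domain=.example.com"
cookie = parse_set_cookie(example)
println(cookie.name, " => ", cookie.value)
println(cookie.attrs["domain"])
```

Looking attributes up by name means the result does not depend on the order in which the server lists expires, path, and domain, or on the cookie name being a single character.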

Anyway, I'm not sure this is the most elegant way to solve the problem, but it does work for me.


function yahoo_get_crumb()::Tuple{SubString{String}, Dict{String, Set{HTTP.Cookies.Cookie}}}
    response = HTTP.request("GET",YAHOO_TMP)
    m = match(r"\"user\":{\"crumb\":\"(.*?)\"", String(response.body)).captures[1]
    h = response.headers

    ###########
    # manually grab and parse the cookie
    ###########
    function substring_after_equal(str;offset::Int64=1)
        idx = collect(findfirst("=",str))[1]
        return String(str[idx+offset:end])
    end

    function return_cookie_string(head)
        for x in head
            if occursin("cookie",lowercase(x[1]))
                return String(x[2])
            end
        end
        return nothing
    end

    cookie_str = return_cookie_string(h)

    split_h = split(cookie_str,";")
    n = String(split_h[1][1:1])
    v = String(split_h[1][3:end])
    expire = substring_after_equal(split_h[2])
    pth = substring_after_equal(split_h[3])
    dom = substring_after_equal(split_h[4],offset=2)

    c = HTTP.Cookies.Cookie(n,v,domain=dom,path=pth,expires=DateTime(expire,"e, d-u-y H:M:S G\\MT"),unparsed=[cookie_str])
    ###########

    c_dict = (m,Dict(m=>Set([c])))
    return c_dict
end

and

function yahoo(symb::String;
               from::String="1900-01-01",
               thru::String=string(Dates.today()),
               freq::String="d",
               event::String="history",
               crumb_tuple::Tuple{SubString{String}, Dict{String, Set{HTTP.Cookies.Cookie}}}=yahoo_get_crumb())::TS
    @assert freq in ["d","wk","mo"] "Argument `freq` must be either \"d\" (daily), \"wk\" (weekly), or \"mo\" (monthly)."
    @assert event in ["history","div","split"] "Argument `event` must be either \"history\", \"div\", or \"split\"."
    @assert from[5] == '-' && from[8] == '-' "Argument `from` has invalid date format."
    @assert thru[5] == '-' && thru[8] == '-' "Argument `thru` has invalid date format."
    period1 = Int(floor(Dates.datetime2unix(Dates.DateTime(from))))
    period2 = Int(floor(Dates.datetime2unix(Dates.DateTime(thru))))
    urlstr = "$(YAHOO_URL)/$(symb)?period1=$(period1)&period2=$(period2)&interval=1$(freq)&events=$(event)&crumb=$(crumb_tuple[1])"
    response = HTTP.request("POST",HTTP.URIs.URI(urlstr), cookies=true, cookiejar=crumb_tuple[2])
    indata = Temporal.csvresp(response)
    return TS(indata[1], indata[2], indata[3][2:end])
end
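
As a quick check of the query-string construction in yahoo above, the following sketch reproduces the period1/period2 conversion on its own. The base URL and crumb are placeholders chosen for illustration, not the real YAHOO_URL or a real crumb, and to_period is a hypothetical helper name.

```julia
using Dates

# Convert a "yyyy-mm-dd" string to whole seconds since the Unix epoch,
# mirroring the period1/period2 computation in yahoo().
to_period(date::String) = floor(Int, Dates.datetime2unix(Dates.DateTime(date)))

period1 = to_period("2010-01-01")
period2 = to_period("2010-01-02")

# Placeholder values, for illustration only.
base_url = "https://example.com/download"
crumb = "abc123"

urlstr = "$(base_url)/AAPL?period1=$(period1)&period2=$(period2)&interval=1d&events=history&crumb=$(crumb)"
println(urlstr)
```

Here period1 comes out as 1262304000, and period2 is exactly 86400 seconds (one day) later.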

dysonance commented 5 years ago

Thanks for the help on this, @bisraelsen! Could you make these changes to the code and open a pull request so we can get the fix merged and released? Happy to help with that if you like. Or, if you'd be more comfortable with me putting the changes in directly, I can work on that when I get time. Cheers and thanks again.

dysonance commented 5 years ago

@bisraelsen Just wanted to give you a quick update: I have incorporated your fix into master and it seems to be working for me. Thanks again for the assist on this, big help! I will push a new release to Julia METADATA soon so everyone else can make use of your fix. Cheers.

bisraelsen commented 5 years ago

@dysonance good to hear! I got pulled away to something else, or I would have done it.