oxinabox / DataDepsGenerators.jl

Utility for developers to help define DataDeps registration blocks, for reusing existing Data with DataDeps.jl

Some README examples failing #67

Open ghost opened 5 years ago

ghost commented 5 years ago

Some appear to be HTTP issues and others possible syntax changes in Gumbo.

Julia version 1.0.1

DataOneV1

julia> generate(DataOneV1(), "https://datadryad.org/resource/doi:10.5061/dryad.74699", "Wild Crop Genomics")
ERROR: BoundsError: attempt to access 0-element Array{Gumbo.HTMLNode,1} at index [1]
Stacktrace:
 [1] getindex at ./array.jl:731 [inlined]
 [2] first at ./abstractarray.jl:270 [inlined]
 [3] data_fullname(::DataOneV1, ::Gumbo.HTMLDocument) at /.julia/packages/DataDepsGenerators/CbcdV/src/APIs/DataOneV1.jl:52
 [4] find_metadata(::DataOneV1, ::String, ::String) at /.julia/packages/DataDepsGenerators/CbcdV/src/generate.jl:73
 [5] #generate#7 at /.julia/packages/DataDepsGenerators/CbcdV/src/generate.jl:25 [inlined]
 [6] generate(::DataOneV1, ::String, ::String) at /.julia/packages/DataDepsGenerators/CbcdV/src/generate.jl:25
 [7] top-level scope at none:0
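
The BoundsError in this trace comes from calling `first` on an empty array, meaning no matching HTML nodes were found. A minimal sketch of one way to guard against that, assuming a hypothetical helper named `first_or` (it is not part of DataDepsGenerators.jl or Gumbo.jl) and a plain vector standing in for the parsed nodes:

```julia
# Hypothetical helper for illustration only; not part of DataDepsGenerators.jl or Gumbo.jl.
# Returns the first element of `xs`, or `default` when `xs` is empty,
# instead of letting `first(xs)` throw as in the trace above.
first_or(xs, default=nothing) = isempty(xs) ? default : first(xs)

# Stand-in for "the selector matched no HTML nodes".
matches = String[]

title = first_or(matches)
if title === nothing
    println("no matching element found; the page layout may have changed")
else
    println("found: ", title)
end
```

A guard like this would not make the example work again, but it could turn the failure into a clearer message about what was missing on the page.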

DataOneV2

julia> generate(KnowledgeNetworkforBiocomplexity(), "https://knb.ecoinformatics.org/knb/d1/mn/v2/object/doi:10.5063/F1T43R7N")
ERROR: IOError(EOFError() during request(https://knb.ecoinformatics.org/knb/d1/mn/v2/object/doi:10.5063/F1T43R7N))

Stacktrace:
 [1] readuntil(::HTTP.ConnectionPool.Transaction{MbedTLS.SSLContext}, ::typeof(HTTP.Parsers.find_end_of_header)) at /.julia/packages/HTTP/YjRCz/src/IOExtras.jl:168
 [2] readheaders(::HTTP.ConnectionPool.Transaction{MbedTLS.SSLContext}, ::HTTP.Messages.Response) at /.julia/packages/HTTP/YjRCz/src/Messages.jl:469
 [3] startread(::HTTP.Streams.Stream{HTTP.Messages.Response,HTTP.ConnectionPool.Transaction{MbedTLS.SSLContext}}) at /.julia/packages/HTTP/YjRCz/src/Streams.jl:149
 [4] macro expansion at ./task.jl:263 [inlined]
 [5] macro expansion at /.julia/packages/HTTP/YjRCz/src/StreamRequest.jl:56 [inlined]
 [6] macro expansion at ./task.jl:244 [inlined]
 [7] #request#1(::Nothing, ::Nothing, ::Int64, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Type{HTTP.StreamRequest.StreamLayer}, ::HTTP.ConnectionPool.Transaction{MbedTLS.SSLContext}, ::HTTP.Messages.Request, ::Array{UInt8,1}) at /.julia/packages/HTTP/YjRCz/src/StreamRequest.jl:54
 [8] (::getfield(HTTP, Symbol("#kw##request")))(::NamedTuple{(:iofunction,),Tuple{Nothing}}, ::typeof(HTTP.request), ::Type{HTTP.StreamRequest.StreamLayer}, ::HTTP.ConnectionPool.Transaction{MbedTLS.SSLContext}, ::HTTP.Messages.Request, ::Array{UInt8,1}) at ./none:0
 [9] #request#1(::Nothing, ::Type, ::Int64, ::Base.Iterators.Pairs{Symbol,Nothing,Tuple{Symbol},NamedTuple{(:iofunction,),Tuple{Nothing}}}, ::Function, ::Type{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}, ::HTTP.URIs.URI, ::HTTP.Messages.Request, ::Array{UInt8,1}) at /.julia/packages/HTTP/YjRCz/src/ConnectionRequest.jl:52
 [10] #request at ./none:0 [inlined]
 [11] #request#1 at /.julia/packages/HTTP/YjRCz/src/ExceptionRequest.jl:19 [inlined]
 [12] (::getfield(HTTP, Symbol("#kw##request")))(::NamedTuple{(:iofunction,),Tuple{Nothing}}, ::typeof(HTTP.request), ::Type{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}}, ::HTTP.URIs.URI, ::HTTP.Messages.Request, ::Array{UInt8,1}) at ./none:0
 [13] (::getfield(Base, Symbol("###44#45#46")){ExponentialBackOff,getfield(HTTP.RetryRequest, Symbol("##2#3")){Bool,HTTP.Messages.Request},typeof(HTTP.request)})(::Base.Iterators.Pairs{Symbol,Nothing,Tuple{Symbol},NamedTuple{(:iofunction,),Tuple{Nothing}}}, ::Function, ::Type, ::Vararg{Any,N} where N) at ./error.jl:216
 [14] ##44#47 at ./none:0 [inlined]
 [15] #request#1 at /.julia/packages/HTTP/YjRCz/src/RetryRequest.jl:44 [inlined]
 [16] #request at ./none:0 [inlined]
 [17] #request#1(::VersionNumber, ::String, ::Nothing, ::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Type{HTTP.MessageRequest.MessageLayer{HTTP.RetryRequest.RetryLayer{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}}}}, ::String, ::HTTP.URIs.URI, ::Array{Pair{SubString{String},SubString{String}},1}, ::Array{UInt8,1}) at /.julia/packages/HTTP/YjRCz/src/MessageRequest.jl:47
 [18] request at /.julia/packages/HTTP/YjRCz/src/MessageRequest.jl:28 [inlined]
 [19] #request#1(::Int64, ::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Type{HTTP.RedirectRequest.RedirectLayer{HTTP.MessageRequest.MessageLayer{HTTP.RetryRequest.RetryLayer{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}}}}}, ::String, ::HTTP.URIs.URI, ::Array{Pair{SubString{String},SubString{String}},1}, ::Array{UInt8,1}) at /.julia/packages/HTTP/YjRCz/src/RedirectRequest.jl:24
 [20] request(::Type{HTTP.RedirectRequest.RedirectLayer{HTTP.MessageRequest.MessageLayer{HTTP.RetryRequest.RetryLayer{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}}}}}, ::String, ::HTTP.URIs.URI, ::Array{Pair{SubString{String},SubString{String}},1}, ::Array{UInt8,1}) at /.julia/packages/HTTP/YjRCz/src/RedirectRequest.jl:21
 [21] #request#3(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::String, ::HTTP.URIs.URI, ::Array{Pair{SubString{String},SubString{String}},1}, ::Array{UInt8,1}) at /.julia/packages/HTTP/YjRCz/src/HTTP.jl:302
 [22] request at /.julia/packages/HTTP/YjRCz/src/HTTP.jl:302 [inlined]
 [23] #request#4 at /.julia/packages/HTTP/YjRCz/src/HTTP.jl:316 [inlined]
 [24] request at /.julia/packages/HTTP/YjRCz/src/HTTP.jl:312 [inlined] (repeats 2 times)
 [25] getpage_raw at /.julia/packages/DataDepsGenerators/CbcdV/src/utils.jl:34 [inlined]
 [26] #52 at ./operators.jl:832 [inlined]
 [27] mainpage_url at /.julia/packages/DataDepsGenerators/CbcdV/src/generate.jl:123 [inlined]
 [28] find_metadata(::KnowledgeNetworkforBiocomplexity, ::String, ::Nothing) at /.julia/packages/DataDepsGenerators/CbcdV/src/generate.jl:72
 [29] #generate#7 at /.julia/packages/DataDepsGenerators/CbcdV/src/generate.jl:25 [inlined]
 [30] generate at /.julia/packages/DataDepsGenerators/CbcdV/src/generate.jl:25 [inlined] (repeats 2 times)
 [31] top-level scope at none:0

julia> generate(ArcticDataCenter(), "https://knb.ecoinformatics.org/knb/d1/mn/v2/object/doi:10.5063%2FF1HT2M7Q")
ERROR: IOError(EOFError() during request(https://knb.ecoinformatics.org/knb/d1/mn/v2/object/doi:10.5063%2FF1HT2M7Q))

Stacktrace:
 [1] readuntil(::HTTP.ConnectionPool.Transaction{MbedTLS.SSLContext}, ::typeof(HTTP.Parsers.find_end_of_header)) at /.julia/packages/HTTP/YjRCz/src/IOExtras.jl:168
 [2] readheaders(::HTTP.ConnectionPool.Transaction{MbedTLS.SSLContext}, ::HTTP.Messages.Response) at /.julia/packages/HTTP/YjRCz/src/Messages.jl:469
 [3] startread(::HTTP.Streams.Stream{HTTP.Messages.Response,HTTP.ConnectionPool.Transaction{MbedTLS.SSLContext}}) at /.julia/packages/HTTP/YjRCz/src/Streams.jl:149
 [4] macro expansion at ./task.jl:263 [inlined]
 [5] macro expansion at /.julia/packages/HTTP/YjRCz/src/StreamRequest.jl:56 [inlined]
 [6] macro expansion at ./task.jl:244 [inlined]
 [7] #request#1(::Nothing, ::Nothing, ::Int64, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Type{HTTP.StreamRequest.StreamLayer}, ::HTTP.ConnectionPool.Transaction{MbedTLS.SSLContext}, ::HTTP.Messages.Request, ::Array{UInt8,1}) at /.julia/packages/HTTP/YjRCz/src/StreamRequest.jl:54
 [8] (::getfield(HTTP, Symbol("#kw##request")))(::NamedTuple{(:iofunction,),Tuple{Nothing}}, ::typeof(HTTP.request), ::Type{HTTP.StreamRequest.StreamLayer}, ::HTTP.ConnectionPool.Transaction{MbedTLS.SSLContext}, ::HTTP.Messages.Request, ::Array{UInt8,1}) at ./none:0
 [9] #request#1(::Nothing, ::Type, ::Int64, ::Base.Iterators.Pairs{Symbol,Nothing,Tuple{Symbol},NamedTuple{(:iofunction,),Tuple{Nothing}}}, ::Function, ::Type{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}, ::HTTP.URIs.URI, ::HTTP.Messages.Request, ::Array{UInt8,1}) at /.julia/packages/HTTP/YjRCz/src/ConnectionRequest.jl:52
 [10] #request at ./none:0 [inlined]
 [11] #request#1 at /.julia/packages/HTTP/YjRCz/src/ExceptionRequest.jl:19 [inlined]
 [12] (::getfield(HTTP, Symbol("#kw##request")))(::NamedTuple{(:iofunction,),Tuple{Nothing}}, ::typeof(HTTP.request), ::Type{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}}, ::HTTP.URIs.URI, ::HTTP.Messages.Request, ::Array{UInt8,1}) at ./none:0
 [13] (::getfield(Base, Symbol("###44#45#46")){ExponentialBackOff,getfield(HTTP.RetryRequest, Symbol("##2#3")){Bool,HTTP.Messages.Request},typeof(HTTP.request)})(::Base.Iterators.Pairs{Symbol,Nothing,Tuple{Symbol},NamedTuple{(:iofunction,),Tuple{Nothing}}}, ::Function, ::Type, ::Vararg{Any,N} where N) at ./error.jl:216
 [14] ##44#47 at ./none:0 [inlined]
 [15] #request#1 at /.julia/packages/HTTP/YjRCz/src/RetryRequest.jl:44 [inlined]
 [16] #request at ./none:0 [inlined]
 [17] #request#1(::VersionNumber, ::String, ::Nothing, ::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Type{HTTP.MessageRequest.MessageLayer{HTTP.RetryRequest.RetryLayer{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}}}}, ::String, ::HTTP.URIs.URI, ::Array{Pair{SubString{String},SubString{String}},1}, ::Array{UInt8,1}) at /.julia/packages/HTTP/YjRCz/src/MessageRequest.jl:47
 [18] request at /.julia/packages/HTTP/YjRCz/src/MessageRequest.jl:28 [inlined]
 [19] #request#1(::Int64, ::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Type{HTTP.RedirectRequest.RedirectLayer{HTTP.MessageRequest.MessageLayer{HTTP.RetryRequest.RetryLayer{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}}}}}, ::String, ::HTTP.URIs.URI, ::Array{Pair{SubString{String},SubString{String}},1}, ::Array{UInt8,1}) at /.julia/packages/HTTP/YjRCz/src/RedirectRequest.jl:24
 [20] request(::Type{HTTP.RedirectRequest.RedirectLayer{HTTP.MessageRequest.MessageLayer{HTTP.RetryRequest.RetryLayer{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}}}}}, ::String, ::HTTP.URIs.URI, ::Array{Pair{SubString{String},SubString{String}},1}, ::Array{UInt8,1}) at /.julia/packages/HTTP/YjRCz/src/RedirectRequest.jl:21
 [21] #request#3(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::String, ::HTTP.URIs.URI, ::Array{Pair{SubString{String},SubString{String}},1}, ::Array{UInt8,1}) at /.julia/packages/HTTP/YjRCz/src/HTTP.jl:302
 [22] request at /.julia/packages/HTTP/YjRCz/src/HTTP.jl:302 [inlined]
 [23] #request#4 at /.julia/packages/HTTP/YjRCz/src/HTTP.jl:316 [inlined]
 [24] request at /.julia/packages/HTTP/YjRCz/src/HTTP.jl:312 [inlined] (repeats 2 times)
 [25] getpage_raw at /.julia/packages/DataDepsGenerators/CbcdV/src/utils.jl:34 [inlined]
 [26] #52 at ./operators.jl:832 [inlined]
 [27] mainpage_url at /.julia/packages/DataDepsGenerators/CbcdV/src/generate.jl:123 [inlined]
 [28] find_metadata(::ArcticDataCenter, ::String, ::Nothing) at /.julia/packages/DataDepsGenerators/CbcdV/src/generate.jl:72
 [29] #generate#7 at /.julia/packages/DataDepsGenerators/CbcdV/src/generate.jl:25 [inlined]
 [30] generate at /.julia/packages/DataDepsGenerators/CbcdV/src/generate.jl:25 [inlined] (repeats 2 times)
 [31] top-level scope at none:0

Figshare

julia> generate(Figshare(), "10.5281/zenodo.1194927")
ERROR: BoundsError: attempt to access 0-element Array{Any,1} at index [1]
Stacktrace:
 [1] getindex(::Array{Any,1}, ::Int64) at ./array.jl:731
 [2] mainpage_url(::Figshare, ::String) at /.julia/packages/DataDepsGenerators/CbcdV/src/APIs/Figshare.jl:48
 [3] find_metadata(::Figshare, ::String, ::Nothing) at /.julia/packages/DataDepsGenerators/CbcdV/src/generate.jl:72
 [4] #generate#7 at /.julia/packages/DataDepsGenerators/CbcdV/src/generate.jl:25 [inlined]
 [5] generate at /.julia/packages/DataDepsGenerators/CbcdV/src/generate.jl:25 [inlined] (repeats 2 times)
 [6] top-level scope at none:0

oxinabox commented 5 years ago

Thanks. I will chase these up, unless @SebastinSanty gets to it first.

oxinabox commented 5 years ago

No, GitHub, #69 only fixed one problem.

oxinabox commented 5 years ago

@SebastinSanty I think we should just remove DataOneV1(); we have about 4 other APIs that get the same information. Most of the information in DataOneV1() actually comes from dcterms etc. in the DataDryad HTML, rather than from the actual DataOneV1(). And DataDryad has recently completely changed how they use dcterms, breaking it utterly. It could be fixed, but I am not sure there is any point, since it is redundant for DataDryad, which is all that it works for.

I think later we should consider making something that robustly works with dcterms, and works across sites.

SebastinSanty commented 5 years ago

Yes, we can safely remove DataOneV1(), especially because, as far as we know, this DataOne version is only used by DataDryad. Also, DataDryad-based registration blocks are generated even without it, as you said.

oxinabox commented 5 years ago

2/3 fixed (well, 1 fixed, 1 deleted). All that is left is DataOneV2; re: https://github.com/JuliaWeb/HTTP.jl/issues/342

According to Twitter, DataOne wants to talk about that problem on their Slack, so I'll go talk to them.

datadavev commented 5 years ago

DataONE services use SSLVerifyClient optional so that clients can optionally authenticate using a client certificate, which is one reason TLS renegotiation occurs. RFC 5746 defines the TLS renegotiation extension that addresses the vulnerability described in CVE-2009-3555. RFC 5746 is implemented in OpenSSL versions 0.9.8m and 1.0.0a. Our systems are currently at OpenSSL 1.0.1f, so my understanding is that the vulnerability is resolved and hence the guidance to avoid TLS renegotiation should probably be revised.

Edit: The issue seems to be around TLS 1.3, which will not support renegotiation. It is not yet clear how Apache will support optional SSLVerifyClient with TLS 1.3.