Closed jtravs closed 2 years ago
I am having a look at the moment and I can confirm that something is strange here. Actually the issue seems to be during download.
That seems correct. If I set throw=true on line 259 of database.jl:
    response = Downloads.request(
        url;
        output=tmp_file,
        progress=verbose ? print_progress : nothing,
        throw=true
    )
I get:
julia> fetch!("test", [32], 0, 12000, :standard)
[ Info: No custom HITRAN database specified, opening 'HITRAN.sqlite' (default)
ERROR: HTTP/1.1 200 OK (transfer closed with 309863 bytes remaining to read) while requesting https://hitran.org/lbl/api?iso_ids_list=32&numin=0.00&numax=12000.00&fixwidth=0&sep=[comma]&request_params=global_iso_id,trans_id,molec_id,local_iso_id,nu,sw,a,elower,gamma_air,delta_air,gamma_self,n_air,n_self,gp,gpp
Stacktrace:
[1] (::Downloads.var"#9#18"{IOStream, Base.DevNull, Nothing, Vector{Pair{String, String}}, Float64, Nothing, Bool, Bool, String, Int64, Bool, Bool})(easy::Downloads.Curl.Easy)
@ Downloads C:\Users\jt52\AppData\Local\Programs\Julia-1.7.0\share\julia\stdlib\v1.7\Downloads\src\Downloads.jl:369
[2] with_handle(f::Downloads.var"#9#18"{IOStream, Base.DevNull, Nothing, Vector{Pair{String, String}}, Float64, Nothing, Bool, Bool, String, Int64, Bool, Bool}, handle::Downloads.Curl.Easy)
@ Downloads.Curl C:\Users\jt52\AppData\Local\Programs\Julia-1.7.0\share\julia\stdlib\v1.7\Downloads\src\Curl\Curl.jl:64
[3] #8
@ C:\Users\jt52\AppData\Local\Programs\Julia-1.7.0\share\julia\stdlib\v1.7\Downloads\src\Downloads.jl:311 [inlined]
[4] open(f::Downloads.var"#8#17"{Base.DevNull, Nothing, Vector{Pair{String, String}}, Float64, Nothing, Bool, Bool, String, Int64, Bool, Bool}, args::String; kwargs::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:write,), Tuple{Bool}}})
@ Base .\io.jl:330
...
I have narrowed it down. The problem seems to be that the HITRAN server announces more data than it actually sends (you can also see this in your error report). At least that is how it appears; it could also be a problem with Downloads.jl. I will have a more in-depth look at the problem. The ugly fix would be to ignore the length mismatch...
Could you try the current master branch? I had a look at the details:
Therefore I have used the easy hook construct for Downloads.jl to modify the Curl options to ignore the invalid Content Length. It works for me now for your example cases. Could you confirm this?
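For readers following along, here is a minimal sketch of what such a hook could look like. It assumes the `easy_hook` field of `Downloads.Downloader` and libcurl's `CURLOPT_IGNORE_CONTENT_LENGTH` option; the helper name `lenient_downloader` is hypothetical, and this is not necessarily the exact code on master.

```julia
using Downloads

# Hypothetical helper (not part of HITRAN.jl): build a Downloader whose
# easy hook tells curl to ignore the Content-Length header, so a server
# that announces more bytes than it sends does not abort the transfer.
function lenient_downloader()
    downloader = Downloads.Downloader()
    # The hook is called with the curl easy handle and a NamedTuple
    # describing the request, after Downloads.jl has applied its defaults.
    downloader.easy_hook = (easy, info) -> begin
        Downloads.Curl.setopt(
            easy, Downloads.Curl.CURLOPT_IGNORE_CONTENT_LENGTH, 1)
    end
    return downloader
end

# Usage (needs network access; `url` and `tmp_file` as in database.jl):
# Downloads.request(url; output=tmp_file, throw=true,
#                   downloader=lenient_downloader())
```

Setting the hook on a dedicated Downloader, rather than globally, keeps the workaround confined to HITRAN requests.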
You can use the force option to force a re-download even if the data has already been downloaded:
fetch!("test", [32], 0, 12000, :standard; force=true)
That appears to work better, in that it can download data.
However, for some requests, such as
fetch!("test3", [32], 0, 12000, :standard)
I get this error:
ERROR: HTTP/1.1 200 OK (Operation too slow. Less than 1 bytes/sec transferred the last 20 seconds) while requesting https://hitran.org/lbl/api?iso_ids_list=32&numin=0.00&numax=12000.00&fixwidth=0&sep=[comma]&request_params=global_iso_id,trans_id,molec_id,local_iso_id,nu,sw,a,elower,gamma_air,delta_air,gamma_self,n_air,n_self,gp,gpp
Stacktrace:
[1] (::Downloads.var"#9#18"{IOStream, Base.DevNull, Nothing, Vector{Pair{String, String}}, Int64, Downloads.var"#24#27"{typeof(HITRAN.print_progress)}, Bool, Nothing, Bool, String, Bool, Bool})(easy::Downloads.Curl.Easy)
@ Downloads C:\Users\jt52\AppData\Local\Programs\Julia-1.7.3\share\julia\stdlib\v1.7\Downloads\src\Downloads.jl:387
[2] with_handle(f::Downloads.var"#9#18"{IOStream, Base.DevNull, Nothing, Vector{Pair{String, String}}, Int64, Downloads.var"#24#27"{typeof(HITRAN.print_progress)}, Bool, Nothing, Bool, String, Bool, Bool}, handle::Downloads.Curl.Easy)
@ Downloads.Curl C:\Users\jt52\AppData\Local\Programs\Julia-1.7.3\share\julia\stdlib\v1.7\Downloads\src\Curl\Curl.jl:88
[3] #8
@ C:\Users\jt52\AppData\Local\Programs\Julia-1.7.3\share\julia\stdlib\v1.7\Downloads\src\Downloads.jl:328 [inlined]
[4] open(f::Downloads.var"#8#17"{Base.DevNull, Nothing, Vector{Pair{String, String}}, Int64, Downloads.var"#24#27"{typeof(HITRAN.print_progress)}, Bool, Nothing, Bool, String, Bool, Bool}, args::String; kwargs::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:write,), Tuple{Bool}}})
@ Base .\io.jl:330
[5] arg_write(f::Function, arg::String)
@ ArgTools C:\Users\jt52\AppData\Local\Programs\Julia-1.7.3\share\julia\stdlib\v1.7\ArgTools\src\ArgTools.jl:86
[6] #7
@ C:\Users\jt52\AppData\Local\Programs\Julia-1.7.3\share\julia\stdlib\v1.7\Downloads\src\Downloads.jl:327 [inlined]
[7] arg_read
@ C:\Users\jt52\AppData\Local\Programs\Julia-1.7.3\share\julia\stdlib\v1.7\ArgTools\src\ArgTools.jl:61 [inlined]
[8] request(url::String; input::Nothing, output::String, method::Nothing, headers::Vector{Pair{String, String}}, timeout::Int64, progress::typeof(HITRAN.print_progress), verbose::Bool, debug::Nothing, throw::Bool, downloader::Downloads.Downloader)
@ Downloads C:\Users\jt52\AppData\Local\Programs\Julia-1.7.3\share\julia\stdlib\v1.7\Downloads\src\Downloads.jl:326
[9] download_HITRAN(url::String, parameters::Vector{String}; verbose::Bool)
@ HITRAN C:\Users\jt52\.julia\dev\HITRAN\src\database.jl:264
[10] download_HITRAN
@ C:\Users\jt52\.julia\dev\HITRAN\src\database.jl:253 [inlined]
[11] fetch!(db::SQLite.DB, name::String, global_ids::Vector{Int64}, ν_min::Int64, ν_max::Int64, parameters::Vector{String}; force::Bool)
@ HITRAN C:\Users\jt52\.julia\dev\HITRAN\src\database.jl:74
[12] fetch!(db::SQLite.DB, name::String, global_ids::Vector{Int64}, ν_min::Int64, ν_max::Int64, parameters::Symbol; force::Bool)
@ HITRAN C:\Users\jt52\.julia\dev\HITRAN\src\database.jl:124
[13] fetch!(name::String, global_ids::Vector{Int64}, ν_min::Int64, ν_max::Int64, parameters::Symbol; force::Bool)
@ HITRAN C:\Users\jt52\.julia\dev\HITRAN\src\database.jl:133
[14] fetch!(name::String, global_ids::Vector{Int64}, ν_min::Int64, ν_max::Int64, parameters::Symbol)
@ HITRAN C:\Users\jt52\.julia\dev\HITRAN\src\database.jl:133
[15] top-level scope
@ REPL[8]:1
However, using Firefox I can download the file without difficulty. Maybe there is some kind of timeout option we need to change?
Adding
Downloads.Curl.setopt(easy, Downloads.Curl.CURLOPT_LOW_SPEED_TIME, 100)
on line 262 of database.jl fixed this, and it downloads just fine now after a small pause. I guess the HITRAN server takes longer when there are more lines to download.
I also compared the size of the downloaded file with that of the file downloaded directly using the URL, and they agreed.
Note that I also needed to increase the timeout to 100 s for some requests, but then everything works well.
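Both changes can be combined in one place. The following is only a sketch under assumptions: `patient_downloader` is a hypothetical name, the 100 s values are simply the ones that worked in this thread, and it relies on Downloads.jl running the `easy_hook` after it has set its own default low-speed limit (the 20 seconds seen in the error message).

```julia
using Downloads

# Hypothetical helper (not part of HITRAN.jl): a Downloader that lets a
# transfer stall for `low_speed_time` seconds before curl gives up with
# "Operation too slow".
function patient_downloader(; low_speed_time::Integer = 100)
    downloader = Downloads.Downloader()
    downloader.easy_hook = (easy, info) -> begin
        # Override the stall window that Downloads.jl configures by default.
        Downloads.Curl.setopt(
            easy, Downloads.Curl.CURLOPT_LOW_SPEED_TIME, low_speed_time)
    end
    return downloader
end

# Usage (needs network access; `url` and `tmp_file` as in database.jl).
# `timeout` is the overall request timeout in seconds:
# Downloads.request(url; output=tmp_file, timeout=100, throw=true,
#                   downloader=patient_downloader())
```

The two settings cover different failure modes: the low-speed time covers the pause while the server prepares a large result, and `timeout` bounds the whole request.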
Thanks! I will have a look and integrate that. I hope HITRAN comes out with their APIv2 soon... However, API keys will be mandatory for that version...
I will mark this as fixed for now and tag a new release. Let me know if there are other issues...
If I run:
(or other variations, such as using iso_id("CH4")) no data appears to be stored in the database table "test". I checked this both by trying to calculate the absorption coefficient and by using an SQLite browser. If I run the same with O2, then data is saved to the database.
What is odd is that if I directly download the data using the URL returned by
then data is returned.
To try and understand what is happening, I also checked the temporary data files you use to store the data from the URL before loading it into the database, and these remain empty for the CH4 parameters. So the data is being lost somewhere?