jpjones76 / SeisIO.jl

Julia language support for geophysical time series data
http://seisio.readthedocs.org

ArgumentError on reading ASDF file #84

Closed niyiyu closed 2 years ago

niyiyu commented 3 years ago

Hi!

I was using SeisIO to read ASDF files generated by SPECFEM3D (devel). My working environment looks like this:

 pkg> status
 Status `./Project.toml`
 [fbe9abb3] AWS v1.36.0
 [1c724243] AWSS3 v0.8.3
 [621f4979] AbstractFFTs v1.0.1
 [336ed68f] CSV v0.8.5
 [a93c6f00] DataFrames v0.21.8
 [c27321d9] Glob v1.3.0
 [f67ccb44] HDF5 v0.13.7
 [033835bb] JLD2 v0.2.4
 [c8c83da1] Parallelism v0.1.2
 [91a5bcdd] Plots v1.16.5
 [438e738f] PyCall v1.92.3
 [d330b81b] PyPlot v2.9.0
 [b054d04a] SCEDC v0.1.0 `https://github.com/tclements/SCEDC.jl#master`
 [b372bb87] SeisIO v1.2.1
 [8cc7c3c0] SeisNoise v0.5.0
 [09ab397b] StructArrays v0.5.1
 [ade2ca70] Dates
 [8ba89e20] Distributed
 [10745b16] Statistics

The ASDF file synthetic.h5 (you can get it here, ~1.4M) contains the seismograms of a single shot at 15 receivers, each 1.5 s long and sampled at 5000 Hz (from simulation). Inspecting the file with the HDF5 command-line tools (h5dump, h5ls, etc.) works fine, but reading it with SeisIO gives this parsing error:

 julia> read_hdf5("synthetic.h5", "2000-01-01T00:59:45", "2000-01-01T00:59:46")
 ERROR: ArgumentError: cannot parse "***********" as Float64
 Stacktrace:
 [1] _parse_failure(::Type{T} where T, ::String, ::Int64, ::Int64) at ./parse.jl:370 (repeats 2 times)
 [2] #tryparse_internal#364 at ./parse.jl:366 [inlined]
 [3] tryparse_internal at ./parse.jl:364 [inlined]
 [4] #parse#365 at ./parse.jl:376 [inlined]
 [5] parse at ./parse.jl:376 [inlined]
 [6] FDSN_sta_xml(::String, ::Bool, ::String, ::String, ::Int64) at /Users/niyiyu/.julia/packages/SeisIO/JgSIN/src/Formats/stationXML.jl:168
 [7] read_station_xml!(::SeisData, ::String, ::String, ::String, ::Bool, ::Int64) at /Users/niyiyu/.julia/packages/SeisIO/JgSIN/src/Formats/stationXML.jl:458
 [8] read_asdf!(::SeisData, ::String, ::String, ::String, ::String, ::Bool, ::Int64) at /Users/niyiyu/.julia/packages/SeisIO/JgSIN/src/Submodules/SeisHDF/read_asdf.jl:87
 [9] read_asdf at /Users/niyiyu/.julia/packages/SeisIO/JgSIN/src/Submodules/SeisHDF/read_asdf.jl:160 [inlined]
 [10] read_hdf5!(::SeisData, ::String, ::String, ::String; fmt::String, id::String, msr::Bool, v::Int64) at /Users/niyiyu/.julia/packages/SeisIO/JgSIN/src/Submodules/SeisHDF/read_hdf5.jl:34
 [11] read_hdf5(::String, ::String, ::String; fmt::String, id::String, msr::Bool, v::Int64) at /Users/niyiyu/.julia/packages/SeisIO/JgSIN/src/Submodules/SeisHDF/read_hdf5.jl:70
 [12] read_hdf5(::String, ::String, ::String) at /Users/niyiyu/.julia/packages/SeisIO/JgSIN/src/Submodules/SeisHDF/read_hdf5.jl:69
 [13] top-level scope at REPL[9]:1

jpjones76 commented 2 years ago

The file reads correctly with the HDF5 tools because HDF5 doesn't try to parse the XML part of the ASDF header. The error being thrown comes from the XML parsing, not from HDF5. The XML is part of the ASDF format.

To the best of my knowledge, the presence of non-numeric data in ASDF XML fields that expect numeric values (e.g. latitude, which throws the error) makes those synthetic files invalid ASDF. I suspect that this file will fail to read using ASDF packages in other languages, too.

If this is not the case, e.g., if the file reads correctly with ASDF for Python, please reopen this issue. However, that will mean that the correct handling of non-numeric data in numeric XML fields is tribal knowledge, as I can find no documentation for it. As such it may be some time before I find the necessary information for a workaround.
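
For anyone who wants to check a file's station XML before reading it, here is a minimal sketch of the failure mode described above. It is not SeisIO's code: the XML fragment, the function name `find_nonnumeric_fields`, and the list of tags are all hypothetical, and the XML text is assumed to have already been extracted from the ASDF file as a string. It only uses Base Julia, where `tryparse(Float64, s)` returns `nothing` for input that `parse(Float64, s)` would reject with an `ArgumentError`.

```julia
# Hypothetical StationXML fragment (not taken from synthetic.h5); the run of
# asterisks mimics the value quoted in the error message above.
const SAMPLE_XML = """
<Station code="S0001">
  <Latitude>***********</Latitude>
  <Longitude>-117.25</Longitude>
  <Elevation>0.0</Elevation>
</Station>
"""

# Return (tag, raw text) pairs for every listed field whose text is not a
# valid Float64. The tag list is an assumption chosen for illustration.
function find_nonnumeric_fields(xml::AbstractString;
                                tags = ("Latitude", "Longitude", "Elevation", "Depth"))
    bad = Tuple{String,String}[]
    for tag in tags
        # Match <Tag ...>text</Tag> and capture the text between the tags.
        pattern = Regex("<$(tag)[^>]*>([^<]*)</$(tag)>")
        for m in eachmatch(pattern, xml)
            raw = String(strip(m.captures[1]))
            # tryparse returns `nothing` instead of throwing, unlike parse.
            if tryparse(Float64, raw) === nothing
                push!(bad, (tag, raw))
            end
        end
    end
    return bad
end

bad_fields = find_nonnumeric_fields(SAMPLE_XML)
for (tag, raw) in bad_fields
    println("non-numeric <", tag, ">: ", repr(raw))
end
```

A regular expression is used here only to keep the sketch dependency-free; a real check would more safely walk the document with an XML parser. Any field reported this way would have to be corrected in the file, or at its source, before a reader that expects numeric values can accept it.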