niosh-mining / obsplus

A Pandas-Centric ObsPy Expansion Pack
GNU Lesser General Public License v3.0
38 stars 8 forks source link

Event schema #218

Closed d-chambers closed 3 years ago

d-chambers commented 3 years ago

This PR adds an event schema using pydantic then uses pydantic to to perform the conversions of obspy.Catalog objects to and from json/dicts.

This is primarily an internal change and shouldn't really effect ObsPlus' external API.

d-chambers commented 3 years ago

I should mention I had to slightly modify test catalog 4 (tests/test_data/test_catalogs/cat4.xml) because ObsPy's quakeml parser expected ints for CompositeTime's year, month, day, hour, and minute uncertainties. However, this is inconsistent with QuantityErrorspecifying its types as floats. Since this is such an obscure corner case (I have never seen a CompositeTime in the wild) I just removed the uncertainties from the test catalog.

codecov-io commented 3 years ago

Codecov Report

Merging #218 (5cc9800) into master (6a516e4) will increase coverage by 0.36%. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #218      +/-   ##
==========================================
+ Coverage   97.45%   97.81%   +0.36%     
==========================================
  Files          37       38       +1     
  Lines        4320     4577     +257     
  Branches      677      659      -18     
==========================================
+ Hits         4210     4477     +267     
+ Misses         52       45       -7     
+ Partials       58       55       -3     
Flag Coverage Δ
unittests 97.81% <100.00%> (+0.36%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
obsplus/events/json.py 100.00% <100.00%> (+13.69%) :arrow_up:
obsplus/events/schema.py 100.00% <100.00%> (ø)
obsplus/utils/bank.py 95.92% <100.00%> (+0.07%) :arrow_up:
obsplus/utils/events.py 98.11% <100.00%> (-0.34%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 6a516e4...5cc9800. Read the comment docs.

d-chambers commented 3 years ago

I ended up finding a bug where WaveBank couldn't be pickled because the index cached used itertools.cycle. I replaced this with a simpler incrementing index attribute.

d-chambers commented 3 years ago

Anyone have a sec to review this? I am not too worried about it since the external APIs haven't changed but another pair of eyes couldn't hurt.

d-chambers commented 3 years ago

Thanks @calum-chamberlain, I think I addressed all your concerns, thanks for taking a look at it.

I also added a changelog entry and restructured cat_to_dict to use the pydantic model's dict method rather than first converting to json which resulted in a minor speedup.

d-chambers commented 3 years ago

I also should document here that I timed the json test suite before and after this change. Before using pydantic the test suite took on average 14.5 seconds. After using pydantic the test suite takes about 15 seconds. Considering the huge improvement to the code readability and elimination of recursive functions I think the minor slow-down is worth it.