Closed sboltz closed 4 years ago
Merging #199 into master will increase coverage by
0.00%
. The diff coverage is100.00%
.
@@ Coverage Diff @@
## master #199 +/- ##
=======================================
Coverage 97.36% 97.36%
=======================================
Files 37 37
Lines 4175 4179 +4
Branches 664 665 +1
=======================================
+ Hits 4065 4069 +4
Misses 52 52
Partials 58 58
Flag | Coverage Δ | |
---|---|---|
#unittests | 97.36% <100.00%> (+<0.01%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
Impacted Files | Coverage Δ | |
---|---|---|
obsplus/constants.py | 100.00% <ø> (ø) |
|
obsplus/utils/stations.py | 99.47% <ø> (-0.01%) |
:arrow_down: |
obsplus/events/validate.py | 94.80% <100.00%> (ø) |
|
obsplus/structures/dfextractor.py | 87.35% <100.00%> (+0.60%) |
:arrow_up: |
obsplus/utils/pd.py | 97.59% <100.00%> (+0.02%) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 2e3d1f2...357d638. Read the comment docs.
This looks well thought out. I especially like how you are using a custom dtype for NSLC codes to replace a multitude of ad-hacks.
It looks like it still needs a changelog entry. Apart from that, is there anything left to do?
I do have one question about station codes. These are typically 3 characters, probably for historical reasons. I think the seed spec just requires they are between 1 and 5 chars though. It might be nice if station 1 became station "001" rather than "01" but I suppose it isn't that important. What do you think?
I do have one question about station codes. These are typically 3 characters, probably for historical reasons. I think the seed spec just requires they are between 1 and 5 chars though. It might be nice if station 1 became station "001" rather than "01" but I suppose it isn't that important. What do you think?
That is an interesting point. For simplicity, I would say that if the user wants to name their station "001", they should use the string convention that we have outlined. But that is largely out of laziness and because it's difficult to figure out what the appropriate thing to do there really is. For instance, for us we would almost certainly want station 1 to become station "1". The reason for having it be "01" is because all of the NSLC components are now using the same data type, and in particular I wanted to make sure it handles location codes properly.
Looks good
This PR standardizes and cleans up the way that NSLC information is handled, particularly if it contains numeric values (particularly location codes). This is necessary because pandas can do strange things to dtypes that result in unexpected behavior.
It should have the following behavior:
1.0
becomes"01"
)1
becomes"01"
)"1"
becomes"1"
and"01"
becomes"01"
)"1.0"
will not be allowed)This behavior should coerce things to follow the seed convention, but it will also allow for special cases where the NSLC are purely numeric.