mspass-team / mspass

Massive Parallel Analysis System for Seismologists
https://mspass.org
BSD 3-Clause "New" or "Revised" License
28 stars 11 forks source link

Fix several bugs in OriginTimeMatcher #533

Closed pavlis closed 2 months ago

pavlis commented 2 months ago

Fixes a couple bad bugs in OriginTimeMatcher. If this passes tests as it should it should be merged immediately.

pavlis commented 2 months ago

Obviously my assumption these fixes would pass the tests was incorrect. Obviously not ready to merge

codecov[bot] commented 2 months ago

Codecov Report

Attention: Patch coverage is 0% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 54.58%. Comparing base (33e74d8) to head (38a8eac).

:exclamation: Current head 38a8eac differs from pull request most recent head 830ea8b. Consider uploading reports for the commit 830ea8b to get more accurate results

Files Patch % Lines
python/mspasspy/db/normalize.py 0.00% 2 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #533 +/- ## ======================================= Coverage 54.57% 54.58% ======================================= Files 146 146 Lines 22362 22362 Branches 1336 1336 ======================================= + Hits 12204 12206 +2 + Misses 9818 9816 -2 Partials 340 340 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

pavlis commented 2 months ago

The latest push of this is still running, but the fixes here have revealed a couple minor issues that may come back to haunt us:

  1. There is a bit of an inconsistency in this api using a DataFrame input versus a MongoDB instance in the "_id" value that is special to MongoDB. I had to modify our tests to match a change I had to make to OriginTimeMatcher to make it work correctly. I am confident the change I made was the right one because it wouldn't work correctly with an id matching algorithm otherwise. Loading _id is the correct default because 99% of uses should load the source data from MongoDB. The only exception at this time is if one wanted to load a Datascope catalog. I modified the docstring to give that warning, but that is really deeply buried in the documentation.
  2. We probably should implement a "find_doc" method for the OriginTimeMatcher that this branch fixes. If we did that it could be used with the bulk_normalize function, but if you use it now that function raises a MsPASS exception and aborts saying it needs a "find_doc" method. The difference between find_one and find_doc are somewhat trivial. I should have probably implemented it but didn't think it was a particularly high priority so punted it down the road. If the group thinks that should be addressed now I can do that but it should probably done in separate branch as this branch addresses a critical bug fix - OriginTimeMatcher is completely broken until this is merged.