Our merge function found in mspasspy/algorithms/window has a somewhat unnecessary restriction we can and probably should fix. The only reason I don't just do it is is that there are multiple ways to fix the same problem.
The problem is that merge currently insists that it get a DoubleVector of TimeSeries objects AND the vector MUST be in starttime order. Currently it returns a dead datum if that condition isn't satisfied. That seems a bit unnecessarily pedantic for a problem easily fixed with a couple of lines of code to fix the input error. I think I wrote this code originally and I can't understand why I put that restriction in there in the first place.
The problem is easily addressed. There are at least 3 ways I can think of to solve it.
My suggestion is to handle it in the C++ function that does the main work of this function behind the python wrapper merge. The C++ algorithm::sort function will do that and do it about as fast as possible with a custom sort function using the TimeSeries attribute t0.
I am pretty certain you could do the same thing with a python sort function of which I am sure there are many options. Unclear what, if any, the performance hit would be.
We could just ignore it and put a description in the docstring including a code example I'll show below on how to handle that with database input.
For the record, 3 is easy to do with the new Database implementation. Here is an example:
I think 3 is a terrible solution for a problem this easy to fix. I put it in the record more as a useful trick for anyone doing this to avoid the overhead of having the code do the sorting when it detects this error. In general, if one were reading continuous data to run merge db queries limiting the time range (not show above for simplicity) and limiting the result to a common channel as above should be recommended. Note also, by the way, with the upcoming v2 of Database that would be very parallelizable with the new read_distributed_data being driven by a list of db queries dict's
Our
merge
function found in mspasspy/algorithms/window has a somewhat unnecessary restriction we can and probably should fix. The only reason I don't just do it is is that there are multiple ways to fix the same problem.The problem is that
merge
currently insists that it get aDoubleVector
ofTimeSeries
objects AND the vector MUST be instarttime
order. Currently it returns a dead datum if that condition isn't satisfied. That seems a bit unnecessarily pedantic for a problem easily fixed with a couple of lines of code to fix the input error. I think I wrote this code originally and I can't understand why I put that restriction in there in the first place.The problem is easily addressed. There are at least 3 ways I can think of to solve it.
merge
. The C++ algorithm::sort function will do that and do it about as fast as possible with a custom sort function using theTimeSeries
attribute t0.For the record, 3 is easy to do with the new
Database
implementation. Here is an example:I think 3 is a terrible solution for a problem this easy to fix. I put it in the record more as a useful trick for anyone doing this to avoid the overhead of having the code do the sorting when it detects this error. In general, if one were reading continuous data to run merge db queries limiting the time range (not show above for simplicity) and limiting the result to a common channel as above should be recommended. Note also, by the way, with the upcoming v2 of Database that would be very parallelizable with the new
read_distributed_data
being driven by a list of db queries dict's