Closed pavlis closed 1 year ago
A not so small postscript to the previous. In trying to get graphics to work I came to realize we had some additional unresolved issues with graphics.py. There is a sizing problem that happens somehow with the imageplot version. Whoever gets assigned the task of fixing the above might want to look for other issues in graphcs.py. There are two I know of:
def _wtva_SeismogramEnsemble(self, d, fill):
# implement by call to ExtractComponent and calling TimeSeriesEnsemble method 3 times
# should return a list of 3 gcf handles
figure_handles = []
for k in range(3):
dcomp = ExtractComponent(d, k)
# figure_title='Component %d' % k
# pyplot.figure(figure_title)
pyplot.figure(k)
handle = self._wtva(dcomp)
figure_handles.append(handle)
# pyplot.show()
return figure_handles
When that private method is called internally, however, it is handled this way:
if isinstance(d, SeismogramEnsemble):
if len(d.member) <= 0:
raise IndexError(base_error + "ensemble container is empty")
self._wtva_SeismogramEnsemble(d, fill)
self._add_3C_titles()
which means the handles are just dropped.
The fundamental issue is when I wrote this code I had planned ahead by trying to have the code return the plot handles. That could and should be a simple way to write an extension class of SeismicPlotter
that had enhanced features. I obviously never tested that I did all that right. The basic message is we need to have both the classes in this file updated to consistently handle returns of pyplot handles from any plot.
Let me add to the comment above about changing the name Ensemble
in the C++ code base everywhere to CoreEnsemble
. There is a bit more to this than just one name change.
LoggingEnsemble
to Ensemble
. Then an Ensemble
relative to CoreEnsemble
is like TimeSeries
is to CoreTimeSeries
.LoggingEnsemble
. Implementation of the live/dead concept should be in the base class for consistency with TimeSeries and Seismogram. There is a residual, much thornier issue that would benefit from discussion before we proceed: ProcessingHistory
. The TimeSeries
and Seismogram
objects inherit a ProcessingHistory
object (the assumption is a TimeSeries, for example, "is a" ProcessingHistory" instead "has a" ProcessingHistory. That is particularly problematic because most people would say "has a" is a better answer than "is a". I take responsibility for that design decision because back about 2 years ago I had some issue I couldn't resolve doing that and decided to define it as "is a". In retrospect that was a bad idea, but in my view it would be way way to hard to back that out now as it would break a lot of things.
What this raises is if we should add a ProcessingHistory
to the data formerly known as LoggingEnsemble
for consistency with the atomic data? If the answer is yes we'd have to address the "is a" or "has a" question again, but I assert we should not include PrecessingHistory
in the revised Ensemble==LoggingEnsemble
. There is a fundamental reason: An ensemble will break the constraints in the design of ProcessingHistory. The reason is deeply in the weeds of the implementation. A simple summary is that the ProcessingHistory container assumes a tree structure that is limited to map-reduce. That is, the nodes of the tree are defined by a processing step and the tree is inverted meaning it begins with the leaves and terminates at the root. If an ensemble appeared in the middle of a chain the tree could grow both ways and I am pretty sure the implementation will not handle that situation. Hence, I think we should continue to take the position that only Seismogram and TimeSeries are atomic and ensembles are a different thing. Note, this issue will carry over to our under construction ArrayEnsemble (or whatever we call it too. I personally don't think the interest or need for processing history as a concept is worth the investment it would take to retrofit the entire code base to handle ensembles. @wangyinz please rule on this question for the record.
If you accept my plan above I suggest we address this issue in two steps:
TimeSeriesEnsemble
and SeismogramEnsemble
100% backward compatible with existing python code. This secondary copy is to convert the return type from mspasspy.ccore.seismic.CoreTimeSeriesEnsemble
to TimeSeriesEnsemble
. If there is no copy, _wtva()
function in graphics.py (line 826) will raise RuntimeError
because the type is not correct. _wtva()
function does not support mspasspy.ccore.seismic.CoreTimeSeriesEnsemble
type, it only supports TimeSeriesEnsemble
type.
d = TimeSeriesEnsemble(bsc.EnsembleComponent(data, component))
I'm afraid issue #332 was closed prematurely. Well, the main issue was resolved but the original reason I reported the issue was not resolved. Our graphics tutorial is broken for the section discussing SeismogramEnsemble plots.
The problem was we didn't close on an inconsistency between what I had locally to resolve the issue and what was actually implemented. In our C++ code base is the a function called
EnsembleComponent
. It behaves exactly the same as a related function calledExtractComponent
in ccore. Both extract a single component of data from 3C data. The difference is one is for ensembles and the other atomic Seismogram objects. #332 resolved this problem by creating a python wrapper calledExtractComponent
that handles type mistmatches the pythonic way. It uses a set of isinstance calls to deal with strong typing inputs from ccore.The first order problem that happened is we didn't fix graphics.py to match this change. The current graphics.py module has the following import lines:
Those lines resolve but the first line should be this:
That, however, leaves unresolved references to EnsembleComponent. There are six in the graphics.py and they can be fixed in 30 s with a global change all EnsembleComponent to ExtractComponent.
We need apply that patch asap as the graphics tutorial is broken until we fix it. The way the fix can be tested is simple: just run the graphics tutorial and see if it works.
Had I known how to do so I'd have just edited that file on github to make that fix, but think preserving this for the record is important anyway. The reason is that I think there is some additional cleanup we should do with this.
A tougher decision is what to do about the function now called EnsembleComponent in ccore? Right now the wrapper code for ExtractEnsemble to handle this is a little weird and hence confusing:
The confusing of using the bsc qualifier for both Seismogram and SeismogramEnsemble sections could be eliminated if we first fix the pybind 11 wrappers as noted above. The weirder construct is:
That line, at least, screams for a comment to explain why the secondary copy is needed. That should probably be done, by the way, for the initial patch for this problem. The BROADER PROBLEM this illustrates, however, is something discussed previously in #332. We need to clean up the handling of Ensembles. The fundamental issue is during the evolution of ensembles we extended the original ccore
Ensemble
templates to define a a set ofLoggingEnsemble
templates. The later addedelog
. For consistency with our atomic data objects we really should go through the C++ code base and change a occurences ofEnsemble
toCoreEnsemble
. That change could be isolated in the C++ code by having the pybind11 code handleCoreEnsemble
in the same way we now handleCoreTimeSeries
andCoreSeismogram
. That is, those symbols are treated as private with appropriate use of "_" to clarify the namespace (i.e. in python the a C++ CoreTimeSeries would only be visible as _CoreTimeSeries and a Seismogram would only be visible as _CoreSeismogram).The second part of this proposed fix is not critical, BUT it is better to do it now than later. The longer we delay the more confusion it will create. It will definitely create a long-term maintenance issue if we don't deal with it.