computational-seismology / pypaw

PYthon Process Asdf Workflow, or pypaw for short
https://computational-seismology.github.io/pypaw/
GNU Lesser General Public License v3.0

suggestions for making data processing tools easier to use #18

Open rmodrak opened 6 years ago

rmodrak commented 6 years ago

1) pypaw appears to have more indirection than necessary; consider removing the bins/ directory by merging it into the examples/ directory; the top level of examples/ could then be a set of (MPI-invokable) scripts rather than a set of subdirectories

2) ASDF is needed for parallel I/O but not for parallel signal processing; the machinery for parallel signal processing could be made to work on any given obspy stream object; move this machinery from pyasdf to pytomo3d, or at the very least separate the parallel apply_function code in pyasdf from the parallel_io code; having talked about it with Lion, I believe this change is not just possible, but could be a major improvement

3) the fundamental differences between ASDFDataSets and obspy stream objects have to do with parallel I/O and metadata organization; obspy stream objects read from SAC, SU, or SEGY files should have much of the same metadata, just organized differently; it should be possible to provide a level of file format independence by maintaining two sets of data processing scripts, one that works on data read from ASDFDataSets and another that works on obspy stream objects, with substantial overlap between the two sets, e.g.

process_traces.py
process_traces_asdf.py
generate_adjoint_traces.py
generate_adjoint_traces_asdf.py

4) I’ve said it before, but I’m not sure there is a fundamental reason to maintain separate pytomo3d and pypaw packages (though I see how reversible design decisions or expediency could lead to this outcome)

5) within pytomo3d or pypaw, provide an option to accept arbitrary ‘weights’, but consider moving any calculate_geographic_weights functions, if present, outside the main package
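
To make points 2 and 3 a bit more concrete, here is a minimal sketch of what format-independent parallel processing could look like. It is only an illustration under stated assumptions, not code from pyasdf, pytomo3d, or pypaw: the names partition, apply_to_traces, and demean are hypothetical, and plain lists of floats stand in for traces so the example runs without obspy or MPI. In a real script, rank and size would presumably come from mpi4py (MPI.COMM_WORLD.Get_rank() and Get_size()), and the traces could come from an obspy stream or an ASDFDataSet alike.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def partition(items: Sequence[T], rank: int, size: int) -> List[T]:
    """Return the round-robin share of `items` owned by worker `rank` out of `size`."""
    if not 0 <= rank < size:
        raise ValueError("rank must satisfy 0 <= rank < size")
    return list(items[rank::size])


def apply_to_traces(
    traces: Sequence[T], func: Callable[[T], R], rank: int = 0, size: int = 1
) -> List[R]:
    """Apply `func` to this worker's share of `traces`.

    Nothing here reads or writes files, so the same routine could serve
    traces read from SAC, SU, SEGY, or ASDF (hypothetical usage).
    """
    return [func(trace) for trace in partition(traces, rank, size)]


def demean(samples: List[float]) -> List[float]:
    """Toy processing function: subtract the mean from one trace."""
    mean = sum(samples) / len(samples)
    return [x - mean for x in samples]


# Stand-in data: three "traces" as plain lists of samples.
traces = [[1.0, 2.0, 3.0], [10.0, 20.0], [5.0, 5.0, 5.0]]

# Simulate two workers serially; under MPI each rank would run only its own call.
results = [apply_to_traces(traces, demean, rank=r, size=2) for r in range(2)]
```

With this kind of split, process_traces.py and process_traces_asdf.py would differ only in how the traces are read and written, while the processing function and the work distribution are shared.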

wjlei1990 commented 6 years ago

Could you split this into several separate issues? That would make it easier to track in the future. Or could you label the points with numbers, so that it is clear which one we are responding to?

rmodrak commented 6 years ago

I've changed the bullets to numbers. Rather than immediate action points, these are general ideas to make the package user friendly, so maybe it's just as well to leave them in as a single "brainstorming" issue. If you did at some point decide to implement one of the numbered points, feel free to open a new issue to allow more focused discussion. Does this sound alright?