jpjones76 / SeisIO.jl

Julia language support for geophysical time series data
http://seisio.readthedocs.org

Merging SeisIO.jl and Seis.jl #5

Closed mdenolle closed 5 years ago

mdenolle commented 5 years ago

Hello JP Jones. We are a team trying to build similar tools for ambient noise correlation work. With Andy Nowacki (https://github.com/anowacki) we are trying to build a module that is similar to ObsPy but for Julia. Both his Seis and SeisRequests modules and your SeisIO module have functionality that we are interested in, namely:

Is there any chance we could chat with you about importing some of the functionality from your module into Seis.jl (https://github.com/anowacki/Seis.jl)? We would be happy to make one module that accommodates all needs. We are also developing HDF5/ASDF formats.

Cheers, Marine

jpjones76 commented 5 years ago

Hi Marine,

I like the idea, but I need more information about the scope and goals of Seis.jl, e.g.,

As a brief background, when I created SeisIO in May 2016, there was no seismic data architecture for Julia; there was SacIO.jl by Ben Postlethwaite for SAC, plus a SEG Y reader in the U. Alberta reflection seismology module. My goal was (and is) a streamlined data architecture with a simple file format: speed and intuitive UX, with the ability to read as many geophysical file formats as possible. So, before I could agree to anything, I'd need to know what parts of SeisIO will be used and how.

Thanks, Josh

mdenolle commented 5 years ago

Hi Josh,

I would like to pursue this conversation via email. Would you be able to find my address from my work affiliation? Cheers, Marine

jpjones76 commented 5 years ago

Email sent! Let's talk more there.

anowacki commented 5 years ago

Hi both,

Apologies for not noticing this earlier, and sorry if this came out of the blue. I had been slowly chipping away at Seis.jl before actively engaging with the wider seismic Julian world, and then Marine got in touch. I had not really intended to start using Seis.jl in anger and still feel much is up for grabs. It's an attempt to learn the lessons from SAC.jl and take the opportunity to change it for Julia v1.0. For my day-to-day processing, I'm in the process of moving from SAC.jl to Seis.jl, but still want to see whether the community can coalesce around something for the long term.

(SAC.jl has been going since 2015 and is in use at least at MIT and UNAM (Mexico), the latter for teaching.)

My end goal is to get to a community-supported Julia passive seismic package that does what we need well (see here). I'm not totally wedded to any one design, but with Marine's work we have an opportunity to try to get to something from real-world experience which my own research doesn't address (i.e., noise and very long time series).

To answer your questions:

I would be very happy to continue this conversation offline as well.

jpjones76 commented 5 years ago

Hi Andy,

I'm very sorry for stepping on your toes about SAC functionality. I'm willing to rewrite SeisIO to use SAC.jl as a dependency, but I'd like to do some benchmark tests first.

My goals for SeisIO are very basic:

The last item is why I have very few fields in each data type; most users don't need the fine-grained field definitions of, e.g., SEG Y, and for non-seismic data the field abbreviations quickly become ambiguous.

I'd be happy to use SeisIO as the basis of other packages, though, and can make modifications provided that the core data types remain backwards-compatible. I've thought about a major overhaul to optimize the "data" field (S.x) for distributed computing, but the data type needed is a turgid union of arrays or DistributedArrays.jl, which isn't yet stable.

Honestly, I didn't know that SAC.jl existed in 2016, or I would've coded with it as a dependency. When I started SeisIO, I searched for Julian seismology packages and found only two: SacIO by @bpostlethwaite and one from a U. Alberta group. SacIO was abandoned in 2013, but the creator gave me permission via email to take over. I think I briefly used his code, but switched because Julia improved string handling.

Your StationXML.jl looks excellent. I think my web requests do similar things to SeisRequests aside from that, with a working SeedLink client. In theory I could incorporate StationXML.jl as a dependency, but FDSN StationXML is station-oriented, rather than channel-oriented.

I have concerns about creating a single large ObsPy-type project with many contributors, but that's best discussed by email.

anowacki commented 5 years ago

Hi Josh,

No need to apologise—I think there's plenty of room for multiple Julia packages dealing with SAC data. I'm certain that SAC.jl isn't optimised for your uses, and I think SeisIO does its job very nicely without needing to depend on SAC.jl—it's such a basic format after all. (Interestingly in my speed profiling, SAC.jl—and so will SeisIO I guess—outperforms SAC on reading and writing SAC data!) In any case, I plan to deprecate SAC.jl once Seis.jl is up and running. So not using SAC.jl back in 2016 was probably the right decision in any case!

I don't think merging efforts is necessarily the way to go here. SeisIO could be a great base for IO within a larger package, I guess, and I see exactly your goals with it. I think there is a general appetite out there for a bells-and-whistles package/ecosystem (look at the success of ObsPy) so I'm aiming more in that direction. So the goals of the two packages are obviously different, and that's fine in my opinion.

One future option I've been thinking about is to have a SeisBase.jl package which just defines the structures, and on top of that is built the rest of the ecosystem. Then your suggestion of updating the fields within SeisIO might come into play. The initial design of Trace in Seis.jl is meant to be similar, but I let you chuck in anything you like in a .meta dictionary field. But we're straying far from SeisIO in a SeisIO issue, so I think it's best to continue this elsewhere, like you say.

I can't find your email address, but mine is on my GitHub user page.

jpjones76 commented 5 years ago

I'm going to close this issue since we all have each other's email addresses now. Thanks again and let's continue the discussion there!