jpvantassel / hvsrpy

A Python package for Horizontal-to-Vertical (H/V, HVSR) Spectral Ratio Processing.
https://pypi.org/project/hvsrpy/

Supporting alternate data formats (i.e., not only miniSEED) #8

Closed jpvantassel closed 2 months ago

jpvantassel commented 4 years ago

Problem Summary

Researchers use a variety of equipment to measure ambient noise for HVSR, and this variety of equipment unfortunately results in a variety of data formats. Ideally, hvsrpy will conveniently handle the common data formats rather than only miniSEED.

Proposed Solution

As most researchers are able to convert their data to ASCII/UTF-8 text, it makes sense, as a first step, to extend hvsrpy to read such files. However, because the layout of a text file can vary, it is difficult to write a single script that extracts the data correctly in every case. Therefore, keep in mind that the examples provided below are only examples of a potential solution and can (and likely must) be modified appropriately.

Examples

For MiniShark:

import hvsrpy
import pandas as pd
import sigpropy

# Load metadata (fname is the path to the MiniShark text file)
with open(fname, "r") as f:
    lines = f.readlines()
for line in lines:
    if line.startswith("#Sample rate (sps):"):
        _, sample_rate = line.split(":\t")
sample_rate = float(sample_rate)
dt = 1 / sample_rate

# Load data (header lines begin with "#", columns are tab-separated)
keys = ["vt", "ew", "ns"]
df = pd.read_csv(fname, comment="#", sep="\t", names=keys)
components = {key: sigpropy.TimeSeries(df[key], dt) for key in keys}

# Create a Sensor3c object to replace hvsrpy.Sensor3c.from_mseed()
sensor = hvsrpy.Sensor3c(**components, meta={"File Name": fname})
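To see how the sample-rate extraction above behaves, it can be exercised on an in-memory header. This is a minimal sketch: only the `#Sample rate (sps):` label is taken from the example above; the other header lines are illustrative.

```python
import io

# Illustrative MiniShark-style header; only the sample-rate label
# is taken from the example above.
header = io.StringIO(
    "#MiniShark example header\n"
    "#Sample rate (sps):\t128\n"
    "#Another field:\tvalue\n"
)

sample_rate = None
for line in header:
    if line.startswith("#Sample rate (sps):"):
        # Label and value are separated by ":\t" in this format
        _, value = line.split(":\t")
        sample_rate = float(value)

dt = 1 / sample_rate  # time step in seconds
print(sample_rate, dt)  # → 128.0 0.0078125
```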

For SESAME ASCII data format (SAF) v1:

import hvsrpy
import sigpropy

fname = "MT_20211122_133110.SAF"

with open(fname, "r") as f:
    lines = f.readlines()

# Read the sampling frequency from the header, then find where the data begin
for idx, line in enumerate(lines):
    if line.startswith("SAMP_FREQ = "):
        fs = float(line[len("SAMP_FREQ = "):])
    if line.startswith("####--------"):
        idx += 1
        break

# Parse the three whitespace-delimited data columns as floats
vt, ns, ew = [], [], []
for line in lines[idx:]:
    _vt, _ns, _ew = line.split()
    vt.append(float(_vt))
    ns.append(float(_ns))
    ew.append(float(_ew))

vt = sigpropy.TimeSeries(vt, dt=1/fs)
ns = sigpropy.TimeSeries(ns, dt=1/fs)
ew = sigpropy.TimeSeries(ew, dt=1/fs)

sensor = hvsrpy.Sensor3c(ns, ew, vt)

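For reuse, the SAF reading steps above can be collected into a small, self-contained function. This is a hypothetical helper using only the standard library; constructing the `sigpropy.TimeSeries` and `hvsrpy.Sensor3c` objects is left to the caller.

```python
def read_saf(lines):
    """Parse SAF header and data lines into (fs, vt, ns, ew).

    `lines` is a list of strings, e.g. from f.readlines().
    """
    fs = None
    start = 0
    for idx, line in enumerate(lines):
        if line.startswith("SAMP_FREQ = "):
            fs = float(line[len("SAMP_FREQ = "):])
        if line.startswith("####--------"):
            start = idx + 1
            break

    vt, ns, ew = [], [], []
    for line in lines[start:]:
        if not line.strip():
            continue  # skip any blank lines at the end of the file
        _vt, _ns, _ew = line.split()
        vt.append(float(_vt))
        ns.append(float(_ns))
        ew.append(float(_ew))
    return fs, vt, ns, ew

# Exercise the parser on a minimal in-memory SAF fragment (illustrative values)
sample = [
    "SESAME ASCII data format (saf) v. 1\n",
    "SAMP_FREQ = 100\n",
    "####-------------------------------\n",
    "1.0 2.0 3.0\n",
    "4.0 5.0 6.0\n",
]
fs, vt, ns, ew = read_saf(sample)
print(fs, vt, ns, ew)  # → 100.0 [1.0, 4.0] [2.0, 5.0] [3.0, 6.0]
```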
svishalguptacsir commented 4 years ago

Thank you very much, Joseph. hvsrpy is going to rock.

emirochica commented 2 years ago

Hi Joseph

I consider it important for hvsrpy to support the SESAME ASCII data format (SAF) v1. Thank you for your efforts.

jpvantassel commented 2 years ago

Hi @emirochica,

Can you post an example of the SESAME ASCII data format (saf) v1? I will then post an example parser script (similar to what I showed above for the MiniShark format) with the promise to provide native support for the format in the next release of hvsrpy.

ejchicaq-unal commented 2 years ago

Hi Joseph,

MT_20211122_133110.SAF.zip

I have attached a file in SAF format. Thanks for your support.

jpvantassel commented 2 years ago

Hi @emirochica,

See description above for a script to read SAF format and create a Sensor3c object. I will offer support for SAF in the next release of hvsrpy.

ejchicaq-unal commented 2 years ago

Hi @jpvantassel

Thank you very much for the time and effort you dedicate to this project; it is a valuable contribution to the work of others.

While converting from SAF to miniSEED in Geopsy worked, editing the component names is extra work. Being able to read SAF directly in the notebook is much more straightforward.

RuijieAmy commented 8 months ago

Hi @jpvantassel

I'm attempting to use this Python package to process HVSR data. However, I have three miniSEED files for the X, Y, and Z components instead of a single miniSEED file that includes them all. In the simple_hvsrpy_interface.py script, I can only input one file containing all X, Y, Z components. Do you have any suggestions for this scenario?

Thanks and regards, Amy

jpvantassel commented 8 months ago

Hi @RuijieAmy,

The from_mseed method has the option to provide fnames_1c rather than fname. See the details from the documentation below; this should do what you are looking for.

fnames_1c : dict, optional
    Some data acquisition systems supply three separate miniSEED
    files rather than a single combined file. To use those types
    of files, simply specify the three files in a `dict` of
    the form `{'e':'east.mseed', 'n':'north.mseed',
    'z':'vertical.mseed'}`, default is `None`.
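Following that docstring, a three-file call looks like the sketch below. The file names are placeholders, and the commented-out call requires the actual `.mseed` files to exist on disk.

```python
# Keys 'e', 'n', 'z' follow the fnames_1c docstring above;
# the file names themselves are placeholders.
fnames_1c = {"e": "east.mseed", "n": "north.mseed", "z": "vertical.mseed"}

# With real files on disk, this builds the sensor directly:
# import hvsrpy
# sensor = hvsrpy.Sensor3c.from_mseed(fnames_1c=fnames_1c)
```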
luc4f commented 3 months ago

Hi @jpvantassel

I'm currently attempting the SAF to miniSEED conversion in order to avoid using Geopsy. To do so, I tried the example parser script you posted above. Unfortunately, I keep getting an error I can't understand. I also made an attempt with the SAF file provided by @ejchicaq-unal above and got almost the same issue. I have attached below the SAF file I used and a screenshot of the error message.

Thank you very much, hvsrpy is an awesome resource. Luca

ERR1 Colico_p1.zip

jpvantassel commented 3 months ago

Hi @luc4f,

The code snippet above reads a SAF file into a Sensor3c object. You can then use that object for performing HVSR computations by modifying the example notebooks. To be clear, the script does not perform SAF to miniSEED conversion, but it will allow you to process your SAF data with hvsrpy. I looked at your file and it loads correctly using the script above.

In addition, the new version (v2.0.0) of hvsrpy will be released in the next month. It is a major overhaul of many aspects, including native support for different file types (seven in total, including SAF).

All the best, Joe

luc4f commented 3 months ago

Oh, now it works! Still a newbie in this game :)

Thanks and regards @jpvantassel, Luca

jpvantassel commented 2 months ago

All,

With the public commit #d27f92ab, hvsrpy now supports all major file types for microtremor and earthquake recordings: miniSEED (1- and 3-file versions), SAC, MiniShark, SAF, PEER, and GCF. A table summarizing these formats in detail is provided below. These advancements are currently available only by installing via GitHub, but will be available to the community via pip by the end of the week (July 12th, 2024).

[Table image: summary of supported file formats]