Closed lj-cug closed 11 months ago
Hey @lj-cug
It looks like you are using Python scripts to run PyGIMLI, so in this case I recommend checking the section on how to use the lib with Python here. Please click on the Detailed examples section to show the snippets.
Regarding your question: there is no ready-made way to do it, but I can suggest the following snippet to generate such a file:
```python
from pathlib import Path
from typing import Optional, Sequence, Tuple, Union

import numpy as np
import pandas as pd

from first_breaks.sgy.reader import SGY


def prepare_pygimli_file(sgy: SGY,
                         picks: Union[np.ndarray, Sequence[float]],
                         filename: Union[Path, str],
                         add_file_header: bool = True,
                         precision: int = 8,
                         s_x: Optional[Union[str, np.ndarray, Sequence[float]]] = "SOU_X",
                         s_y: Optional[Union[str, np.ndarray, Sequence[float]]] = "SOU_Y",
                         s_z: Optional[Union[str, np.ndarray, Sequence[float]]] = "SOU_ELEV",
                         g_x: Optional[Union[str, np.ndarray, Sequence[float]]] = "REC_X",
                         g_y: Optional[Union[str, np.ndarray, Sequence[float]]] = "REC_Y",
                         g_z: Optional[Union[str, np.ndarray, Sequence[float]]] = "REC_ELEV",
                         columns_order: Tuple[str, ...] = ("s_x", "s_y", "g_x", "g_y", "t")):
    # the number of traces in the file and the number of picks must be the same
    assert sgy.num_traces == len(picks)

    data_to_dump = {}
    for file_column, value in [
        ("s_x", s_x),
        ("s_y", s_y),
        ("s_z", s_z),
        ("g_x", g_x),
        ("g_y", g_y),
        ("g_z", g_z),
        ("t", picks),
    ]:
        if value is None:  # skip columns that are explicitly disabled
            continue
        elif isinstance(value, str):  # read the column from the trace headers
            data_to_dump[file_column] = sgy.traces_headers[value].tolist()
        elif isinstance(value, (list, tuple, np.ndarray)):  # or provide a custom iterable instead of header values
            assert len(value) == sgy.num_traces
            if isinstance(value, np.ndarray):
                value = value.tolist()
            data_to_dump[file_column] = value
        else:
            raise TypeError(f"Unsupported type {type(value)} for {file_column} column")

    # every requested output column must have been filled
    assert set(columns_order).issubset(data_to_dump.keys())

    Path(filename).parent.mkdir(parents=True, exist_ok=True)
    df = pd.DataFrame({k: data_to_dump[k] for k in columns_order})
    df.to_csv(filename, sep='\t', index=False, float_format=f'%.{precision}f', header=add_file_header)
```

Note: the original snippet checked `if s_x is None:` inside the loop; it should check `value` so that each column can be skipped independently.
Then you can follow one of these alternative scenarios:
1) Get picks with Python + save the PyGIMLI file

```python
from first_breaks.picking.picker_onnx import PickerONNX
from first_breaks.picking.task import Task
from first_breaks.sgy.reader import SGY

sgy_filename = 'data.sgy'
sgy_to_pick = SGY(sgy_filename)

task = Task(sgy_to_pick,
            traces_per_gather=24,
            maximum_time=200)
picker = PickerONNX(show_progressbar=True)
task = picker.process_task(task)
picks_in_mcs = task.picks_in_mcs

prepare_pygimli_file(sgy_to_pick, picks=picks_in_mcs, filename="pygimli_dump.txt", precision=10)
```
2) Get picks with Python + save a new SGY updated with first breaks + save the PyGIMLI file

```python
from first_breaks.picking.picker_onnx import PickerONNX
from first_breaks.picking.task import Task
from first_breaks.sgy.reader import SGY

sgy_filename = 'data.sgy'
sgy_to_pick = SGY(sgy_filename)

task = Task(sgy_to_pick,
            traces_per_gather=24,
            maximum_time=200)
picker = PickerONNX(show_progressbar=True)
task = picker.process_task(task)

# save an SGY updated with the picks in the trace headers
sgy_filename_picks = "data_picked.sgy"
encoding = "i"
byte_position = 236
picks_unit = "mcs"
task.export_result_as_sgy(filename=sgy_filename_picks, encoding=encoding, picks_unit=picks_unit, byte_position=byte_position)

# read the updated SGY and get the pick values from the header
sgy_with_picks = SGY(sgy_filename_picks)
picks_in_mcs = sgy_with_picks.read_custom_trace_header(byte_position=byte_position, encoding=encoding)

prepare_pygimli_file(sgy_with_picks, picks=picks_in_mcs, filename="pygimli_dump.txt", precision=10)
```
3) Get picks with the Desktop app and save them to a file + save the PyGIMLI file

```python
from first_breaks.sgy.reader import SGY

# we saved this file with the Desktop app; use the encoding and byte_position
# values which you selected during saving
sgy_filename_picks = FILE_DESKTOP_SAVED
encoding = ENCODING_DESKTOP_SAVED
byte_position = BYTE_POSITION_DESKTOP_SAVED

# read the updated SGY and get the pick values from the header
sgy_with_picks = SGY(sgy_filename_picks)
picks_in_mcs = sgy_with_picks.read_custom_trace_header(byte_position=byte_position, encoding=encoding)

prepare_pygimli_file(sgy_with_picks, picks=picks_in_mcs, filename="pygimli_dump.txt", precision=10)
```
Please let me know if it helps or not!
Also, adjust the units for position and travel time according to your needs.
Thank you very much for your code. I have run it and obtained the pygimli_dump.txt file. However, because my SEG-Y file doesn't contain shot and geophone position information, I have to supplement these coordinates in the PyGIMLI input file. Today I studied the "Unified Data Format" of PyGIMLI following https://github.com/gimli-org/example-data/issues/2. The contents look like this: the file first describes the sensor positions (both shots and geophones for traveltime) and then the data. Assume a single shot with 4 receivers, each spaced 5 m with an offset of 10 m:

```
5 # 1 shot and 4 receivers
// x z
-10 0 # shot position
0 0 # the following 4 lines are geophone positions
5 0
10 0
15 0
// s g t
1 2 0.2 # shot index, geophone index, travel-time (s)
1 3 0.3
1 4 0.5
1 5 0.6
```

I am thinking about how to output these data based on your Python scripts. It would be great if you could help me.
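A minimal sketch of a writer for such a file. The helper name, argument layout, and the data-count line before the `s g t` block are assumptions of mine, not part of first_breaks_picking; check the pyGIMLi documentation for the exact variant of the unified data format your version expects (in particular whether token lines start with `#` or `//`):

```python
import numpy as np

def write_unified_format(filename, shot_xz, geo_xz, shot_ids, geo_ids, times_s):
    """Write a pyGIMLi-style unified data format file.

    shot_xz, geo_xz : (x, z) coordinates of shots and geophones
    shot_ids, geo_ids : per-measurement 0-based indices into shot_xz / geo_xz
    times_s : travel times in seconds
    """
    # shots and geophones share one sensor table; geophones follow the shots
    sensors = np.vstack([shot_xz, geo_xz])
    n_shots = len(shot_xz)
    with open(filename, "w") as f:
        f.write(f"{len(sensors)} # number of sensors (shots + geophones)\n")
        f.write("# x z\n")
        for x, z in sensors:
            f.write(f"{x} {z}\n")
        f.write(f"{len(times_s)} # number of measurements\n")
        f.write("# s g t\n")
        for s, g, t in zip(shot_ids, geo_ids, times_s):
            # the unified format uses 1-based sensor indices
            f.write(f"{s + 1} {g + 1 + n_shots} {t}\n")
```

With the single-shot example above, `write_unified_format("picks.dat", [(-10, 0)], [(0, 0), (5, 0), (10, 0), (15, 0)], [0, 0, 0, 0], [0, 1, 2, 3], [0.2, 0.3, 0.5, 0.6])` reproduces the same sensor indices (shot 1, geophones 2 to 5).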
These are the shot and geophone positions and the final inversion result figure.
What should the file look like for multiple shots?
Oh, I see.
`x y` describes both sources and receivers. It just identifies some spatial points. `s g` reuses the indices of those spatial points.
So we can describe an exploration with several shots like:

```
x y
0 11
0 12
0 13
0 14
0 15
s g t
1 5 5
2 5 4
3 5 3
4 5 2
5 5 0
```
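In other words, there is one table of unique spatial points, and each measurement references two indices into it. A sketch of how such indices could be built from raw per-trace coordinates (the helper name is mine; `np.unique` sorts the points, which is harmless because the indices are remapped accordingly):

```python
import numpy as np

def build_sensor_table(src_xy, rec_xy):
    # stack all per-trace source and receiver coordinates, then deduplicate
    # so that each unique spatial point appears once in the sensor table
    all_xy = np.vstack([src_xy, rec_xy])
    sensors, inverse = np.unique(all_xy, axis=0, return_inverse=True)
    inverse = np.asarray(inverse).ravel()  # guard against shape differences across numpy versions
    n_src = len(src_xy)
    s_idx = inverse[:n_src] + 1  # 1-based indices into the sensor table
    g_idx = inverse[n_src:] + 1
    return sensors, s_idx, g_idx
```

For example, two traces fired from the same shot point yield one shared sensor entry for the shot and one entry per distinct geophone.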
Thank you for sharing the data. But what is the relation between the position files and the traces in the file? They contain different numbers of entries.
@lj-cug is it relevant?
These days I checked the data in the shot_position and geophone_position files and figured out the format: the shot and geophone positions are indeed related. Actually, 78 shots were conducted, and the data in the geophone_position file are the positions of the first geophone in the vertical dimension; the geophone interval is 1 meter (12 geophones in total for every shot). So I can prepare the PyGIMLI input files now. But the shot and geophone layouts differ between geophysical surveys, so understanding the layout is important. Thank you for your help. The automatic first-arrival picking tool can save me much time! But in an actual picking job, it's better to adjust the picked points.

Li Jian
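A hypothetical sketch of the layout just described, assuming the geophone_position file is read as a 1-D array of first-geophone depths and that the remaining geophones follow at a fixed interval along the vertical axis (the function name and argument names are mine):

```python
import numpy as np

def expand_geophones(first_geo_depth, n_geophones=12, interval=1.0):
    # first_geo_depth: depth of the first geophone for each shot, shape (n_shots,)
    first_geo_depth = np.asarray(first_geo_depth, dtype=float)
    offsets = np.arange(n_geophones) * interval
    # result[i, j] is the depth of the j-th geophone of the i-th shot
    return first_geo_depth[:, None] + offsets[None, :]
```

For the survey described above this would be called with the 78 first-geophone depths and the defaults `n_geophones=12`, `interval=1.0`, giving a (78, 12) array of geophone depths.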
I'm glad you were able to figure out how to prepare the file!
> But in an actual picking job, it's better to adjust the picked points.

Yeah, that's true. But sometimes it's enough to adjust the picking parameters.
Could you share your picking parameters? I found these parameters to work relatively well.
Actually, I use the default parameters from the example code. My picking looks as follows:
Yep, I tried the default parameters, but the result looks random 😁 A common observation: you should choose parameters at which it would be convenient for you to pick manually. In this case, the neural network should give a good result, because it was trained in such a setup.
It's a good tool for picking the first arrival times of seismic data. Thank you for your contribution. The input file for PyGIMLI needs a first-arrival-time file, which contains three columns of data: shot position, geophone position, and travel time. How can I obtain this formatted file from the first_breaks_picking tool? If I can get this file, the workflow for tomography and full-waveform inversion can continue.

Yours sincerely,
Li Jian
China University of Geosciences