lanl / OpenFWI

A collection of codes for the OpenFWI project
BSD 3-Clause "New" or "Revised" License

Input and normalization #5

Closed. Itneedtime closed this issue 6 months ago.

Itneedtime commented 8 months ago

Hello author, may I ask whether the seismic waveform input to the network is a real two-dimensional matrix with physical values? I noticed the normalization here: is it a global normalization, or is it applied to each trace of the seismic waveform? If I input a ground-penetrating radar B-scan, then it should also be a real two-dimensional matrix and not the pixels of an image, right? Thank you for your time.

hanchenwang commented 8 months ago

Hello,

The input seismic waveform to the network should be a globally normalized two-dimensional matrix. In the main code, we use the "T.minmax_normalization" function, which you can find in "transforms.py", to do the global normalization of both the input seismic data and the ground-truth labels.
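For illustration, here is a minimal sketch of a global min-max normalization of this kind; the function below is a generic example, not the exact "T.minmax_normalization" from "transforms.py", and its signature may differ:

```python
import numpy as np

def minmax_normalize(data, data_min, data_max):
    """Globally rescale a 2D array to [-1, 1] using dataset-wide bounds.

    data_min and data_max are computed over the whole dataset
    (not per sample), so relative amplitudes across samples are preserved.
    """
    return (data - data_min) / (data_max - data_min) * 2.0 - 1.0

# Example: normalize one seismic gather with dataset-wide bounds
gather = np.random.randn(1000, 70)   # (time samples, receivers)
normalized = minmax_normalize(gather, data_min=-30.0, data_max=60.0)
```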

If your input data is GPR data, it should be a globally normalized two-dimensional matrix as well. I am not sure what you mean by "the pixels of the image". If the input image is a 2D matrix, with each element representing the value of a pixel, then it is perfectly fine to feed the 2D image into our network, as long as the CNN parameters in "network.py" are modified to fit the size of your input image.

-Hanchen Wang

Itneedtime commented 8 months ago

Thank you very much for your reply. Yes, I want to input the B-scan data from actual GPR surveys, not an image. I noticed that the values of the 2D matrices of all your seismic waveform samples lie between "data_min" and "data_max", and the maximum and minimum values of each sample are basically the same as those of the other samples, which makes it very convenient to normalize all samples at once. However, the maximum and minimum values of my samples differ before normalization; for example, the "data_min" or "data_max" of one sample can be very different from that of another sample. In that case, will the global "data_min" and "data_max" stored in the json file and applied to all samples distort the normalization of individual samples? Or should I choose a different "data_min" and "data_max" to normalize each sample individually?

hanchenwang commented 8 months ago

I totally understand your concern. Because our current datasets are synthetic simulations, the global "data_min" and "data_max" work well for all the samples in one dataset. In your case, the values differ a lot across samples, which makes the normalization a bit more complicated. From my experience, I would suggest the following trials; a short sketch of the first three options follows the list.

1. Try global minmax_normalization first: if the values of different samples fall into similar ranges, for example one sample in [-5, 5] and another in [-10, 10], you may still use the minmax_normalization function. The normalized data can still be recognized properly by the CNNs.
2. Normalization with mean and std: if the values fall into totally different ranges, for example one sample in [-1, 1] and another in [-100, 100], then min-max normalization is not appropriate, because the small-range [-1, 1] sample will be squashed toward nearly "zero" values. In this case, I would suggest normalizing with mean and standard deviation, (data - mean) / std, where "mean" and "std" are global statistics.
3. Local normalization: if normalization with mean and std does not work, the last method I can think of is sample-wise or even trace-wise normalization. These local normalizations destroy the amplitude information across samples, which could introduce noise and artifacts into the prediction. However, if amplitude information is less important in your case, local normalization should be a good choice.
4. Zero out bad traces: in some cases, the varying value ranges mainly come from a few anomalous traces. Zeroing out those traces before normalization may also help.
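For illustration only, here is a minimal sketch of options 1 to 3; the function names are hypothetical and are not the ones used in "transforms.py":

```python
import numpy as np

def global_minmax(data, data_min, data_max):
    # Option 1: global min-max normalization with dataset-wide bounds.
    return (data - data_min) / (data_max - data_min) * 2.0 - 1.0

def global_standardize(data, mean, std):
    # Option 2: normalization with global mean and standard deviation.
    return (data - mean) / std

def samplewise_minmax(data, eps=1e-12):
    # Option 3: sample-wise (local) normalization; amplitude relations
    # across samples are no longer preserved.
    return (data - data.min()) / (data.max() - data.min() + eps) * 2.0 - 1.0

# Two samples whose value ranges differ strongly
small = np.random.uniform(-1, 1, size=(1000, 70))
large = np.random.uniform(-100, 100, size=(1000, 70))

# Global statistics would normally be computed over the whole dataset
stacked = np.stack([small, large])
mean, std = stacked.mean(), stacked.std()

# Compare how the small-range sample comes out under each scheme
print("global minmax  :", global_minmax(small, stacked.min(), stacked.max()).std())
print("global mean/std:", global_standardize(small, mean, std).std())
print("sample-wise    :", samplewise_minmax(small).std())
```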

I hope my comments help you to some extent.

-Hanchen Wang

Itneedtime commented 8 months ago

Many thanks for your suggestions.